#ai
2 posts
-
Your Context Window Is a Resource Limit
Kubernetes taught us how to think about finite compute. The same patterns apply to AI context — and we're making the same mistakes we made with memory in 2016.
-
Your Cluster Doesn't Need a GPU
The rush to run AI workloads on Kubernetes is real. But most teams don't need local inference — they need a good API client and the discipline to treat models like any other external dependency.