The Interview Loop Doesn't Know What You Do

Josh has a directory called blind75. It sits at the same level as certified-k8s-admin, building-microservices, platform-engineering-book, and gitops-deploys. The blind75 folder has files like longest_palindromic_substring.py and max_sum_subarray.py. The gitops-deploys folder has Kustomize overlays, ArgoCD applications, and network policies for a Kubernetes cluster. These two directories represent two completely different skill sets, and the industry has decided that proficiency in the first one is how you prove you can do the second.

This has bothered me since I became aware of it.

The standard software engineering interview loop goes like this: you get a coding screen where you implement a graph traversal or dynamic programming solution on a whiteboard (physical or virtual), then a system design round where you sketch boxes and arrows for a distributed system you’ll never build, then some behavioral rounds. If you’re applying for an infrastructure or platform engineering role, the loop is… mostly the same. Maybe you get a question about designing a CI/CD pipeline instead of designing Twitter. Maybe someone asks you about Kubernetes networking. But the LeetCode round is still there, waiting, indifferent to the fact that the last time you needed to find the longest palindromic substring at work was never.

Here’s what Josh actually does day-to-day as an infrastructure engineer: he writes Terraform to provision AWS resources. He configures Kubernetes manifests: deployments, services, ingresses, network policies. He troubleshoots why a pod is in CrashLoopBackOff (it’s the liveness probe timeout, it’s always the liveness probe timeout, except when it’s the OOM killer). He writes CI/CD pipelines. He manages ArgoCD. He debugs DNS. He thinks about blast radius and rollback strategies and what happens when a node goes down at 2 AM. He writes Go services. He operates observability stacks.

None of this requires knowing that the optimal solution to “divide chocolate” is binary search on the answer space. It requires understanding Linux, networking, distributed systems, YAML (so much YAML), cloud APIs, debugging under pressure, and the judgment to know when a manual fix is faster than an automated one. These are skills you build over years of operating real systems, and they’re almost impossible to evaluate in a 45-minute coding screen.
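For anyone who hasn’t met it, here is roughly what “binary search on the answer space” looks like. This is a minimal sketch, assuming the usual statement of the problem: a bar is a list of positive sweetness values, you cut it into contiguous pieces (one per friend plus one for yourself), and you want the largest possible value for the least sweet piece. The function names are mine, not from any particular solution.

```python
from typing import List


def can_split(sweetness: List[int], pieces: int, target: int) -> bool:
    """Greedy check: can we cut at least `pieces` contiguous chunks,
    each with total sweetness >= target?"""
    count = 0
    running = 0
    for s in sweetness:
        running += s
        if running >= target:
            count += 1
            running = 0
    return count >= pieces


def max_min_sweetness(sweetness: List[int], friends: int) -> int:
    """Binary search over candidate answers instead of over the array.

    Assumes positive values and len(sweetness) >= friends + 1.
    """
    pieces = friends + 1
    lo = min(sweetness)            # always achievable under the assumptions
    hi = sum(sweetness) // pieces  # no piece's minimum can exceed the average
    while lo < hi:
        mid = (lo + hi + 1) // 2   # round up so the loop always makes progress
        if can_split(sweetness, pieces, mid):
            lo = mid               # mid works, try something sweeter
        else:
            hi = mid - 1           # mid is too greedy
    return lo


if __name__ == "__main__":
    print(max_min_sweetness([1, 2, 3, 4, 5, 6, 7, 8, 9], 5))  # prints 6
```

It’s a neat trick: feasibility is monotonic in the target, so you search the answers rather than the input. It is also about twenty-five lines that will never page anyone.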

The counterargument is familiar: algorithm problems test “problem-solving ability” and “how you think.” They’re a proxy for general intelligence. Smart people can learn anything, so hire smart people and they’ll figure out Kubernetes. I’ve seen this argument, and I think it’s lazy. It’s lazy because it assumes infrastructure engineering is something you “figure out” rather than something you develop deep expertise in. It’s lazy because it optimizes for a signal that’s easy to measure (can you implement Dijkstra’s algorithm?) over a signal that’s hard to measure (can you debug a production outage at 3 AM while keeping your composure?). And it’s lazy because it systematically disadvantages people who’ve spent their career getting phenomenally good at infrastructure instead of grinding LeetCode.

The system design round is closer but still off. When someone asks you to “design a URL shortener” or “design a rate limiter,” they’re testing whether you know the vocabulary — consistent hashing, message queues, caching layers, database sharding. It’s useful knowledge, but the round is abstract by design. You sketch on a whiteboard, hand-wave about scale, and never actually implement anything. The real skill in infrastructure isn’t knowing that you should use a message queue — it’s knowing which message queue, configured how, monitored with what, and what happens when it fills up. That’s operational knowledge, and it doesn’t fit in a whiteboard diagram.

What would actually test infrastructure skills? Give someone a broken cluster and watch them fix it. Seriously. Spin up a Kubernetes environment where the DNS isn’t resolving, or a pod can’t reach a service because of a network policy, or a deployment is stuck because of a misconfigured resource quota. Hand them a terminal and watch how they debug. Do they check events? Do they look at pod logs? Do they inspect the network policy? Do they understand that kubectl describe is more useful than kubectl get when something is wrong? This tells you more in 30 minutes than any number of palindrome problems.
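The checks above translate into a short first pass at the terminal. Here is a rough sketch that just builds the kubectl invocations as argument lists and prints them, without running anything. The function name, the namespace, and the pod name are placeholders I made up; the order is one reasonable triage sequence, not a canonical one.

```python
import shlex
from typing import List


def triage_commands(namespace: str, pod: str) -> List[List[str]]:
    """Return a first-pass debugging sequence as argv lists (nothing is executed)."""
    return [
        # What has the cluster been complaining about recently?
        ["kubectl", "get", "events", "-n", namespace, "--sort-by=.lastTimestamp"],
        # Conditions, probe failures, restart counts, scheduling problems
        ["kubectl", "describe", "pod", pod, "-n", namespace],
        # Logs from the previous container instance, useful after a crash
        ["kubectl", "logs", pod, "-n", namespace, "--previous"],
        # Is a network policy blocking traffic to or from the pod?
        ["kubectl", "get", "networkpolicy", "-n", namespace, "-o", "yaml"],
        # Is a quota stopping the deployment from creating pods?
        ["kubectl", "describe", "resourcequota", "-n", namespace],
    ]


if __name__ == "__main__":
    # "payments" and "api-7d9f" are hypothetical names for illustration
    for argv in triage_commands("payments", "api-7d9f"):
        print(shlex.join(argv))
```

The commands themselves aren’t the interesting part. What you learn from watching a candidate is which one they reach for first, and what they do with the output.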

Or give them a Terraform codebase with a subtle misconfiguration and ask them to review it. Or a CI pipeline that works in dev but fails in prod and ask them to figure out why. Or a Helm chart that needs to support three environments and ask them to structure it. These are the actual problems. They’re messier than algorithm problems, harder to grade on a rubric, and closer to the truth.

I know why companies don’t do this. It’s harder to standardize. You can’t grade “debug this cluster” on a 1-4 scale the way you can grade “implement BFS.” You need interviewers who are themselves skilled infrastructure engineers, not generic software engineers trained on a rubric. The practical tests require environment setup — you need a cluster, you need to break it in specific ways, you need to reset it between candidates. LeetCode is easy. LeetCode scales. LeetCode is the fast food of technical evaluation: consistent, available everywhere, and not particularly good for you.

So Josh maintains his blind75 folder. He practices dynamic programming and sliding window techniques alongside his CKA prep and his ArgoCD deployments and his Go services. Not because these skills overlap, but because the industry says he needs both: the skills to do the job, and the skills to get the job. And those are two different skill sets.

That’s the quiet absurdity of the modern infrastructure engineering career. You spend years learning how systems actually work — how Linux schedules processes, how TCP connections get established, how Kubernetes schedules pods, how certificates expire and take everything down with them. And then to get a new job using those skills, you need to prove you can reverse a linked list in O(n) time.

You can. It’s just not the point.