[ARTICLE · art-8936] src=dev.to ↗ pub= topic=cloud-computing verified=true sentiment=· neutral

The Disk-Pressure Incident That Taught Me to Always Set LimitRanges and Other Lessons from Mirroring EKS Locally.

The article describes the process of mirroring a production EKS cluster locally on a Mac, focusing on integrating Vault for secret management. It explains how Vault's Kubernetes auth method allows pods to authenticate using service account JWTs, enabling secure retrieval of credentials without hard-coded secrets. The author emphasizes the importance of setting up LimitRanges and testing the full injection workflow locally to debug failures safely.

read 7 min · views 3 · published May 22, 2026

Part 6 of 7 of The Mac Kubernetes Lab: A Production-Mirror Setup from Scratch.

Previously in Part 5: We installed Istio with revision-based upgrades, MetalLB for LoadBalancer IPs, and practised traffic management with Gateways, VirtualServices, and fault injection. The cluster behaves. Now we wire up the last three pieces that turn it from “a working local cluster” into “a real mirror of our production EKS.”

The cluster works. Istio is running. MetalLB is handing out IPs. But it’s still missing three layers that make the production parity actually meaningful: Vault Kubernetes auth, Crossplane, and LimitRanges.

The LimitRange story is the most important of the three, so I’ll tell it properly when we get there. First, the auth layer.

Vault’s Kubernetes auth method lets pods authenticate by presenting their service account JWT. Vault validates the token against the Kubernetes API server and exchanges it for a Vault token with the appropriate policies attached. On the production EKS clusters at work, this is how microservices retrieve database credentials, API keys, and TLS certificates: no hard-coded secrets, no secret sprawl, every issuance audit-logged in Vault. Setting it up locally means I can test the full injection workflow without a VPN, and debug failures on a cluster where the stakes are zero.

We deploy just the Vault agent injector in the lab cluster. It points to the external Vault VM rather than running its own Vault server:

kubectx lab-cluster
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update

export VAULT_IP=$(orb run -m vault hostname -I | awk '{print $1}')
echo "VAULT_IP=$VAULT_IP"

helm install vault hashicorp/vault \
  --namespace vault --create-namespace \
  --set "injector.externalVaultAddr=http://$VAULT_IP:8200"
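If `hostname -I` on the vault VM returns nothing, or returns an IPv6 address first, the install above would quietly point the injector at a broken URL. A minimal sketch of a guard, assuming bash; `is_ipv4` is a hypothetical helper name, not part of OrbStack, Helm, or Vault:

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the argument looks like a dotted-quad IPv4 address.
is_ipv4() {
  local ip="$1" octet
  # Four groups of 1-3 digits separated by dots.
  [[ "$ip" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]] || return 1
  # Each group must be in the range 0-255 (10# forces base 10 for values like 08).
  local IFS=.
  for octet in $ip; do
    (( 10#$octet <= 255 )) || return 1
  done
}

# Usage (needs the OrbStack VM, so left commented out):
# VAULT_IP=$(orb run -m vault hostname -I | awk '{print $1}')
# is_ipv4 "$VAULT_IP" || { echo "no usable IPv4 address for the vault VM" >&2; exit 1; }
```

Running the check before `helm install` turns an empty or unexpected `VAULT_IP` into an immediate, readable failure.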

kubectl get pods -n vault

Run this on the vault VM, pointing Vault at the lab cluster’s API server:

export VAULT_ADDR='http://127.0.0.1:8200'
export VAULT_ROOT_TOKEN=$(grep 'Initial Root Token' ~/vault-init.txt | awk '{print $NF}')

vault login $VAULT_ROOT_TOKEN

export CP_IP=<cp01-ip>
vault read -field=certificate pki_k8s/issuer/default > /tmp/lab-ca.crt

vault auth enable -path=lab-k8s kubernetes 2>/dev/null || echo "already enabled"

vault write auth/lab-k8s/config \
  kubernetes_host="https://$CP_IP:6443" \
  kubernetes_ca_cert=@/tmp/lab-ca.crt

vault read auth/lab-k8s/config

Create a simple role and test it from a pod:

vault policy write read-secrets - <<EOF
path "secret/data/myapp/*" {
  capabilities = ["read"]
}
EOF

vault write auth/lab-k8s/role/myapp \
  bound_service_account_names=myapp \
  bound_service_account_namespaces=default \
  policies=read-secrets \
  ttl=1h

vault secrets enable -path=secret kv-v2 2>/dev/null || true
vault kv put secret/myapp/config db_password="supersecret"

kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp
  namespace: default
---
apiVersion: v1
kind: Pod
metadata:
  name: vault-test
  namespace: default
  annotations:
    vault.hashicorp.com/agent-inject: "true"
    vault.hashicorp.com/role: "myapp"
    # The auth method is mounted at lab-k8s; the injector defaults to auth/kubernetes.
    vault.hashicorp.com/auth-path: "auth/lab-k8s"
    vault.hashicorp.com/agent-inject-secret-config: "secret/data/myapp/config"
spec:
  serviceAccountName: myapp
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
EOF
kubectl exec vault-test -c app -- cat /vault/secrets/config
If that last line returns the password, the whole chain works: service account JWT → Vault validation → Vault token → secret retrieval → file injection. Every link of the chain is what a real production app does.
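When the chain breaks at the first link, it can help to look at what the pod is actually presenting. The sketch below decodes the claims segment of a service account JWT so the `sub` claim (`system:serviceaccount:<namespace>:<name>`) can be compared by eye with the role's `bound_service_account_names` and `bound_service_account_namespaces`. `jwt_claims` is a hypothetical helper name, and the sketch assumes bash plus a `base64` that accepts `-d` (GNU coreutils and recent macOS). It only decodes; it does not verify the signature.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the decoded claims (middle segment) of a JWT.
jwt_claims() {
  local payload="${1#*.}"   # drop the header segment
  payload="${payload%%.*}"  # drop the signature segment
  # JWTs use base64url: map its alphabet back to standard base64.
  payload="$(printf '%s' "$payload" | tr '_-' '/+')"
  # base64url strips padding; restore it to a multiple of four characters.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}

# Usage (needs the lab cluster, so left commented out):
# TOKEN=$(kubectl exec vault-test -c app -- \
#   cat /var/run/secrets/kubernetes.io/serviceaccount/token)
# jwt_claims "$TOKEN"
```

If the namespace or service account name in `sub` does not match the role, Vault is expected to reject the login regardless of how the injector is configured.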

Crossplane turns a Kubernetes cluster into a universal control plane for cloud infrastructure. Instead of Terraform modules or CloudFormation stacks, you define infrastructure as Kubernetes custom resources, and Crossplane reconciles them continuously. I use it at work to provision AWS resources (EKS node groups, RDS, S3 buckets, IAM roles) and VMware Cloud Director resources through a custom provider. The lab version mirrors the AWS side of that.

helm repo add crossplane-stable https://charts.crossplane.io/stable
helm repo update

helm install crossplane crossplane-stable/crossplane \
  --namespace crossplane-system --create-namespace
kubectl get pods -n crossplane-system -w
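Watching with `-w` works interactively, but a scripted setup needs something that returns. A minimal sketch of a polling helper, assuming bash; `retry_until` is a hypothetical name, and the attempt count and delay in the usage line are arbitrary choices, not values from the original setup:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# usage: retry_until <attempts> <delay_seconds> <command...>
retry_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0
    # Sleep between attempts, but not after the final one.
    (( i < attempts )) && sleep "$delay"
  done
  return 1
}

# Usage (needs the lab cluster, so left commented out):
# retry_until 30 5 kubectl wait --for=condition=Available \
#   deployment --all -n crossplane-system --timeout=10s
```

The retry wrapper matters because `kubectl wait` exits with an error straight away when no matching resources exist yet, which can happen in the first seconds after `helm install`.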
kubectl apply -f - <<EOF
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws-ec2
spec:
  package: xpkg.upbound.io/upbound/provider-aws-ec2:latest
EOF

kubectl get pkg

A bare-minimum ProviderConfig, enough to verify the install is working:

kubectl apply -f - <<EOF
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: default
spec:
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system
      name: aws-creds
      key: creds
EOF

In a real setup, you would use IRSA (IAM Roles for Service Accounts) to authenticate and give the provider permission to create and monitor resources. For local validation, the provider installs, and the compositions can be validated structurally without ever calling AWS.

The LimitRange story. This is the one that came from a real incident at work.

We had repeated disk-pressure events in our production EKS cluster. Pods with no resource requests had crept into a few namespaces: someone deployed a YAML that omitted resources: entirely, and nobody caught it in review. The Kubernetes scheduler had no signal about their consumption, so nodes ended up overcommitted. Then ephemeral storage filled up, eviction kicked in, and a couple of unrelated pods went down with it. Total downtime was measured in tens of minutes, and the cause-and-effect chain took a while to untangle.

The fix is one of the most boring features in Kubernetes: LimitRanges. They set default resource requests and limits at the namespace level. Any container that doesn’t specify its own requests gets the defaults applied automatically by the admission controller. The scheduler always has a signal. Overcommit becomes a deliberate choice, not an accident.

kubectl apply -f - <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: default
spec:
  limits:
  - default:
      memory: 512Mi
      cpu: 500m
    defaultRequest:
      memory: 128Mi
      cpu: 100m
    max:
      ephemeral-storage: 2Gi
    type: Container
EOF

Apply this to every namespace that hosts workloads. In production, I now apply it as a post-provisioning step on every new namespace:
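One detail to watch when doing that: the manifest above hard-codes `namespace: default`, and `kubectl apply -n <other>` refuses a manifest whose namespace conflicts with the flag. A minimal sketch of one way around it, assuming bash, is to render the manifest per namespace; `render_limitrange` is a hypothetical helper name, and the values are copied from the manifest above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the LimitRange manifest for a single namespace.
render_limitrange() {
  local ns="$1"
  cat <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: ${ns}
spec:
  limits:
  - default:
      memory: 512Mi
      cpu: 500m
    defaultRequest:
      memory: 128Mi
      cpu: 100m
    max:
      ephemeral-storage: 2Gi
    type: Container
EOF
}

# Usage (needs cluster access, so left commented out):
# render_limitrange vault | kubectl apply -f -
```

The other option is to keep a copy of the manifest with the `namespace` field removed; the loop that follows presumably assumes `limitrange.yaml` is such a copy.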

for ns in default vault crossplane-system istio-system; do
  kubectl apply -f limitrange.yaml -n "$ns"
done

The ephemeral-storage max is the part that specifically addresses the disk-pressure failure mode: it bounds how much scratch space a container can consume, which is what spirals when ephemeral storage runs unbounded.

Let’s confirm the whole stack is up:

kubectl get nodes -o wide
kubectl get pods -A

The only meaningful differences between the lab and production are the CNI (because of OrbStack’s VM capabilities, as we covered in Part 4) and the LoadBalancer implementation. Everything else is identical in configuration. The mental model from this lab transfers directly to the production cluster, and vice versa.

In the final article: how to stop and start the lab without losing state, the CKS exam scenarios this cluster was purpose-built for, and the shell aliases that make the whole thing pleasant to live with.

← Part 5: How I Practise Istio Upgrades Locally Before Touching Production EKS | Part 7: The Day 2 Reality of Running a Kubernetes Lab on Your Mac: Stop/Start, CKS Scenarios, and What I Learned Building It →

I’m Noah Makau, a DevSecOps engineer based in Nairobi. I run a small DevOps consultancy and hold the CKA, CKAD, and AWS Solutions Architect Professional certifications, and I am currently preparing for CKS. I write about Kubernetes, Vault, Crossplane, and the day-to-day of running platforms that actually have to stay up.

Originally published at blog.arkilasystems.com
