# Auditing Kubernetes Manifests With AI: A Practical Workflow

> Source: <https://dev.to/devopsaitoolkit/auditing-kubernetes-manifests-with-ai-a-practical-workflow-4368>
> Published: 2026-06-16 04:31:15+00:00

A senior K8s engineer I work with audits manifests faster than I read them. He's seen so many patterns that "missing readinessProbe on a Deployment that takes 45 seconds to start" jumps off the page. Most of us don't have that pattern library memorized — and increasingly, we don't need to. AI assistants have read more Kubernetes manifests than any human ever will.

The catch: a generic "review this YAML" prompt produces generic noise. You need to direct the model toward the categories of issues that actually matter in your environment.

**Mistake 1: Asking for "a security review."** You'll get a bullet list of every possible concern, ranked alphabetically, with no signal about which matter. You'll skim, dismiss, and learn nothing.

**Mistake 2: Pasting one manifest.** Real Kubernetes problems live in the interaction between resources — a Deployment's readiness probe and a Service's selector, a NetworkPolicy and the actual app traffic. One YAML in isolation hides most of the bugs.

The fix for both is the same: give the model a *bounded scope* and *enough context* to reason about interactions.

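Pre-decide what you're checking *for*, and use a different prompt for each dimension: for example, probes and graceful shutdown in one review, and security context, network policies, and resource limits each in their own.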

Mixing dimensions in one review produces wishy-washy output. Pick one, get a clean answer, move on.

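For a workload review, paste the whole bundle of related resources rather than a single file, for example the Deployment together with its Service and NetworkPolicy.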

For YAML this is usually under 500 lines, well within any model's context window. The model can now reason about interactions, not just isolated fields.
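As an illustration of the kind of context worth pasting next to a Deployment, here is a minimal sketch of a matching Service and NetworkPolicy. Every name, label, namespace, and port below is a placeholder assumed for the example, not taken from a real cluster.

```yaml
# Hypothetical companion resources for a Deployment whose pods carry app: example.
apiVersion: v1
kind: Service
metadata:
  name: example
spec:
  selector:
    app: example          # must match the Deployment's pod template labels
  ports:
  - port: 80
    targetPort: 8080      # must match a containerPort the app listens on
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-allow-ingress
spec:
  podSelector:
    matchLabels:
      app: example        # same pods the Service selects
  policyTypes: [Ingress]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx   # assumed ingress namespace
    ports:
    - protocol: TCP
      port: 8080          # a mismatch with the real app port blocks traffic
```

With all three resources in one paste, a mismatched selector or port is something the model can actually see.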

The big difference between "tell me about this YAML" and a useful review is *the instruction format*. Compare:

> Review this Kubernetes manifest.

versus:

> You are reviewing a production Deployment + Service + NetworkPolicy bundle. For each finding, give: (1) severity (critical/high/medium/low), (2) the exact field path that's wrong, (3) one sentence on why it matters, (4) the corrected YAML snippet. Focus only on probes, lifecycle, and graceful shutdown. Ignore documentation/comments.

The first prompt produces an essay. The second produces a list of fixable issues.

Verification is where most reviews go wrong. The model is right *most of the time*. When it's wrong, it is often wrong in ways that look correct.

Common AI failure modes in K8s review:

- Inventing field names: `spec.template.spec.terminationGracePeriod` (it's `terminationGracePeriodSeconds`)
- Suggesting removed API versions: `policy/v1beta1 PodDisruptionBudget` (removed in 1.25)
- Misstating defaults: claiming `failureThreshold` defaults to 1 when it's 3
- Recommending `runAsNonRoot: true` for a workload that legitimately needs root

For every "fix" the model suggests, glance at the official K8s docs for that field. This adds 30 seconds per finding and catches the wrong ones. Without this step, you will apply changes that break things.
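For reference, here is a sketch of what the current forms of those fields look like. The resource name, labels, and the `minAvailable` and grace period values are placeholders assumed for illustration.

```yaml
# PodDisruptionBudget on the current API group/version (policy/v1beta1 was removed in 1.25).
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb        # hypothetical name
spec:
  minAvailable: 1          # assumed value
  selector:
    matchLabels:
      app: example         # assumed label
---
# Fragment only (not a complete resource): where the grace period lives in a Deployment.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 30   # the real field name; 30 is also the default
```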

Here's a Deployment I reviewed last week:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments
spec:
  replicas: 2
  selector:
    matchLabels: { app: payments }
  template:
    metadata:
      labels: { app: payments }
    spec:
      containers:
      - name: app
        image: registry.example.com/payments:v3.1.0
        ports:
        - containerPort: 8080
        env:
        - name: DB_URL
          value: postgres://payments-db:5432/payments
        resources:
          limits:
            cpu: "2"
            memory: "2Gi"
        readinessProbe:
          httpGet: { path: /healthz, port: 8080 }
          initialDelaySeconds: 5
```

I asked Claude to review for probes and graceful shutdown only. The findings:

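1. No `requests`, only `limits`. The review labeled this `BestEffort` QoS, first to be evicted under pressure. That label is wrong for this manifest: when a container sets limits but no requests, Kubernetes copies the limits into the requests, so this pod is actually `Guaranteed`. Setting explicit requests equal to or below limits is still worthwhile so the scheduler reservation is a deliberate choice.
2. `initialDelaySeconds: 5` on the readiness probe. For a slow-starting app, use a `startupProbe` with a longer threshold.
3. No `livenessProbe`.
4. No `terminationGracePeriodSeconds`, so the pod gets the 30-second default.
5. No `preStop` hook. Add a `sleep 15` preStop.

All five were real, all five were fixable in two minutes of YAML editing. The model didn't tell me about anything irrelevant. That's because I scoped the prompt to "probes and graceful shutdown only."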

The big one — #5 — is something I've personally been bitten by twice. The model wouldn't have prioritized it without the directive prompt.
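A sketch of how the pod template might look with the five findings addressed. Every number here (request sizes, probe periods and thresholds, grace period) is an assumed placeholder to be tuned for the real service, and the liveness probe reuses `/healthz` only because no other endpoint is known.

```yaml
# Fragment of the Deployment above; merge into the existing spec.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 45   # assumed; should exceed the preStop sleep plus drain time
      containers:
      - name: app
        image: registry.example.com/payments:v3.1.0
        ports:
        - containerPort: 8080
        resources:
          requests:                       # assumed values, at or below the limits
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "2Gi"
        startupProbe:                     # allows up to 30 x 5s = 150s for startup
          httpGet: { path: /healthz, port: 8080 }
          periodSeconds: 5
          failureThreshold: 30
        readinessProbe:                   # starts only after the startupProbe succeeds
          httpGet: { path: /healthz, port: 8080 }
          periodSeconds: 10
        livenessProbe:
          httpGet: { path: /healthz, port: 8080 }
          periodSeconds: 10
          failureThreshold: 3
        lifecycle:
          preStop:
            exec:
              command: ["sleep", "15"]    # assumes the image ships a sleep binary
```

The preStop sleep delays SIGTERM so that endpoint removal has time to propagate before the process begins shutting down.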

What about policy engines like Kyverno? Yes, you should run those too. They catch consistent issues at admission time. They don't catch issues that require *judgment*: "is 30 seconds enough graceful shutdown for this specific service?" Policy enforcement is a floor; AI review is a directed second opinion above that floor.

I run both. Kyverno catches "no securityContext at all" before it ever lands. AI review catches "readinessProbe path doesn't match what the app exposes" — something only a human (or an AI imitating one) would notice.
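For the "floor" side, a Kyverno admission policy might look roughly like the sketch below. The policy name, message, and the choice of `allowPrivilegeEscalation: false` as the required setting are assumptions for illustration. The layout follows the `kyverno.io/v1` ClusterPolicy schema, and some fields have moved between Kyverno releases, so check it against the version you run.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-container-security-context   # hypothetical name
spec:
  validationFailureAction: Enforce   # newer releases prefer a failureAction field under validate
  background: true
  rules:
  - name: check-security-context
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Every container must set securityContext.allowPrivilegeEscalation to false."
      pattern:
        spec:
          containers:
          # The pattern is applied to each container; a missing securityContext fails it.
          - securityContext:
              allowPrivilegeEscalation: false
```

A rule like this is mechanical by design. It cannot tell whether a probe path matches what the app exposes, which is the gap the directed review fills.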

If you want a template, here's the one I use most:

> You are reviewing a Kubernetes workload bundle for production readiness. Focus only on: probes (readiness, liveness, startup), `terminationGracePeriodSeconds`, preStop hooks, and rolling update strategy. For each finding produce: severity, exact field path, why it matters in one sentence, corrected YAML. Ignore everything else (security context, network policies, resource limits: those are separate reviews). The workload is [serves HTTP at /api on port 8080 / consumes from a queue / batch processor that runs N hours].

The bracketed context at the end is what makes the review accurate for *your* workload. Without it, the model assumes a generic web service.

For our full prompt library on Kubernetes review, see the [Kubernetes & Helm category](https://dev.to/categories/kubernetes-helm/) — especially [kubernetes-yaml-security-review](https://dev.to/prompts/kubernetes-yaml-security-review/) and [kubernetes-resource-limits-tuning](https://dev.to/prompts/kubernetes-resource-limits-tuning/).

*This article was originally published on DevOps AI ToolKit — practical AI workflows for cloud engineers.*
