Never trust an LLM's output directly. Here's the validation layer I put on every agent.

A developer built a validation layer for LLM agent outputs that catches structural and semantic errors before code acts on the data. The three-stage pipeline—parse, validate, classify—uses Zod schemas to enforce type and semantic constraints, returning a discriminated union that forces callers to handle failure paths. The approach addresses common failure modes where models emit valid JSON that is structurally or semantically incorrect.

Here's a failure mode I've seen in nearly every AI agent codebase I've reviewed: the agent receives a model response, trusts the JSON it contains, and calls `.result.items[0].id`, which throws `Cannot read properties of null` at 2 AM because the model returned `{"result": null}` on an edge case.

The model didn't hallucinate the content. It hallucinated the *structure*.

This is surprisingly common, and the fix isn't "use a better prompt." The fix is a validation layer that runs between the raw model output and the code that acts on it.

Claude and GPT-4 both support structured output modes that constrain the model to emit valid JSON matching a given schema. This is genuinely useful and you should use it. But it doesn't fully solve the problem, for two reasons:

1. JSON-valid is not semantically valid. The model can emit perfectly valid JSON that conforms to your schema and still be wrong. A string field that should be a UUID might contain a made-up identifier that fails a database lookup. An integer field labeled `confidence_score` might be 847 when your code expects a 0-1 float. The schema enforces types, not semantics.
2. Not all LLM calls use structured output. If you're doing multi-step reasoning, chain-of-thought steps, tool call parsing, or processing outputs from models that don't support native JSON mode, you're parsing free-text responses. You need to handle that robustly.

Every agent call I build now goes through three stages:

```
raw model output
      ↓
PARSE – extract the structure from the text
      ↓
VALIDATE – assert the structure matches expectations
      ↓
CLASSIFY – categorize the outcome so the caller can handle it
```

Here's the TypeScript implementation I actually use:

```typescript
import { z } from "zod";

// 1. Define the schema for what you expect
const AnalysisResultSchema = z.object({
  sentiment: z.enum(["positive", "negative", "neutral"]),
  confidence: z.number().min(0).max(1),
  key_points: z.array(z.string()).min(1).max(10),
  action_required: z.boolean(),
  follow_up: z.string().optional(),
});

type AnalysisResult = z.infer<typeof AnalysisResultSchema>;

// 2. The parse-validate-classify wrapper
type AgentOutput<T> =
  | { ok: true; data: T }
  | {
      ok: false;
      reason: "parse_failure" | "validation_failure" | "empty_response";
      raw: string;
      error?: string;
    };

function parseAgentOutput<T>(raw: string, schema: z.ZodSchema<T>): AgentOutput<T> {
  // Guard: empty or whitespace-only response
  if (!raw.trim()) {
    return { ok: false, reason: "empty_response", raw };
  }

  // Extract JSON from the response: models often wrap it in prose or code fences.
  // `{3} matches a three-backtick fence without writing one literally.
  const jsonMatch =
    raw.match(/`{3}(?:json)?\s*([\s\S]*?)`{3}/) ||
    raw.match(/(\{[\s\S]*\}|\[[\s\S]*\])/);
  const jsonString = jsonMatch ? (jsonMatch[1] ?? jsonMatch[0]) : raw.trim();

  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonString);
  } catch (err) {
    return {
      ok: false,
      reason: "parse_failure",
      raw,
      error: err instanceof Error ? err.message : "JSON.parse failed",
    };
  }

  const result = schema.safeParse(parsed);
  if (!result.success) {
    return {
      ok: false,
      reason: "validation_failure",
      raw,
      // One "path: message" entry per failed field
      error: result.error.issues
        .map((e) => `${e.path.join(".")}: ${e.message}`)
        .join("; "),
    };
  }

  return { ok: true, data: result.data };
}
```

The `AgentOutput<T>` discriminated union forces the caller to handle both the happy path and the failure paths. You can't accidentally access `output.data` without first checking `output.ok`.

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function analyzeCustomerFeedback(
  feedback: string,
): Promise<AgentOutput<AnalysisResult>> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-5",
    max_tokens: 512,
    system: `You analyze customer feedback. Always respond with JSON matching this schema exactly:
{
  "sentiment": "positive" | "negative" | "neutral",
  "confidence": number between 0 and 1,
  "key_points": array of strings (1-10 items),
  "action_required": boolean,
  "follow_up": optional string
}
No prose. No markdown. Just the JSON object.`,
    messages: [{ role: "user", content: feedback }],
  });

  // Concatenate the text blocks; other block types are ignored
  const rawText = response.content
    .filter((b): b is Anthropic.TextBlock => b.type === "text")
    .map((b) => b.text)
    .join("");

  return parseAgentOutput(rawText, AnalysisResultSchema);
}

// Calling code (inside an async function) handles both outcomes explicitly
const result = await analyzeCustomerFeedback(userFeedback);

if (!result.ok) {
  // Log the failure with full context for debugging
  console.error("Agent output invalid", {
    reason: result.reason,
    error: result.error,
    raw: result.raw.slice(0, 500), // don't log huge payloads
  });
  // Decide what to do: retry, fall back, surface to user, etc.
  return handleValidationFailure(result.reason);
}

// TypeScript knows result.data is AnalysisResult here
const { sentiment, confidence, key_points } = result.data;
```

Not all validation failures are permanent. Sometimes the model produces malformed JSON on the first try but gets it right on a retry. The key is distinguishing which failures are worth retrying.

```typescript
async function analyzeWithRetry(
  feedback: string,
  maxAttempts = 3,
): Promise<AnalysisResult> {
  let lastError = "";
  let attemptsMade = 0;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    attemptsMade = attempt;
    const result = await analyzeCustomerFeedback(feedback);
    if (result.ok) return result.data;

    lastError = result.error ?? result.reason;

    // Don't retry empty responses: something else is wrong
    if (result.reason === "empty_response") break;

    // On validation failure, give the model the error as feedback
    if (attempt < maxAttempts && result.reason === "validation_failure") {
      // Could pass the error back in the next prompt: "Your last response failed
      // validation: {lastError}. Try again."
      console.warn(`Attempt ${attempt} failed validation: ${lastError}`);
      continue;
    }
  }

  // Report the attempts actually made; an empty response exits the loop early
  throw new Error(`Failed after ${attemptsMade} attempt(s). Last error: ${lastError}`);
}
```

The pattern of feeding the validation error back to the model in the retry prompt is particularly effective. Instead of blindly retrying, you're telling the model what went wrong. In my experience this gets you to a valid output on the second attempt about 80% of the time when the first attempt had a validation failure.

When validation fails in production, you need enough information to understand and fix the problem, but not so much that you're logging personally identifiable information or burning storage costs.

```typescript
// Good: structured, queryable, safe
console.error(
  JSON.stringify({
    event: "agent_validation_failure",
    reason: result.reason,
    error_path: result.error, // which field failed
    response_length: result.raw.length,
    response_prefix: result.raw.slice(0, 100), // enough to see the pattern
    model: "claude-sonnet-4-5",
    timestamp: new Date().toISOString(),
  }),
);
```

After a week of production logs, you'll see patterns. Maybe the model consistently omits the `confidence` field for certain categories of input. Maybe it returns arrays as strings when the input contains newlines. Those patterns tell you where to strengthen your prompt or add extra coercion logic.

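
One way to surface those patterns is to tally validation failures by field. The sketch below is illustrative rather than part of the pipeline above: it assumes each log entry carries the `reason` and the `path: message; path: message` error string produced by the validation step, and the `FailureLog` type and `tallyFailuresByField` helper are hypothetical names, not from any library.

```typescript
// Hypothetical shape of one logged failure (an assumption for this sketch).
interface FailureLog {
  reason: string;
  error_path?: string; // e.g. "confidence: Required; key_points.0: Expected string"
}

// Count how many failed responses mention each top-level field.
// Limitation: an error message that itself contains "; " would be split incorrectly.
function tallyFailuresByField(logs: FailureLog[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const log of logs) {
    if (log.reason !== "validation_failure" || !log.error_path) continue;
    const seen: string[] = []; // count each field at most once per response
    for (const entry of log.error_path.split("; ")) {
      // Each entry looks like "path: message"; keep the part before the colon.
      const colon = entry.indexOf(":");
      const path = colon === -1 ? entry : entry.slice(0, colon);
      // Reduce nested paths such as "key_points.0" to their top-level field.
      const dot = path.indexOf(".");
      const field = (dot === -1 ? path : path.slice(0, dot)).trim() || "(root)";
      if (seen.indexOf(field) !== -1) continue;
      seen.push(field);
      counts[field] = (counts[field] ?? 0) + 1;
    }
  }
  return counts;
}

// Made-up sample entries, for illustration only.
const sampleLogs: FailureLog[] = [
  { reason: "validation_failure", error_path: "confidence: Required" },
  {
    reason: "validation_failure",
    error_path:
      "confidence: Number must be less than or equal to 1; key_points.0: Expected string, received number",
  },
  { reason: "parse_failure", error_path: "Unexpected token" },
  { reason: "validation_failure", error_path: "key_points: Expected array, received string" },
];

console.log(tallyFailuresByField(sampleLogs)); // { confidence: 2, key_points: 2 }
```

A field that dominates the tally is a reasonable first candidate for a prompt change or a targeted coercion step. Parse failures are skipped here because their error text is a parser message, not a field path.
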
If Zod feels like overkill, here's the minimal version that still catches the most common failures:

```python
import json
from typing import TypedDict


class AnalysisResult(TypedDict):
    sentiment: str
    confidence: float
    action_required: bool


REQUIRED_KEYS = {"sentiment", "confidence", "action_required"}
VALID_SENTIMENTS = {"positive", "negative", "neutral"}
FENCE = "`" * 3  # a markdown code fence, built here to avoid writing one literally


def parse_analysis(raw: str) -> AnalysisResult | None:
    # Strip code fences if present
    text = raw.strip()
    if text.startswith(FENCE):
        text = text.split(FENCE)[1]
        if text.startswith("json"):
            text = text[4:]

    try:
        data = json.loads(text.strip())
    except json.JSONDecodeError:
        return None

    # Check that we got an object with the required keys
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data.keys()):
        return None

    # Check semantic constraints
    if data["sentiment"] not in VALID_SENTIMENTS:
        return None
    confidence = data["confidence"]
    # bool is a subclass of int, so reject it explicitly
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0 <= confidence <= 1:
        return None

    return data
```

Not as composable as Zod, but it catches the common failure modes: missing keys, wrong enum values, out-of-range numbers.

LLMs are probabilistic. They do not guarantee that their structured output will be valid, even when you ask nicely. A production agent needs a deterministic layer that classifies every output as valid or invalid before any code acts on it. Build that layer first, log its failures, and let the failure data tell you where your prompt needs to improve.

The validation layer doesn't slow you down. It makes your agent debuggable. Without it, you're flying blind.

I cover validation patterns, retry logic, and production reliability in the free Reliable Agent Field Guide: [penloomstudio.com/field-guide.html](https://penloomstudio.com/field-guide.html)