Your AI Agent Drifted Last Night and You Didn't Notice An engineer has identified three distinct patterns of "agent drift"—gradual degradation in AI agent output quality that occurs without hard failures or schema violations, often going unnoticed for days until a customer complains. The developer argues that the hardest production failures are not crashes but quality degradation between eval checkpoints, requiring continuous runtime detection rather than just pre-deployment testing. After running production agents 24/7, the engineer documented drift patterns including stale context from outdated knowledge bases and API caches, as well as behavioral shifts from silent model updates or prompt injection attempts. Your agent passed every test in CI. It ran fine in staging. Then it quietly started returning subtly wrong answers in production at 2 AM, and nobody noticed until a customer complained three days later. This is agent drift — the gradual degradation of agent output quality without any hard failures. No exceptions thrown. No schema violations. Just slowly worsening responses that slip past your monitoring. Here's my thesis: the hardest production failures aren't crashes — they're quality degradation that happens between your eval checkpoints. You need continuous runtime detection, not just pre-deployment testing. After running production agents 24/7, I've identified three distinct drift patterns: Your agent retrieves documents, knowledge bases, or API responses as context. That context decays: interface StalenessDetector { source: string; maxAgeMs: number; check: context: RetrievedContext = StalenessResult; } const stalenessChecks: StalenessDetector = { source: 'knowledge-base', maxAgeMs: 24 60 60 1000, // 24 hours check: ctx = { const age = Date.now - ctx.lastIndexedAt; const staleChunks = ctx.chunks.filter c = Date.now - c.sourceLastModified 7 24 60 60 1000 ; return { stale: staleChunks.length / ctx.chunks.length 0.3, staleFraction: staleChunks.length / ctx.chunks.length, oldestChunkAge: Math.max ...ctx.chunks.map c = Date.now - c.sourceLastModified , recommendation: staleChunks.length 0 ? ${staleChunks.length} chunks older than 7 days : 'fresh' }; } }, { source: 'api-response-cache', maxAgeMs: 60 60 1000, // 1 hour check: ctx = { const cached = ctx.apiResponses.filter r = r.fromCache ; const expired = cached.filter r = Date.now - r.cachedAt 60 60 1000 ; return { stale: expired.length 0, staleFraction: expired.length / Math.max cached.length, 1 , recommendation: expired.length 0 ? ${expired.length} cached API responses expired : 'fresh' }; } } ; The insidious part: stale context doesn't cause errors. Your agent happily generates confident answers based on outdated information. The output looks fine — it's just wrong. The agent's response patterns shift over time. Maybe the underlying model got a silent update. Maybe prompt injection attempts are subtly reshaping behavior. Maybe token distributions are shifting due to accumulated conversation context. interface DriftBaseline { dimension: string; expectedDistribution: { mean: number; stddev: number }; windowSize: number; } class BehavioralDriftDetector { private baselines: Map