I skipped the LLM and built a 9-rule deterministic diagnosis engine for my performance monitoring SaaS

A developer built CacheSnap, a performance monitoring SaaS that probes URLs from 8 AWS Lambda regions and uses a 9-rule deterministic diagnosis engine instead of an LLM to identify CDN caching issues. The engine provides actionable fixes in sub-millisecond time, avoiding the latency, correctness, and testability problems of LLM-based approaches.

Most developers don't know if their CDN is actually caching in Tokyo. They check the dashboard, see a green dot, assume everything is fine. Meanwhile, every request from Asia is hitting their origin in Frankfurt because a CDN config never propagated. TTFB is 800ms. Users are leaving. Nobody noticed because the uptime monitor only checks "is the site up?", not "is it fast from where your users actually are?"

That's what I built CacheSnap (https://cachesnap.com) to fix. It probes your URLs from 8 AWS Lambda regions every few minutes and, instead of showing you raw headers, tells you what's wrong and what to do about it.

This post is about the two pieces of engineering that make that work: the deterministic diagnosis engine and the Redis-gated scheduler.

Before getting into implementation, it helps to understand what we're targeting. When CacheSnap detects a problem, a card like this appears on the dashboard:

    ⚠ CRITICAL · sa-east São Paulo
    Cache MISS: origin server is being consulted for every request in sa-east.
    Action: Add Cache-Control: public, s-maxage=300 to your response headers. For Next.js, use export const revalidate = 300 in your page.
    Estimated gain: ~450ms

No headers to decode. No raw JSON to interpret. The cause is a sentence. The fix is two lines of config. The gain is a number.

Getting from raw probe data to that card is the job of the diagnosis engine.
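To make the shape of that card concrete, here is a minimal Rust sketch of the four parts it carries: severity, region, cause, action, and estimated gain. The names `Severity`, `DiagnosisCard`, and `example_card` are hypothetical, chosen for illustration; they are not CacheSnap's actual types, and the warning glyph is left out of the rendering for simplicity.

```rust
use std::fmt;

// Hypothetical severity levels; the post only shows CRITICAL.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Info,
    Warning,
    Critical,
}

// Hypothetical card type: one sentence of cause, one action, one number.
pub struct DiagnosisCard {
    pub severity: Severity,
    pub region: String,
    pub cause: String,
    pub action: String,
    // None when no gain estimate is available (an assumption of this sketch).
    pub estimated_gain_ms: Option<u32>,
}

impl fmt::Display for DiagnosisCard {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let label = match self.severity {
            Severity::Info => "INFO",
            Severity::Warning => "WARNING",
            Severity::Critical => "CRITICAL",
        };
        writeln!(f, "{} · {}", label, self.region)?;
        writeln!(f, "{}", self.cause)?;
        writeln!(f, "Action: {}", self.action)?;
        if let Some(ms) = self.estimated_gain_ms {
            write!(f, "Estimated gain: ~{}ms", ms)?;
        }
        Ok(())
    }
}

// Builds the São Paulo card from the example above.
pub fn example_card() -> DiagnosisCard {
    DiagnosisCard {
        severity: Severity::Critical,
        region: "sa-east São Paulo".to_string(),
        cause: "Cache MISS: origin server is being consulted for every request in sa-east."
            .to_string(),
        action: "Add Cache-Control: public, s-maxage=300 to your response headers."
            .to_string(),
        estimated_gain_ms: Some(450),
    }
}
```

Calling `println!("{}", example_card())` prints the card as four lines of text. Keeping the card as plain structured data means the engine that produces it can be tested without touching the dashboard.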
A probe returns something like this:

    TTFB: 480ms
    Cache-Status: MISS
    HTTP: HTTP/2
    Redirects: 0
    Served-By: 87c1d4a2b3c4d5e6-IAD   ← Cloudflare CF-Ray header

That's a measurement. It tells you what happened, not why or what to do. The gap between "480ms TTFB" and "your CDN isn't caching: here's the exact config line to fix it" is where most monitoring tools stop.

The obvious path is to feed the data to an LLM and let it generate the explanation. I spent a week thinking seriously about this and decided against it. Three reasons:

1. Volume and latency. Diagnosis runs on every probe ingest. With 50 monitors × 8 regions × 1-minute intervals, that's 400 diagnosis calls per minute at steady state. An LLM call averaging 800ms would add more latency to the pipeline than the performance problems it's diagnosing. Diagnosis needs to be sub-millisecond.

2. Correctness. An LLM will generate plausible advice regardless of whether it's applicable. It might say "try adding a Cache-Control header" when one already exists and the problem is a CDN misconfiguration. A rule engine is wrong in known, fixable ways: you can write a test for every mistake it makes.

3. Testability. I want diagnose(input) to be a pure function with deterministic output I can run in CI. The priority between rules is a product decision: "cache MISS beats anycast mismatch" is something I can assert and lock down. With an LLM, that isn't possible.

The alternative: a priority-ordered rule table. Each rule maps an observable condition to a structured diagnosis. Rules evaluate top-to-bottom; first match wins.

The core types:

pub struct DiagnosisInput { pub ttfb_ms: Option