What Happens in 2 Milliseconds: Anatomy of a Single HTTP Request Through a Production WAF

A production web application firewall (WAF) processing over 100,000 daily requests across 50 client websites uses a six-stage pipeline that prioritizes cheap checks first and expensive ones last, with IP reputation running before the rule engine to avoid unnecessary regex matches. The system's IP reputation module relies on three in-memory O(1) data structures, including a scores map with a 24-hour half-life decay that stabilizes at 40,000-50,000 entries through hourly eviction of low-score IPs. The developer noted that using MaxMind GeoLite for ASN lookups from the start would have avoided several weeks of noisier detection caused by manually maintained CIDR lists.

The rule engine is not the hard part. Everyone builds a rule engine. The hard part is deciding what order the checks run in, because the difference between a hash map lookup and a regex match is two orders of magnitude, and you're doing this on every single request.

Six-stage pipeline. Production. 50+ client websites, 100K+ daily requests. I'll trace one request through all of it.

    POST /api/login HTTP/1.1
    Host: client-website.com
    User-Agent: python-requests/2.28.0
    Content-Type: application/json
    X-Forwarded-For: 185.220.101.45

    {"username":"admin' OR '1'='1' --","password":"anything"}

Four problems: Tor exit node IP, automation library User-Agent, no Accept header, SQL injection payload. The stage 4 rule match is what gets it blocked. But all six stages matter.

    func (waf *WAF) Handle(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ctx := &RequestContext{
                IP:    extractIP(r),
                Start: time.Now(),
            }

            // Stage 1: IP reputation. Cheapest check, runs first.
            ipScore := waf.reputation.Score(ctx.IP)
            ctx.Score += ipScore
            if ctx.Score >= 100 {
                waf.block(w, r, ctx, Decision{Code: 403, Reason: "blocklist"})
                return
            }

            // Stage 2: Rate limiting
            if allowed := waf.limiter.Allow(ctx.IP); !allowed {
                ctx.Score += 25
                ctx.RateLimited = true
            }

            // Stage 3: Header inspection
            headerScore, hardBlock := waf.inspectHeaders(r)
            ctx.Score += headerScore
            if hardBlock != "" {
                waf.block(w, r, ctx, Decision{Code: 400, Reason: hardBlock})
                return
            }

            // Stage 4: Rule engine. Most expensive, runs last.
            body, _ := io.ReadAll(r.Body)
            r.Body = io.NopCloser(bytes.NewReader(body)) // restore the body for the next handler
            ctx.Matches = waf.rules.Evaluate(r, body)

            // Stage 5: Decision
            if d := waf.decide(ctx); d.Block {
                waf.block(w, r, ctx, d)
                return
            }

            next.ServeHTTP(w, r)
        })
    }

Cheap checks first, expensive last. If IP reputation kills the request, the rule engine never runs. At 100K req/day that ordering shows up measurably in CPU.
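To get a feel for the cost gap behind that ordering, here is a minimal timing sketch. It is not the production code: `blocklist`, `sqli`, `isBlocked`, `matchesRule`, and `perOp` are hypothetical stand-ins, and the absolute numbers depend on the machine, the pattern, and the body size, so no output is shown.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// Hypothetical stand-ins for the WAF's blocklist and one of its rules.
var blocklist = map[string]struct{}{"185.220.101.45": {}}
var sqli = regexp.MustCompile(`\bunion\b.{0,30}\bselect\b`)

// isBlocked is the stage 1 style check: a single hash map lookup.
func isBlocked(ip string) bool {
	_, ok := blocklist[ip]
	return ok
}

// matchesRule is the stage 4 style check: a regex scan over the body.
func matchesRule(body []byte) bool {
	return sqli.Match(body)
}

// perOp runs fn n times and returns the average wall-clock time per call.
func perOp(n int, fn func()) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		fn()
	}
	return time.Since(start) / time.Duration(n)
}

func main() {
	body := []byte(`{"username":"admin","password":"correct horse battery staple"}`)
	const n = 200000

	lookup := perOp(n, func() { isBlocked("203.0.113.7") })
	regex := perOp(n, func() { matchesRule(body) })

	fmt.Printf("map lookup: %v/op\nregex scan: %v/op\n", lookup, regex)
}
```

A real measurement would use `go test -bench` with realistic bodies and the full rule set; the loop above only illustrates why the lookup goes first.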
Stage 1: IP reputation

Three map lookups, all in memory, all O(1), plus a short linear scan over a CIDR list:

    type IPReputation struct {
        mu        sync.RWMutex
        blocklist map[string]struct{}
        torExits  map[string]struct{}
        cidrs     []*net.IPNet // datacenter/hosting ranges
        scores    map[string]scoredIP
    }

    type scoredIP struct {
        score    int
        lastSeen time.Time
    }

    func (ipr *IPReputation) Score(ip string) int {
        ipr.mu.RLock()
        defer ipr.mu.RUnlock()

        if _, ok := ipr.blocklist[ip]; ok {
            return 100
        }
        if _, ok := ipr.torExits[ip]; ok {
            return 70
        }

        parsed := net.ParseIP(ip)
        for _, cidr := range ipr.cidrs {
            if cidr.Contains(parsed) {
                return 55 // datacenter origin: not necessarily malicious, but not a browser
            }
        }

        if entry, ok := ipr.scores[ip]; ok {
            days := time.Since(entry.lastSeen).Hours() / 24
            // Score halves every 24h
            return int(float64(entry.score) * math.Pow(0.5, days))
        }
        return 0
    }

Score decays at a 24-hour half-life. IPs rotate. Cloud provider ranges get reassigned. Treating a month-old signal the same as a current one tanks precision: false positives climb until the system is more noise than signal.

We should have used MaxMind GeoLite from day one instead of maintaining a CIDR list manually. We added hosting ranges reactively, after seeing attacks rather than before, and missed several in the first few months. Proper ASN lookups would have caught those automatically. That gap cost us a few weeks of noisier detection early on.

Left alone, the scores map grows without bound. A background goroutine runs every hour and evicts entries whose decayed score has dropped below 5. In production the map stabilized around 40-50k entries.

185.220.101.45 matches the Tor exit list. Score: 70. Continue.

Stage 2: Rate limiting

Sliding window, not fixed. Fixed windows have a boundary exploit: 59 requests at :59, 59 more at :00, and 118 requests get through a 60-request limit. The sliding window always covers the last N seconds. There's no boundary to game.
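The boundary exploit is easy to reproduce with a toy fixed-window counter. The sketch below is illustrative only: `fixedWindow` and `burst` are hypothetical names, timestamps are plain Unix seconds, and the 60-per-minute limit is taken from the example in the text.

```go
package main

import "fmt"

// fixedWindow is a deliberately naive limiter (hypothetical, for illustration):
// it counts requests per wall-clock window and resets at each boundary.
type fixedWindow struct {
	limit      int
	windowSecs int64
	bucket     int64 // index of the window currently being counted
	count      int
}

// allow reports whether a request arriving at unix second ts is admitted.
func (f *fixedWindow) allow(ts int64) bool {
	b := ts / f.windowSecs
	if b != f.bucket {
		f.bucket, f.count = b, 0 // boundary crossed: counter resets
	}
	if f.count >= f.limit {
		return false
	}
	f.count++
	return true
}

// burst sends n requests at second ts and returns how many were admitted.
func burst(f *fixedWindow, ts int64, n int) int {
	admitted := 0
	for i := 0; i < n; i++ {
		if f.allow(ts) {
			admitted++
		}
	}
	return admitted
}

func main() {
	f := &fixedWindow{limit: 60, windowSecs: 60, bucket: -1}
	// 59 requests in the last second of one minute, 59 in the first second of the next.
	total := burst(f, 119, 59) + burst(f, 120, 59)
	fmt.Println(total) // 118
}
```

Both bursts arrive within two seconds of each other, yet each lands in a fresh window, so neither is throttled.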
    type SlidingWindow struct {
        mu      sync.Mutex
        entries map[string]*ipWindow
        limit   int
        window  time.Duration
    }

    type ipWindow struct {
        timestamps []int64 // nanoseconds: 8 bytes vs 24 for time.Time
    }

    func (sw *SlidingWindow) Allow(ip string) bool {
        sw.mu.Lock()
        defer sw.mu.Unlock()

        now := time.Now().UnixNano()
        cutoff := now - sw.window.Nanoseconds()

        w := sw.entries[ip]
        if w == nil {
            w = &ipWindow{}
            sw.entries[ip] = w
        }

        // Prune in place: avoids allocating a new slice on every call
        n := 0
        for _, t := range w.timestamps {
            if t > cutoff {
                w.timestamps[n] = t
                n++
            }
        }
        w.timestamps = w.timestamps[:n]

        if len(w.timestamps) >= sw.limit {
            return false
        }
        w.timestamps = append(w.timestamps, now)
        return true
    }

Threshold: 60 requests per 10-second window. This attacker sent 847.

Rate limit alone doesn't block at this stage. +25 to score, continue. A misconfigured load balancer looks identical to a rate violation: same IP, high request count. The system needs the full picture before making a hard call. Rate limit plus anything else usually crosses the block threshold.

Score: 70 + 25 = 95. Continue.

Stage 3: Header inspection

Real browsers are consistent. They send Accept, Accept-Language, Accept-Encoding. Their User-Agent follows recognizable patterns. Automation libraries don't replicate this, not because attackers are careless, but because python-requests, httpx, go-http-client don't send a browser-like header set by default, and most attackers don't bother faking them.
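As one concrete case, this sketch serializes a request built with Go's standard library and checks which headers it carries. It uses `http.Request.Write`, which emits the request in wire format and fills in the default `Go-http-client/1.1` User-Agent; the real transport would also add `Accept-Encoding: gzip` when sending. `wireRequest` and `hasHeader` are hypothetical helper names, and the URL and body are placeholders.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

// wireRequest builds a POST the way a typical script would, setting only
// Content-Type, and returns the request as it would be serialized.
func wireRequest() (string, error) {
	req, err := http.NewRequest(http.MethodPost,
		"http://client-website.com/api/login",
		strings.NewReader(`{"username":"u","password":"p"}`))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/json")

	var buf bytes.Buffer
	if err := req.Write(&buf); err != nil { // wire-format serialization, no network
		return "", err
	}
	return buf.String(), nil
}

// hasHeader reports whether the serialized request contains the named header.
func hasHeader(raw, name string) bool {
	return strings.Contains(raw, "\r\n"+name+": ")
}

func main() {
	raw, err := wireRequest()
	if err != nil {
		panic(err)
	}
	for _, h := range []string{"User-Agent", "Content-Type", "Accept", "Accept-Language", "Referer"} {
		fmt.Printf("%-16s present=%v\n", h, hasHeader(raw, h))
	}
}
```

The request identifies itself as an automation client and carries none of the Accept, Accept-Language, or Referer headers a browser would send.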
    var automationSignatures = []string{
        "python-requests", "python-urllib", "go-http-client",
        "libwww-perl", "java/", "curl/", "wget/",
        "sqlmap", "nikto", "masscan", "zgrab",
        "scrapy", "aiohttp", "httpx", "mechanize",
    }

    func (waf *WAF) inspectHeaders(r *http.Request) (score int, hardBlock string) {
        // Header injection is a hard block regardless of score, so it is
        // checked before any early return.
        for _, values := range r.Header {
            for _, v := range values {
                if strings.ContainsAny(v, "\r\n") {
                    return 0, "header injection"
                }
            }
        }

        ua := r.Header.Get("User-Agent")
        if ua == "" {
            return 40, ""
        }

        uaLow := strings.ToLower(ua)
        for _, sig := range automationSignatures {
            if strings.Contains(uaLow, sig) {
                score += 30
                break
            }
        }

        if r.Header.Get("Accept") == "" {
            score += 15
        }

        // POST from a browser almost always carries a Referer
        if r.Method == http.MethodPost && r.Header.Get("Referer") == "" {
            score += 10
        }

        return score, ""
    }

Header injection is the only hard block at this stage. \r\n in a header value is never legitimate: it can split HTTP responses and poison downstream caches. Everything else is scored and accumulated.

We evaluated TLS fingerprinting (JA3): comparing cipher suite and extension order from the TLS handshake, which browsers expose consistently and scripts don't. Decided against it. It requires TLS termination at the WAF layer or integration with nginx's ssl fingerprint module, and it's brittle across library versions. The coupling cost wasn't worth it at our traffic volume. Worth revisiting at scale.

python-requests/2.28.0: +30. No Accept: +15. No Referer on POST: +10.

Score: 95 + 55 = 150, capped at 100. Continue.

Stage 4: Rule engine

Most expensive stage. Runs last.

Pre-compile at startup. regexp.MustCompile is not free. Calling it per request at 100K req/day is burning CPU for no reason. All patterns compile once on server start, stored as *regexp.Regexp struct fields, reused across every request.

Normalize before matching. Attackers don't send raw OR '1'='1'. They URL-encode it, double-encode it, or split it across fields.
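To make the evasion concrete, the sketch below shows the same payload after zero, one, and two rounds of percent-encoding, and whether a raw-payload pattern still sees it. The `naive` pattern is a simplified stand-in rather than one of the production rules, and `encodeTimes` and `rawMatch` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// naive is a simplified stand-in for a rule that inspects the raw payload only.
var naive = regexp.MustCompile(`(?i)\bor\b\s+'1'='1`)

// encodeTimes percent-encodes s the given number of times.
func encodeTimes(s string, rounds int) string {
	for i := 0; i < rounds; i++ {
		s = url.QueryEscape(s)
	}
	return s
}

// rawMatch applies the pattern without any normalization.
func rawMatch(s string) bool {
	return naive.MatchString(s)
}

func main() {
	payload := "' OR '1'='1"
	for rounds := 0; rounds <= 2; rounds++ {
		enc := encodeTimes(payload, rounds)
		fmt.Printf("rounds=%d  %-38s raw match=%v\n", rounds, enc, rawMatch(enc))
	}
}
```

Only the unencoded form matches. One round turns the payload into `%27+OR+%271%27%3D%271`, and a second round escapes the percent signs themselves, which is why a single decoding pass is not enough.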
A rule engine that only looks at the raw payload misses most real attacks.

    func normalize(input []byte) []byte {
        // First pass
        s, err := url.QueryUnescape(string(input))
        if err != nil {
            s = string(input)
        }
        // Second pass: catches double-encoding
        s2, err := url.QueryUnescape(s)
        if err != nil {
            s2 = s
        }
        return []byte(strings.ToLower(s2))
    }

Then the rules:

    type Rule struct {
        ID       string
        Pattern  *regexp.Regexp
        Severity int    // 1-4; severity 4 = block unconditionally regardless of score
        Target   Target // Body, URL, or both
    }

    // Compiled at init, never at request time
    var coreRules = []Rule{
        {
            ID:       "SQLI-001",
            Pattern:  regexp.MustCompile(`\bor\b\s+['"]?\w+['"]?\s*=\s*['"]?\w+['"]?`),
            Severity: 4,
            Target:   TargetBody,
        },
        {
            ID:       "SQLI-002",
            Pattern:  regexp.MustCompile(`--|#|/\*`),
            Severity: 3,
            Target:   TargetBody,
        },
        {
            ID:       "SQLI-003",
            Pattern:  regexp.MustCompile(`\bunion\b.{0,30}\bselect\b`),
            Severity: 4,
            Target:   TargetBody | TargetURL,
        },
        {
            ID:       "XSS-001",
            Pattern:  regexp.MustCompile(`<script[\s/>]|javascript\s*:`),
            Severity: 4,
            Target:   TargetBody | TargetURL,
        },
        {
            ID:       "PATH-001",
            Pattern:  regexp.MustCompile(`(\.\.[\\/]){2,}`),
            Severity: 3,
            Target:   TargetURL,
        },
        {
            ID:       "CMD-001",
            Pattern:  regexp.MustCompile(`[;|&]\s*(cat|ls|whoami|id|wget|curl)\b`),
            Severity: 4,
            Target:   TargetBody | TargetURL,
        },
    }

    func (e *RuleEngine) Evaluate(r *http.Request, body []byte) []*Match {
        normBody := normalize(body)
        normURL := normalize([]byte(r.URL.RawQuery + r.URL.Path))

        var matches []*Match
        for _, rule := range e.rules {
            // Check every target the rule names, so a Body|URL rule
            // inspects the URL as well as the body.
            var targets [][]byte
            if rule.Target&TargetBody != 0 {
                targets = append(targets, normBody)
            }
            if rule.Target&TargetURL != 0 {
                targets = append(targets, normURL)
            }
            for _, target := range targets {
                if loc := rule.Pattern.Find(target); loc != nil {
                    matches = append(matches, &Match{Rule: rule, At: loc})
                    break // one match per rule is enough
                }
            }
        }
        return matches
    }

After normalization, the body reads as: {"username":"admin' or '1'='1' --","password":"anything"}.

SQLI-001 fires on or '1'='1'. SQLI-002 fires on --. Two matches. SQLI-001 is severity 4. Score is irrelevant: block unconditionally.

Stage 5: Decision

Thin layer. Accumulated context in, decision out.
Complexity here is where subtle edge cases live and where probing exploits get found.

    func (waf *WAF) decide(ctx *RequestContext) Decision {
        // Severity-4 match: score doesn't matter
        for _, m := range ctx.Matches {
            if m.Rule.Severity == 4 {
                return Decision{Block: true, Code: 403, Reason: m.Rule.ID}
            }
        }
        // High score + any rule match: block
        if ctx.Score >= 80 && len(ctx.Matches) > 0 {
            return Decision{Block: true, Code: 403, Reason: "score+rules"}
        }
        // Rate limited, no rule match: 429, not 403
        if ctx.RateLimited && len(ctx.Matches) == 0 {
            return Decision{Block: true, Code: 429, Reason: "rate-limit"}
        }
        return Decision{Block: false}
    }

The 403 vs 429 distinction is operational. Repeated 429s from the same IP often turn out to be misconfigured clients or internal tooling; 403s with rule matches are almost always actual attacks. The alerting pipeline treats them differently, which matters at 2am when you're deciding whether to page someone.

Verdict: Block, 403, SQLI-001.

Stage 6: Logging

Response goes out before logging. Logging is I/O. I/O is slow. Those two facts mean the log write cannot touch the response path.
    type WAF struct {
        logCh chan IncidentLog // buffered
        // remaining fields (reputation, limiter, rules, sink, metrics) not shown
    }

    func NewWAF(cfg Config) *WAF {
        w := &WAF{
            logCh: make(chan IncidentLog, 4096),
        }
        go w.logWorker()
        return w
    }

    func (waf *WAF) logWorker() {
        for entry := range waf.logCh {
            waf.sink.Write(entry) // JSON to disk + forward to alert pipeline
        }
    }

    func (waf *WAF) block(w http.ResponseWriter, r *http.Request, ctx *RequestContext, d Decision) {
        // Response first
        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(d.Code)
        w.Write([]byte(`{"error":"Forbidden"}`))

        // Log asynchronously with a non-blocking send
        select {
        case waf.logCh <- IncidentLog{
            Timestamp:   time.Now().UTC(),
            IP:          ctx.IP,
            Method:      r.Method,
            Path:        r.URL.Path,
            Score:       ctx.Score,
            Matches:     ctx.Matches,
            RateLimited: ctx.RateLimited,
            Decision:    d,
            LatencyMs:   float64(time.Since(ctx.Start).Microseconds()) / 1000,
        }:
        default:
            // Channel full: drop the entry, track the drop count separately
            waf.metrics.LogDropped.Inc()
        }

        go waf.reputation.Increment(ctx.IP, 20)
    }

The select with default is intentional. If the log channel fills (writer goroutine falling behind, usually disk I/O saturation during a large attack), drop the log entry rather than stall HTTP responses. Track the drop counter as a separate metric and alert on it. In 8 months of production this happened once, during a coordinated multi-client attack that was also saturating the disk writer. Logging should never affect response latency, even under that load.

The attacker gets:

    HTTP/1.1 403 Forbidden
    Content-Type: application/json

    {"error":"Forbidden"}

No indication of which rule fired. Nothing actionable. The less information a 403 carries, the harder the system is to probe.

At peak (~180 req/s across all clients), the WAF added a median 0.8ms latency to allowed requests. p99: 3.2ms. Blocked requests averaged 1.9ms; they exit earlier in the pipeline. Memory at steady state: ~90MB for the reputation map, rate limiter state, and rule engine combined.
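The drop-instead-of-block behavior of the log channel can be exercised in isolation. This is a minimal sketch, not the production logger: `asyncLog` and `tryLog` are hypothetical names, the entries are plain strings, and the drop counter is a bare `atomic.Int64` (Go 1.19+) rather than a metrics client.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// asyncLog is a minimal stand-in for the WAF's log path (names are hypothetical).
type asyncLog struct {
	ch      chan string
	dropped atomic.Int64
}

// tryLog enqueues entry without ever blocking; it reports false on a drop.
func (l *asyncLog) tryLog(entry string) bool {
	select {
	case l.ch <- entry:
		return true
	default:
		l.dropped.Add(1) // buffer full: count the drop and move on
		return false
	}
}

func main() {
	// Tiny buffer and no consumer: simulates a writer that has stalled.
	l := &asyncLog{ch: make(chan string, 4)}
	for i := 0; i < 10; i++ {
		l.tryLog(fmt.Sprintf("incident-%d", i))
	}
	fmt.Println(len(l.ch), l.dropped.Load()) // 4 6
}
```

With a stalled consumer, the first four entries fill the buffer and the remaining six are counted as drops, while every call returns immediately.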
Over 8 months: 25% reduction in breach incidents across client websites, 35% faster detection from attack onset to alert. The detection improvement came almost entirely from centralized structured logging: correlating patterns across 50+ clients simultaneously instead of treating each site's logs as a separate silo.

Two things I'd rebuild differently.

First: MaxMind GeoLite for ASN-level blocking from the start. Maintaining a CIDR list manually is reactive by nature and you're always a step behind.

Second: weight rule matches by position in the payload. A pattern found deep inside a multi-part encoded body is more likely to be deliberate evasion than one sitting in a raw field. That distinction should influence severity scoring, and currently it doesn't.

Personal site: https://bhavyyadav25.github.io
GitHub: https://github.com/Bhavyyadav25
LinkedIn: https://linkedin.com/in/yadavbhavy

Backend engineer. Go, distributed systems, security infrastructure.