{"slug": "what-happens-in-2-milliseconds-anatomy-of-a-single-http-request-through-a-waf", "title": "What Happens in 2 Milliseconds: Anatomy of a Single HTTP Request Through a Production WAF", "summary": "A production web application firewall (WAF) processing over 100,000 daily requests across 50 client websites uses a six-stage pipeline that prioritizes cheap checks first and expensive ones last, with IP reputation running before the rule engine to avoid unnecessary regex matches. The system's IP reputation module relies on three in-memory O(1) data structures, including a scores map with a 24-hour half-life decay that stabilizes at 40,000-50,000 entries through hourly eviction of low-score IPs. The developer noted that using MaxMind GeoLite for ASN lookups from the start would have avoided several weeks of noisier detection caused by manually maintained CIDR lists.", "body_md": "The rule engine is not the hard part. Everyone builds a rule engine. The hard part is deciding what order the checks run in — because the difference between a hash map lookup and a regex match is two orders of magnitude, and you're doing this on every single request.\n\nSix-stage pipeline. Production. 50+ client websites, 100K+ daily requests. I'll trace one request through all of it.\n\n```http\nPOST /api/login HTTP/1.1\nHost: client-website.com\nUser-Agent: python-requests/2.28.0\nContent-Type: application/json\nX-Forwarded-For: 185.220.101.45\n\n{\"username\":\"admin' OR '1'='1' --\",\"password\":\"anything\"}\n```\n\nFour problems: Tor exit node IP, automation library User-Agent, no Accept header, SQL injection payload. It gets blocked at stage 4. 
But all six stages matter.\n\n```\nfunc (waf *WAF) Handle(next http.Handler) http.Handler {\n    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {\n        ctx := &RequestContext{\n            IP:    extractIP(r),\n            Start: time.Now(),\n        }\n\n        // Stage 1: IP reputation — cheapest check, runs first\n        ipScore := waf.reputation.Score(ctx.IP)\n        ctx.Score += ipScore\n        if ctx.Score >= 100 {\n            waf.block(w, r, ctx, Decision{Code: 403, Reason: \"blocklist\"})\n            return\n        }\n\n        // Stage 2: Rate limiting\n        if allowed := waf.limiter.Allow(ctx.IP); !allowed {\n            ctx.Score += 25\n            ctx.RateLimited = true\n        }\n\n        // Stage 3: Header inspection\n        headerScore, hardBlock := waf.inspectHeaders(r)\n        ctx.Score += headerScore\n        if hardBlock != \"\" {\n            waf.block(w, r, ctx, Decision{Code: 400, Reason: hardBlock})\n            return\n        }\n\n        // Stage 4: Rule engine — most expensive, runs last\n        body, _ := io.ReadAll(r.Body)\n        r.Body = io.NopCloser(bytes.NewReader(body))\n        ctx.Matches = waf.rules.Evaluate(r, body)\n\n        // Stage 5: Decision\n        if d := waf.decide(ctx); d.Block {\n            waf.block(w, r, ctx, d)\n            return\n        }\n\n        next.ServeHTTP(w, r)\n    })\n}\n```\n\nCheap checks first, expensive last. If IP reputation kills the request, the rule engine never runs. 
At 100K req/day that ordering shows up measurably in CPU.\n\nThree hash maps, all in memory, all O(1), plus a manually maintained CIDR list that is scanned linearly:\n\n```\ntype IPReputation struct {\n    mu        sync.RWMutex\n    blocklist map[string]struct{}\n    torExits  map[string]struct{}\n    cidrs     []*net.IPNet // datacenter/hosting ranges\n    scores    map[string]scoredIP\n}\n\ntype scoredIP struct {\n    score    int\n    lastSeen time.Time\n}\n\nfunc (ipr *IPReputation) Score(ip string) int {\n    ipr.mu.RLock()\n    defer ipr.mu.RUnlock()\n\n    if _, ok := ipr.blocklist[ip]; ok {\n        return 100\n    }\n    if _, ok := ipr.torExits[ip]; ok {\n        return 70\n    }\n\n    parsed := net.ParseIP(ip)\n    for _, cidr := range ipr.cidrs {\n        if cidr.Contains(parsed) {\n            return 55 // datacenter origin — not necessarily malicious, but not a browser\n        }\n    }\n\n    if entry, ok := ipr.scores[ip]; ok {\n        days := time.Since(entry.lastSeen).Hours() / 24\n        // Score halves every 24h\n        return int(float64(entry.score) * math.Pow(0.5, days))\n    }\n\n    return 0\n}\n```\n\nScore decays at a 24-hour half-life. IPs rotate. Cloud provider ranges get reassigned. Treating a month-old signal the same as a current one tanks precision — false positives climb until the system is more noise than signal.\n\nWe should have used MaxMind GeoLite from day one instead of maintaining a CIDR list manually. We added hosting ranges reactively — after seeing attacks rather than before — and missed several in the first few months. Proper ASN lookups would have caught those automatically. That gap cost us a few weeks of noisier detection early on.\n\nThe scores map has an unbounded growth problem. A background goroutine evicts entries with decayed scores below 5, running every hour. In production the map stabilized around 40-50k entries.\n\n**185.220.101.45 matches the Tor exit list. Score: 70. Continue.**\n\nSliding window, not fixed. 
Fixed windows have a boundary exploit: 59 requests at :59, 59 more at :00 — 118 requests through a 60-request limit. The sliding window always covers the last N seconds. There's no boundary to game.\n\n```\ntype SlidingWindow struct {\n    mu      sync.Mutex\n    entries map[string]*ipWindow\n    limit   int\n    window  time.Duration\n}\n\ntype ipWindow struct {\n    timestamps []int64 // nanoseconds — 8 bytes vs 24 for time.Time\n}\n\nfunc (sw *SlidingWindow) Allow(ip string) bool {\n    sw.mu.Lock()\n    defer sw.mu.Unlock()\n\n    now := time.Now().UnixNano()\n    cutoff := now - sw.window.Nanoseconds()\n\n    w := sw.entries[ip]\n    if w == nil {\n        w = &ipWindow{}\n        sw.entries[ip] = w\n    }\n\n    // Prune in-place — avoids allocating a new slice on every call\n    n := 0\n    for _, t := range w.timestamps {\n        if t > cutoff {\n            w.timestamps[n] = t\n            n++\n        }\n    }\n    w.timestamps = w.timestamps[:n]\n\n    if len(w.timestamps) >= sw.limit {\n        return false\n    }\n\n    w.timestamps = append(w.timestamps, now)\n    return true\n}\n```\n\nThreshold: 60 requests per 10-second window. This attacker sent 847.\n\nRate limit alone doesn't block. +25 to score, continue. A misconfigured load balancer looks identical to a rate violation — same IP, high request count. The system needs the full picture before making a hard call. Rate limit plus anything else usually crosses the block threshold.\n\n**Score: 70 + 25 = 95. Continue.**\n\nReal browsers are consistent. They send `Accept`, `Accept-Language`, `Accept-Encoding`. Their User-Agent follows recognizable patterns. 
Automation libraries don't replicate this — not because attackers are careless, but because `python-requests`, `httpx`, `go-http-client` don't send browser headers by default, and most attackers don't bother faking them.\n\n```go\nvar automationSignatures = []string{\n    \"python-requests\", \"python-urllib\", \"go-http-client\",\n    \"libwww-perl\", \"java/\", \"curl/\", \"wget/\",\n    \"sqlmap\", \"nikto\", \"masscan\", \"zgrab\", \"scrapy\",\n    \"aiohttp\", \"httpx\", \"mechanize\",\n}\n\nfunc (waf *WAF) inspectHeaders(r *http.Request) (score int, hardBlock string) {\n    ua := r.Header.Get(\"User-Agent\")\n    if ua == \"\" {\n        return 40, \"\"\n    }\n\n    uaLow := strings.ToLower(ua)\n    for _, sig := range automationSignatures {\n        if strings.Contains(uaLow, sig) {\n            score += 30\n            break\n        }\n    }\n\n    if r.Header.Get(\"Accept\") == \"\" {\n        score += 15\n    }\n\n    // POST from a browser almost always carries a Referer\n    if r.Method == http.MethodPost && r.Header.Get(\"Referer\") == \"\" {\n        score += 10\n    }\n\n    // Header injection is a hard block regardless of score\n    for _, values := range r.Header {\n        for _, v := range values {\n            if strings.ContainsAny(v, \"\\r\\n\") {\n                return 0, \"header injection\"\n            }\n        }\n    }\n\n    return score, \"\"\n}\n```\n\nHeader injection is the only hard block at this stage. `\\r\\n` in a header value is never legitimate — it can split HTTP responses and poison downstream caches. Everything else is scored and accumulated.\n\nWe evaluated TLS fingerprinting (JA3) — comparing cipher suite and extension order from the TLS handshake, which browsers expose consistently and scripts don't. Decided against it. It requires TLS termination at the WAF layer or integration with nginx's `ssl_fingerprint` module, and it's brittle across library versions. 
The coupling cost wasn't worth it at our traffic volume. Worth revisiting at scale.\n\n**python-requests/2.28.0: +30. No Accept: +15. No Referer on POST: +10. Score: 95 + 55 = 150, capped at 100. Continue.**\n\nMost expensive stage. Runs last.\n\n**Pre-compile at startup.** `regexp.MustCompile` is not free. Calling it per request at 100K req/day is burning CPU for no reason. All patterns compile once on server start, stored as `*regexp.Regexp` struct fields, reused across every request.\n\n**Normalize before matching.** Attackers don't send raw `OR '1'='1'`. They URL-encode it, double-encode it, or split it across fields. A rule engine that only looks at the raw payload misses most real attacks.\n\n```\nfunc normalize(input []byte) []byte {\n    // First pass\n    s, err := url.QueryUnescape(string(input))\n    if err != nil {\n        s = string(input)\n    }\n    // Second pass — catches double-encoding\n    s2, err := url.QueryUnescape(s)\n    if err != nil {\n        s2 = s\n    }\n    return []byte(strings.ToLower(s2))\n}\n```\n\nThen the rules:\n\n```\ntype Rule struct {\n    ID       string\n    Pattern  *regexp.Regexp\n    Severity int    // 1–4; severity 4 = block unconditionally regardless of score\n    Target   Target // Body, URL, or both\n}\n\n// Compiled at init() — never at request time\nvar coreRules = []*Rule{\n    {\n        ID:       \"SQLI-001\",\n        Pattern:  regexp.MustCompile(`\\bor\\b\\s+['\"]?\\w+['\"]?\\s*=\\s*['\"]?\\w+['\"]?`),\n        Severity: 4,\n        Target:   TargetBody,\n    },\n    {\n        ID:       \"SQLI-002\",\n        Pattern:  regexp.MustCompile(`(--|#|/\\*)`),\n        Severity: 3,\n        Target:   TargetBody,\n    },\n    {\n        ID:       \"SQLI-003\",\n        Pattern:  regexp.MustCompile(`\\bunion\\b.{0,30}\\bselect\\b`),\n        Severity: 4,\n        Target:   TargetBody | TargetURL,\n    },\n    {\n        ID:       \"XSS-001\",\n        Pattern:  
regexp.MustCompile(`<script[\\s/>]|javascript\\s*:`),\n        Severity: 4,\n        Target:   TargetBody | TargetURL,\n    },\n    {\n        ID:       \"PATH-001\",\n        Pattern:  regexp.MustCompile(`(\\.\\.[\\\\/]){2,}`),\n        Severity: 3,\n        Target:   TargetURL,\n    },\n    {\n        ID:       \"CMD-001\",\n        Pattern:  regexp.MustCompile(`[;|&]\\s*(cat|ls|whoami|id|wget|curl)\\b`),\n        Severity: 4,\n        Target:   TargetBody | TargetURL,\n    },\n}\n\nfunc (e *RuleEngine) Evaluate(r *http.Request, body []byte) []*Match {\n    normBody := normalize(body)\n    normURL := normalize([]byte(r.URL.RawQuery + r.URL.Path))\n\n    var matches []*Match\n    for _, rule := range e.rules {\n        // A rule may target the body, the URL, or both; check each selected input.\n        if rule.Target&TargetBody != 0 {\n            if loc := rule.Pattern.Find(normBody); loc != nil {\n                matches = append(matches, &Match{Rule: rule, At: loc})\n                continue // one match per rule is enough\n            }\n        }\n        if rule.Target&TargetURL != 0 {\n            if loc := rule.Pattern.Find(normURL); loc != nil {\n                matches = append(matches, &Match{Rule: rule, At: loc})\n            }\n        }\n    }\n    return matches\n}\n```\n\nAfter normalization, the body reads as: `{\"username\":\"admin' or '1'='1' --\",\"password\":\"anything\"}`.\n\nSQLI-001 fires on `or '1'='1'`. SQLI-002 fires on `--`. Two matches. SQLI-001 is severity 4. Score is irrelevant — block unconditionally.\n\nThin layer. Accumulated context in, decision out. 
Complexity here is where subtle edge cases live and where probing exploits get found.\n\n```\nfunc (waf *WAF) decide(ctx *RequestContext) Decision {\n    // Severity-4 match: score doesn't matter\n    for _, m := range ctx.Matches {\n        if m.Rule.Severity == 4 {\n            return Decision{Block: true, Code: 403, Reason: m.Rule.ID}\n        }\n    }\n\n    // High score + any rule match: block\n    if ctx.Score >= 80 && len(ctx.Matches) > 0 {\n        return Decision{Block: true, Code: 403, Reason: \"score+rules\"}\n    }\n\n    // Rate limited, no rule match: 429, not 403\n    if ctx.RateLimited && len(ctx.Matches) == 0 {\n        return Decision{Block: true, Code: 429, Reason: \"rate-limit\"}\n    }\n\n    return Decision{Block: false}\n}\n```\n\nThe 403 vs 429 distinction is operational. Repeated 429s from the same IP often turn out to be misconfigured clients or internal tooling; 403s with rule matches are almost always actual attacks. The alerting pipeline treats them differently, which matters at 2am when you're deciding whether to page someone.\n\n**Verdict: Block, 403, SQLI-001.**\n\nResponse goes out before logging. Logging is I/O. I/O is slow. 
Those two facts mean the log write cannot touch the response path.\n\n```\ntype WAF struct {\n    logCh chan IncidentLog // buffered\n}\n\nfunc NewWAF(cfg Config) *WAF {\n    w := &WAF{\n        logCh: make(chan IncidentLog, 4096),\n    }\n    go w.logWorker()\n    return w\n}\n\nfunc (waf *WAF) logWorker() {\n    for entry := range waf.logCh {\n        waf.sink.Write(entry) // JSON to disk + forward to alert pipeline\n    }\n}\n\nfunc (waf *WAF) block(w http.ResponseWriter, r *http.Request, ctx *RequestContext, d Decision) {\n    // Response first\n    w.Header().Set(\"Content-Type\", \"application/json\")\n    w.WriteHeader(d.Code)\n    w.Write([]byte(`{\"error\":\"Forbidden\"}`))\n\n    // Log asynchronously — non-blocking send\n    select {\n    case waf.logCh <- IncidentLog{\n        Timestamp:   time.Now().UTC(),\n        IP:          ctx.IP,\n        Method:      r.Method,\n        Path:        r.URL.Path,\n        Score:       ctx.Score,\n        Matches:     ctx.Matches,\n        RateLimited: ctx.RateLimited,\n        Decision:    d,\n        LatencyMs:   float64(time.Since(ctx.Start).Microseconds()) / 1000,\n    }:\n    default:\n        // Channel full — drop the entry, track the drop count separately\n        waf.metrics.LogDropped.Inc()\n    }\n\n    go waf.reputation.Increment(ctx.IP, 20)\n}\n```\n\nThe `select` with `default` is intentional. If the log channel fills — writer goroutine falling behind, usually disk I/O saturation during a large attack — drop the log entry rather than stall HTTP responses. Track the drop counter as a separate metric and alert on it. In 8 months of production this happened once, during a coordinated multi-client attack that was also saturating the disk writer. Logging should never affect response latency, even under that load.\n\nThe attacker gets:\n\n```\nHTTP/1.1 403 Forbidden\nContent-Type: application/json\n\n{\"error\":\"Forbidden\"}\n```\n\nNo indication of which rule fired. Nothing actionable. 
The less information a 403 carries, the harder the system is to probe.\n\nAt peak (~180 req/s across all clients), the WAF added a median 0.8ms latency to allowed requests. p99: 3.2ms. Blocked requests averaged 1.9ms — they exit earlier in the pipeline. Memory at steady state: ~90MB for the reputation map, rate limiter state, and rule engine combined.\n\nOver 8 months: 25% reduction in breach incidents across client websites, 35% faster detection from attack onset to alert. The detection improvement came almost entirely from centralized structured logging — correlating patterns across 50+ clients simultaneously instead of treating each site's logs as a separate silo.\n\nTwo things I'd rebuild differently. First: MaxMind GeoLite for ASN-level blocking from the start. Maintaining a CIDR list manually is reactive by nature and you're always a step behind. Second: weight rule matches by position in the payload. A pattern found deep inside a multi-part encoded body is more likely to be deliberate evasion than one sitting in a raw field — that distinction should influence severity scoring, and currently it doesn't.\n\nWant more deep-dive backend stories?\n\nI regularly write about:\n\nGo internals and performance\n\nbackend system design\n\nbuilding open-source tools\n\nreal-world optimization stories\n\nCheck out my personal site: [https://bhavyyadav25.github.io](https://bhavyyadav25.github.io)\n\nYou can also find me on:\n\nGitHub: [https://github.com/Bhavyyadav25](https://github.com/Bhavyyadav25)\n\nLinkedIn: [https://linkedin.com/in/yadavbhavy](https://linkedin.com/in/yadavbhavy)\n\n*Backend engineer. 
Go, distributed systems, security infrastructure.*", "url": "https://wpnews.pro/news/what-happens-in-2-milliseconds-anatomy-of-a-single-http-request-through-a-waf", "canonical_source": "https://dev.to/bhavyyadav25/what-happens-in-2-milliseconds-anatomy-of-a-single-http-request-through-a-production-waf-4bcm", "published_at": "2026-05-31 00:14:29+00:00", "updated_at": "2026-05-31 00:42:13.147742+00:00", "lang": "en", "topics": ["ai-infrastructure", "mlops", "ai-products", "ai-tools"], "entities": ["WAF", "Tor"], "alternates": {"html": "https://wpnews.pro/news/what-happens-in-2-milliseconds-anatomy-of-a-single-http-request-through-a-waf", "markdown": "https://wpnews.pro/news/what-happens-in-2-milliseconds-anatomy-of-a-single-http-request-through-a-waf.md", "text": "https://wpnews.pro/news/what-happens-in-2-milliseconds-anatomy-of-a-single-http-request-through-a-waf.txt", "jsonld": "https://wpnews.pro/news/what-happens-in-2-milliseconds-anatomy-of-a-single-http-request-through-a-waf.jsonld"}}