{"slug": "needle-in-a-haystack-measuring-the-impact-of-two-nginx-rces", "title": "Needle in a haystack: measuring the impact of two nginx RCEs", "summary": "Two critical heap buffer overflow vulnerabilities in nginx's rewrite engine, CVE-2026-42945 and CVE-2026-9256, allow attackers to trigger remote code execution when specific directive combinations cause the script engine to miscalculate buffer sizes. Researchers at Calif.io built a static vulnerability scanner called ngxray and scanned 35,633 nginx configurations from GitHub to determine how many real-world deployments actually contain the vulnerable patterns. The analysis found that the bugs require uncommon configuration combinations involving flagless rewrite directives followed by set or if statements, making exploitation rare in practice.", "body_md": "# Needle in a haystack: measuring the impact of two nginx RCEs\n\n### Two critical CVEs, 35,633 configs scraped from GitHub, and a question: does anyone actually write nginx configs that trigger these bugs?\n\nWe had a lot of fun [hacking nginx earlier this year](https://blog.calif.io/p/claude-humans-vs-nginx-cve-2026-27654). We know from experience that finding a real RCE in nginx is hard, especially one that triggers in a default or commonly used configuration.\n\nSo when F5 disclosed [CVE-2026-42945](https://my.f5.com/manage/s/article/K000161019) (better known as `nginx-rift`) and [CVE-2026-9256](https://my.f5.com/manage/s/article/K000161377) (possibly `nginx-poolslip`), two critical heap buffer overflows in the nginx rewrite engine, the natural question was: how many real-world configurations are actually vulnerable?\n\nTo answer that, we built [ngxray](https://github.com/califio/ngxray), a static vulnerability scanner for nginx configs, and pointed it at GitHub.\n\n## The bugs\n\nBoth CVEs are heap buffer overflows in nginx's rewrite-phase script engine. 
They're distinct bugs, but they share a root cause: the engine sizes a buffer in one pass and fills it in another. A heap overflow arises when certain directive combinations cause the two passes to disagree on how much space is needed.\n\n### CVE-2026-42945: the stale flag\n\nWhen a `rewrite` replacement contains `?`, the script engine compiles a call to [`ngx_http_script_start_args_code`](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1024-L1032), which sets `e->is_args = 1`. This flag tells the capture-copy function to URI-escape data: `+` becomes `%2B`, a 3x size increase.\n\nWhen the rewrite finishes, [`regex_end_code`](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1195-L1205) resets `e->quote` but, before the fix, did not reset `e->is_args`:\n\n```c\ne->quote = 0;\n// e->is_args = 0;  <-- missing before the fix\n```\n\nIf the rewrite has no flag (`last`, `break`, `redirect`, `permanent`), the engine continues to the next directive with the stale flag still set.\n\nThis creates three distinct overflow scenarios, depending on what comes after the flagless rewrite.\n\n**The set case.** A subsequent `set $var $1` invokes [`ngx_http_script_complex_value_code()`](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1752-L1790). This function creates a zeroed sub-engine for the length pass:\n\n```\nngx_memzero(&le, sizeof(ngx_http_script_engine_t));  // le.is_args = 0\n```\n\nIt measures the buffer at raw capture length. 
But the copy pass runs through the main engine `e` where `e->is_args = 1`, so [`ngx_http_script_copy_capture_code`](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1373-L1409) applies `ngx_escape_uri` and writes up to 3x more than the buffer holds.\n\n```\nlocation ~ ^/api/(.*)$ {\n    rewrite ^/api/(.*)$ /internal?migrated=true;\n    set $original_endpoint $1;    # $1 copied with stale is_args=1\n}\n```\n\nThis is the variant described in the original `nginx-rift` report.\n\n**The if case.** The mechanism here is identical to the previous case, albeit with a different syntax. Both funnel the captured argument (e.g., `$1`) through [`ngx_http_rewrite_value()`](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/modules/ngx_http_rewrite_module.c#L965-L1020). The `set` handler calls it on the assigned value, and the `if`-condition handler calls it on the [right-hand side of the comparison](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/modules/ngx_http_rewrite_module.c#L716-L747).\n\nWhen that argument contains a variable, the function emits a `ngx_http_script_complex_value_code`, with its zeroed length sub-engine and stale-`is_args` copy pass. This is the exact vulnerable code path discussed in the `set` case.\n\n```\nlocation ~ ^/api/(.*)$ {\n    rewrite ^/api/(.*)$ /internal?migrated=true;\n    if ($request_method = $1) {    # $1 on the right-hand side hits the same bug\n        return 204;\n    }\n}\n```\n\nNot all `if` operators are affected. The `=` and `!=` comparisons send the right-hand side through `ngx_http_rewrite_value()`, the same path `set` uses, as do the `-f`/`-d`/`-e` file tests when applied to a capture. 
The regex operators (`~`, `~*`, `!~`, `!~*`) instead compile it as a regular-expression pattern, a different code path that never builds the mismatched buffer. So `if ($uri ~* $1)` is safe, while `if ($request_method = $1)` is not.\n\nAs with the `set` case, the `if` must appear after the rewrite in source order. If it runs first, `is_args` is still 0 and nothing overflows.\n\nOne thing worth noting: `if{}` blocks in nginx's rewrite module [compile into the same code array](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/modules/ngx_http_rewrite_module.c#L604-L607) as the parent location. A rewrite inside an `if{}` block and a `set` outside it still execute in the same engine run. The `is_args` flag leaks across the `if` boundary.\n\n**The rewrite-chain case.** The stale flag can also cause an overflow inside a second rewrite's own replacement. The first rewrite (with `?` and no flag) sets `e->is_args = 1` and continues. The second rewrite enters [`regex_start_code`](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1038), which before the hardening fix did not reset `is_args`.\n\nWhen the second rewrite has no named variables in its replacement (only `$1`, `$2`, etc.), `regex_start_code` takes a [fast path](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1143-L1161) for the length calculation. This fast path doesn't use a sub-engine at all. It computes the buffer size inline, adding each capture's raw byte count directly. Because `is_args` was not reset at the top of the function, the stale flag from the first rewrite is still alive on the main engine `e`.\n\nThe copy pass then calls `ngx_http_script_copy_capture_code` for each `$N`. That function checks `e->is_args`, sees it's 1, and applies `ngx_escape_uri`. 
The length pass measured raw bytes, but the copy pass writes escaped bytes. This results in the same mismatch as the `set` case, just inside a different code path.\n\n```\nlocation / {\n    rewrite ^/(.*)$ /stage/$1?x=1;               # sets is_args, no flag\n    rewrite ^/stage/(.*)$ /destination/$1 break;  # $1 sized raw, copied escaped\n}\n```\n\nThis variant is harder to trigger in practice because the URI produced by the first rewrite must actually match the second rewrite's regex. If the first rewrites to `/index.php` and the second expects `^/admin/(.*)`, they'll never chain.\n\nIn all three cases, the request must contain bytes that expand under URI escaping (like `+` becoming `%2B`) in the captured portion. The escaping is gated on [`e->request->quoted_uri || e->request->plus_in_uri`](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1355-L1357). Without escapable characters, the size/copy mismatch is zero and no overflow occurs.\n\n### CVE-2026-9256: the budget undercount\n\nThis one lives in the fast path of [`regex_start_code`](https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1143-L1161), which handles rewrites where the replacement has no named variables. Before the [fix](https://github.com/nginx/nginx/commit/ca4f92a27464ae6c2082245e4f67048c633aa032), the length calculation budgeted escape space once over the entire URI:\n\n```c\ne->buf.len += 2 * ngx_escape_uri(NULL, r->uri.data, r->uri.len,\n                                  NGX_ESCAPE_ARGS);\n```\n\nThen it added each capture's raw byte count. But when capture groups are nested, like `^/((.*))$`, `$1` and `$2` cover the same URI bytes. 
The copy pass escapes those bytes once per `$N` reference, exceeding the budget.\n\n```nginx\nrewrite ^/((.*))$ http://backend/$1$2 redirect;\n```\n\nThe rewrite must trigger URI escaping (`redirect`, `permanent`, `http://...`, or `?` in the replacement), and the replacement must reference positional captures whose groups contain each other.\n\n## Scraping GitHub\n\nUnfortunately, GitHub doesn't have a \"give me all nginx configs\" button. nginx configurations can be found not just in `.conf` files, but also inside Dockerfiles, shell heredocs, Jinja2 templates, ERB, Puppet manifests, Kubernetes ConfigMaps, Helm values, and Markdown documentation. A naive search for `filename:nginx.conf` misses most of the surface area.\n\nOur [collector](https://github.com/califio/ngxray/blob/main/corpus_tools/collect_github_nginx_corpus.py) runs over 100 distinct GitHub Code Search queries:\n\n- Direct configs: `language:Nginx`, filenames like `nginx.conf` and `default.conf`, paths under `conf.d/` and `sites-available/`\n- Template formats: `.j2`, `.erb`, `.tmpl`, `.mustache`\n- Embedded configs: Dockerfiles with `COPY` or heredocs writing to `/etc/nginx`, Kubernetes YAML with nginx ConfigMap data\n- Documentation: Markdown and RST with fenced nginx code blocks\n\nEach query is paginated up to GitHub's 10-page limit. Results are deduplicated by content hash. When the collector encounters a Dockerfile, it follows `COPY` sources back into the same repository to fetch the referenced config files. 
We made every part of the run resumable, because GitHub's rate limits mean you'll hit a wall eventually.\n\nThe raw downloads then pass through an [extraction pipeline](https://github.com/califio/ngxray/blob/main/corpus_tools/extract_nginx_configs.py) that separates the nginx config from the wrapper content surrounding it, and strips out any unsupported features, like Jinja templates.\n\nWhat comes out the other end are clean `.conf` files that an nginx parser can actually tokenize. The final corpus: **35,633 parseable nginx configurations** from thousands of GitHub repositories.\n\n## Parsing with nginx's own tokenizer\n\nThe `parser/` directory in ngxray contains a standalone C program that compiles nginx's actual tokenizer (`ngx_conf_read_token` and `ngx_conf_parse` from `src/core/ngx_conf_file.c`) against a patched handler. We patched `ngx_conf_handler()` to log and output the parsed syntax tree:\n\n```\nngx_int_t\nconf_handler(ngx_conf_t *cf, ngx_int_t last)\n{\n    // Records every directive into a JSON syntax tree\n    // instead of dispatching to nginx modules\n    node = conf_node_create(tree, cf);\n    conf_node_append(tree->current, node);\n    ...\n}\n```\n\nBy reusing nginx's tokenizer, we avoid reinventing the wheel, while ensuring our scanner's results match real-world observations.\n\n## The rule engine\n\nThe scanner loads vulnerability signatures from JSON rule files. Each rule specifies which directives to match, structural constraints, and semantic checks specific to the vulnerability.\n\nFor CVE-2026-42945, `max_args: 2` enforces the no-flag requirement. A flagged rewrite has 3 args (regex, replacement, flag), so any rewrite with more than 2 args is safe. `ordered: true` ensures the rewrite appears before the `set` in source order.\n\nFor CVE-2026-9256, the `overlapping_refs` check does actual PCRE parsing. 
It maps each `$N` reference in the replacement back to its capture group's position in the regex, then checks whether any two referenced groups physically contain each other. `not_regex: \"\\\\$[a-zA-Z_]\"` ensures no named variables appear, since those would force the slow path.\n\nWe wrote rules covering both CVEs: three variants of CVE-2026-42945 (the `set`, `if`, and rewrite-chain cases) and CVE-2026-9256. Each rule carries embedded test cases that the scanner validates on every run with `python3 scan.py --test`.\n\n## Results\n\nThe scanner flagged configs across several dozen repositories. The majority turned out to be PoC reproductions, scanner test fixtures, and tutorial snippets.\n\nAfter triage, the hits fell into four buckets:\n\n**One real vulnerable config.** [point/cassea](https://github.com/point/cassea), a PHP MVC framework, ships an nginx vhost config with a language-routing rewrite chain. Here's the relevant section of the `location /` block:\n\n```\nset $controller index;\nrewrite '^([^\\.?&]*[^/])([?&#].*)?$' $1/$2;\nrewrite '^/([a-z]{2})(/.*)$' $2?__lang=$1;          # <-- sets is_args, no flag\nrewrite '^(.*)/([?&#].*)?$' $1/index.xml$2;\n\nif ($uri ~* '^/([^/\\.]{3,})(/.*)$') {\n    set $controller $1;                               # <-- $1 copied with stale is_args\n}\n```\n\nThe language rewrite on line 3 of the snippet strips a two-letter prefix like `/en/...` and appends `?__lang=en`. It has no flag, so the script engine continues with `e->is_args = 1`. The `if` block below it extracts a controller name from the rewritten URI. The `set $controller $1` inside that `if` runs through `complex_value_code` with the stale flag.\n\nThe question is whether `$1` inside the `if` can contain escapable characters. The `if` regex is `'^/([^/\\.]{3,})(/.*)$'`, where the first capture group matches three or more characters that aren't `/` or `.`. 
That includes `+`.\n\nA request to `/en/++++++++++++++++++++++++/whatever` passes through the language rewrite (stripping `/en`), producing `/++++++++++++++++++++++++/whatever?__lang=en`. The `if` regex then matches, capturing `++++++++++++++++++++++++` into `$1`. The `set` sizes the buffer at 24 raw bytes, but the copy pass escapes each `+` to `%2B`, writing 72 bytes.\n\nWe built a minimal reproduction and ran it in Docker against nginx compiled with AddressSanitizer:\n\n```\n==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x511000001b48\nSUMMARY: AddressSanitizer: heap-buffer-overflow src/core/ngx_string.c:1689 in ngx_escape_uri\n```\n\nThe project itself is abandoned: a PHP5 framework last updated in 2011, 3 stars, zero forks, homepage offline. As far as we can tell, nobody is running this specific config. But the pattern it uses, language prefix stripping via flagless rewrite with `?`, is a legitimate design that someone could independently arrive at.\n\n**Documentation and tutorials.** A handful of repos contained the vulnerable pattern inside Markdown exercise files and blog posts. Anyone who copies these snippets into a real config inherits the bug. One recurring example is an image-processing tutorial:\n\n```\nrewrite ^/images/([a-z]{2})/([a-z0-9]{5})/(.*)\\.(png|jpg|gif)$ /data?file=$3.$4;\nset $image_file $3;\n```\n\nTwo Chinese-language nginx tutorial repos had this pattern. We confirmed it crashes with a request to `/images/en/ab12c/+++...+++.jpg`, where `$3` captures the plus signs and the stale `is_args` does the rest.\n\n**PoC and lab environments.** About a dozen repos were intentional CVE reproductions: `nginx-rift-private-lab`, `CVE-2026-42945`, `cve-2026-42945-nginx32-lab`, and so on. These all use the standard `/api/(.*)` trigger from the original advisory. 
They're doing exactly what they're supposed to do.\n\n**Scanner test fixtures.** Four repos were test cases for other nginx linting tools, with files named `vulnerable.conf` and `bad.conf`.\n\n### The chain variant\n\nThe rewrite-chain variant deserves separate mention, because it shows how the triage pipeline works.\n\nThe scanner produced 29 raw matches. Then the filters kicked in:\n\n```\n| Stage                              | Count |\n|------------------------------------|-------|\n| Raw chain-rule matches             | 29    |\n| After `$scheme://` redirect filter | 28    |\n| After literal-prefix filter        | 7     |\n| After manual review                | 0     |\n```\n\nThe `$scheme://` filter catches rewrites where the replacement starts with `http://` or `$scheme`. These are implicit redirects, so nginx returns a 3xx and stops processing. No chaining occurs.\n\nThe literal-prefix filter compares the first rewrite's output URI against the second rewrite's regex: if the first rewrites to `/index.php` and the second requires `^/admin/ads/edit/`, they can't chain.\n\nThe remaining 7 findings all had second regexes starting with a capture group, which the scanner can't rule out statically. Manual review killed all of them. One config rewrites to `/journo` but the second regex requires `^/([a-zA-Z0-9]+-...)/rss$`, and `/journo` has no `-` or `/rss` suffix. Another rewrites to `/index.php` but the second regex is `^/@(\\w+)/(following|followers)`, and `/index.php` doesn't start with `/@`.\n\n## What this means\n\nWe are living through the first AI Bugmageddon, and it has produced a lot of noise alongside real findings. We've contributed to some of that noise ourselves, so we are not in a position to judge anyone. 
But that's exactly why this kind of triage matters: defenders need to know which CVEs apply to their infrastructure and which ones they can deprioritize.\n\nIn this instance, the bugs are real and exploitable, but their real-world impact is likely low. Both CVEs rely on config patterns that almost never appear in production: CVE-2026-42945 requires a flagless rewrite with `?` followed by `set` or `if` referencing positional captures; CVE-2026-9256 requires nested capture groups where the replacement references multiple overlapping groups. Out of 35,633 configs, we found one vulnerable config, in an abandoned project.\n\nThe caveat is that GitHub skews toward examples, tutorials, and small projects. Complex rewrite chains for language routing or URL migration tend to live in private infrastructure repos and configuration management systems that never touch public GitHub. The `point/cassea` pattern, language prefix stripping via a flagless `?` rewrite, is a reasonable multilingual design that any organization could independently arrive at.\n\nThat said, these are still unauthenticated heap overflows. One vulnerable config in production is enough to cause denial of service or worse.\n\n## Try it\n\n[ngxray](https://github.com/califio/ngxray) is open source. Point it at your configs:\n\n```\ngit clone https://github.com/califio/ngxray && cd ngxray\ngit submodule update --init && make\npython3 scan.py /etc/nginx/\n```\n\nIf you're running nginx < 1.31.1, check your rewrite directives. Look for flagless rewrites with `?` in the replacement followed by `set` or `if` using `$1`-`$9`. 
Look for rewrite regexes with nested capture groups whose `$N` references overlap.\n\nOr just run the scanner.", "url": "https://wpnews.pro/news/needle-in-a-haystack-measuring-the-impact-of-two-nginx-rces", "canonical_source": "https://blog.calif.io/p/needle-in-a-haystack-measuring-the", "published_at": "2026-05-29 20:27:18+00:00", "updated_at": "2026-05-29 23:56:27.935798+00:00", "lang": "en", "topics": ["ai-research", "ai-tools", "ai-products", "ai-infrastructure", "ai-safety"], "entities": ["F5", "nginx", "GitHub", "ngxray", "CVE-2026-42945", "CVE-2026-9256", "Calif.io"], "alternates": {"html": "https://wpnews.pro/news/needle-in-a-haystack-measuring-the-impact-of-two-nginx-rces", "markdown": "https://wpnews.pro/news/needle-in-a-haystack-measuring-the-impact-of-two-nginx-rces.md", "text": "https://wpnews.pro/news/needle-in-a-haystack-measuring-the-impact-of-two-nginx-rces.txt", "jsonld": "https://wpnews.pro/news/needle-in-a-haystack-measuring-the-impact-of-two-nginx-rces.jsonld"}}