Needle in a haystack: measuring the impact of two nginx RCEs

Two critical heap buffer overflow vulnerabilities in nginx's rewrite engine, CVE-2026-42945 and CVE-2026-9256, allow attackers to trigger remote code execution when specific directive combinations cause the script engine to miscalculate buffer sizes. Researchers at Calif.io built a static vulnerability scanner called ngxray and scanned 35,633 nginx configurations from GitHub to determine how many real-world deployments actually contain the vulnerable patterns. The analysis found that the bugs require uncommon configuration combinations involving flagless rewrite directives followed by set or if statements, making exploitation rare in practice.

Two critical CVEs, 35,633 configs scraped from GitHub, and a question: does anyone actually write nginx configs that trigger these bugs?

We had a lot of fun hacking nginx earlier this year (https://blog.calif.io/p/claude-humans-vs-nginx-cve-2026-27654). We know from experience that finding a real RCE in nginx is hard, especially one that triggers in a default or commonly used configuration. So when F5 disclosed CVE-2026-42945 (https://my.f5.com/manage/s/article/K000161019), better known as nginx-rift, and CVE-2026-9256 (https://my.f5.com/manage/s/article/K000161377), possibly nginx-poolslip, two critical heap buffer overflows in the nginx rewrite engine, the natural question was: how many real-world configurations are actually vulnerable?

To answer that, we built ngxray (https://github.com/califio/ngxray), a static vulnerability scanner for nginx configs, and pointed it at GitHub.

The bugs

Both CVEs are heap buffer overflows in nginx's rewrite-phase script engine. They're distinct bugs, but they share a root cause: the engine sizes a buffer in one pass and fills it in another. A heap overflow arises when certain directive combinations cause the two passes to disagree on how much space is needed.

CVE-2026-42945: the stale flag

When a rewrite replacement contains `?`, the script engine compiles a call to `ngx_http_script_start_args_code` (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1024-L1032), which sets `e->is_args = 1`.
This flag tells the capture-copy function to URI-escape data: `+` becomes `%2B`, a 3x size increase.

When the rewrite finishes, `regex_end_code` (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1195-L1205) resets `e->quote` but, before the fix, did not reset `e->is_args`:

    e->quote = 0;
    // e->is_args = 0;   <-- missing before the fix

If the rewrite has no flag (`last`, `break`, `redirect`, `permanent`), the engine continues to the next directive with the stale flag still set. This creates three distinct overflow scenarios, depending on what comes after the flagless rewrite.

The set case. A subsequent `set $var $1` invokes `ngx_http_script_complex_value_code` (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1752-L1790). This function creates a zeroed sub-engine for the length pass:

    ngx_memzero(&le, sizeof(ngx_http_script_engine_t));   // le.is_args = 0

It measures the buffer at raw capture length. But the copy pass runs through the main engine `e`, where `e->is_args = 1`, so `ngx_http_script_copy_capture_code` (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1373-L1409) applies `ngx_escape_uri` and writes up to 3x more than the buffer holds.

    location ~ ^/api/(.*)$ {
        rewrite ^/api/(.*)$ /internal?migrated=true;
        set $original_endpoint $1;   # $1 copied with stale is_args=1
    }

This is the variant described in the original nginx-rift report.

The if case. The mechanism here is identical to the previous case, albeit with a different syntax. Both funnel the captured argument (e.g. `$1`) through `ngx_http_rewrite_value`.
The `set` handler calls `ngx_http_rewrite_value` (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/modules/ngx_http_rewrite_module.c#L965-L1020) on the assigned value, and the `if`-condition handler calls it on the right-hand side of the comparison (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/modules/ngx_http_rewrite_module.c#L716-L747). When that argument contains a variable, the function emits a `ngx_http_script_complex_value_code`, with its zeroed length sub-engine and stale-`is_args` copy pass. This is the exact vulnerable code path discussed in the set case.

    location ~ ^/api/(.*)$ {
        rewrite ^/api/(.*)$ /internal?migrated=true;
        if ($request_method = $1) {   # $1 on the right-hand side hits the same bug
            return 204;
        }
    }

Not all `if` operators are affected. The `=` and `!=` comparisons send the right-hand side through `ngx_http_rewrite_value`, the same path `set` uses, as do the `-f` / `-d` / `-e` file tests when applied to a capture. The regex operators (`~`, `~*`, `!~`, `!~*`) instead compile it as a regular-expression pattern, a different code path that never builds the mismatched buffer. So `if ($uri ~ $1)` is safe, while `if ($request_method = $1)` is not.

As with the set case, the `if` must appear after the rewrite in source order. If it runs first, `is_args` is still 0 and nothing overflows.

One thing worth noting: `if {}` blocks in nginx's rewrite module compile into the same code array (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/modules/ngx_http_rewrite_module.c#L604-L607) as the parent location. A rewrite inside an `if {}` block and a `set` outside it still execute in the same engine run. The `is_args` flag leaks across the `if` boundary.

The rewrite-chain case. The stale flag can also overflow inside a second rewrite's own replacement. The first rewrite (with `?` and no flag) sets `e->is_args = 1` and continues.
The second rewrite enters `regex_start_code` (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1038), which before the hardening fix did not reset `is_args`.

When the second rewrite has no named variables in its replacement (only `$1`, `$2`, etc.), `regex_start_code` takes a fast path (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1143-L1161) for the length calculation. This fast path doesn't use a sub-engine at all. It computes the buffer size inline, adding each capture's raw byte count directly. Because `is_args` was not reset at the top of the function, the stale flag from the first rewrite is still alive on the main engine `e`. The copy pass then calls `ngx_http_script_copy_capture_code` for each `$N`. That function checks `e->is_args`, sees it's 1, and applies `ngx_escape_uri`. The length pass measured raw bytes, but the copy pass writes escaped bytes. This results in the same mismatch as the set case, just inside a different code path.

    location / {
        rewrite ^/(.*)$ /stage/$1?x=1;                 # sets is_args, no flag
        rewrite ^/stage/(.*)$ /destination/$1 break;   # $1 sized raw, copied escaped
    }

This variant is harder to trigger in practice because the URI produced by the first rewrite must actually match the second rewrite's regex. If the first rewrites to `/index.php` and the second expects `^/admin/(.*)`, they'll never chain.

In all three cases, the request must contain bytes that expand under URI escaping (like `+` becoming `%2B`) in the captured portion. The escaping is gated on `e->request->quoted_uri || e->request->plus_in_uri` (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1355-L1357). Without escapable characters, the size/copy mismatch is zero and no overflow occurs.
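
The size/copy disagreement in the set case is easier to see with numbers. The following is a toy Python model, not nginx code: the function names are hypothetical, and it assumes `+` is the only byte that needs escaping, whereas `ngx_escape_uri` covers a larger set. It only illustrates how a length pass run with `is_args = 0` and a copy pass run with `is_args = 1` come to disagree.

```python
# Toy model of the two-pass mismatch; names are hypothetical, not nginx APIs.
# Assumption: "+" is the only byte that gets escaped (to "%2B").

def length_pass(capture: bytes, is_args: bool) -> int:
    """Bytes the sizing pass reserves for one capture."""
    if is_args:
        # Each "+" grows from 1 byte to 3 bytes, so add 2 per occurrence.
        return len(capture) + 2 * capture.count(b"+")
    return len(capture)


def copy_pass(capture: bytes, is_args: bool) -> bytes:
    """Bytes the copy pass actually writes for one capture."""
    if is_args:
        return capture.replace(b"+", b"%2B")
    return capture


def overflow_bytes(capture: bytes, length_is_args: bool, copy_is_args: bool) -> int:
    """How many bytes the copy pass writes beyond what was reserved."""
    reserved = length_pass(capture, length_is_args)
    written = len(copy_pass(capture, copy_is_args))
    return max(0, written - reserved)


if __name__ == "__main__":
    capture = b"+" * 24
    # Zeroed sub-engine sizes the buffer, stale main engine copies into it.
    print(overflow_bytes(capture, length_is_args=False, copy_is_args=True))
    # Both passes agree on the flag: no overflow.
    print(overflow_bytes(capture, length_is_args=True, copy_is_args=True))
    # Nothing escapable in the capture: no overflow even with the stale flag.
    print(overflow_bytes(b"users", length_is_args=False, copy_is_args=True))
```

With 24 plus signs the model reserves 24 bytes and writes 72, so it reports 48 bytes past the end. When both passes see the same flag, or the capture contains nothing to escape, the difference is zero.
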

CVE-2026-9256: the budget undercount

This one lives in the fast path of `regex_start_code` (https://github.com/nginx/nginx/blob/6e14e954aaacce9a433d9b07b4653809c7594ab8/src/http/ngx_http_script.c#L1143-L1161), which handles rewrites where the replacement has no named variables. Before the fix (https://github.com/nginx/nginx/commit/ca4f92a27464ae6c2082245e4f67048c633aa032), the length calculation budgeted escape space once over the entire URI:

    e->buf.len += 2 * ngx_escape_uri(NULL, r->uri.data, r->uri.len, NGX_ESCAPE_ARGS);

Then it added each capture's raw byte count. But when capture groups are nested, like `^/((.*))$`, `$1` and `$2` cover the same URI bytes. The copy pass escapes those bytes once per `$N` reference, exceeding the budget.

    rewrite ^/((.*))$ http://backend/$1$2 redirect;

The rewrite must trigger URI escaping (`redirect`, `permanent`, `http://...`, or `?` in the replacement), and the replacement must reference positional captures whose groups contain each other.

Scraping GitHub

Unfortunately, GitHub doesn't have a "give me all nginx configs" button. nginx configurations can be found not just in .conf files, but also inside Dockerfiles, shell heredocs, Jinja2 templates, ERB, Puppet manifests, Kubernetes ConfigMaps, Helm values, and Markdown documentation. A naive search for `filename:nginx.conf` misses most of the surface area.

Our collector (https://github.com/califio/ngxray/blob/main/corpus_tools/collect_github_nginx_corpus.py) runs over 100 distinct GitHub Code Search queries:

- Direct configs: `language:Nginx`, filenames like nginx.conf and default.conf, paths under conf.d/ and sites-available/
- Template formats: .j2, .erb, .tmpl, .mustache
- Embedded configs: Dockerfiles with COPY or heredocs writing to /etc/nginx, Kubernetes YAML with nginx ConfigMap data
- Documentation: Markdown and RST with fenced nginx code blocks

Each query is paginated up to GitHub's 10-page limit. Results are deduplicated by content hash.
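
Deduplicating by content hash can be sketched as follows. This is a minimal illustration rather than the collector's actual code: the function name is hypothetical, and hashing the raw bytes with SHA-256 is an assumption, since the choice of hash and any normalization step are not stated here.

```python
import hashlib


def dedupe_by_content(files: dict[str, bytes]) -> dict[str, bytes]:
    """Keep the first path seen for each distinct file body.

    Hypothetical helper: assumes SHA-256 over the exact bytes, so two files
    that differ only in whitespace still count as different.
    """
    seen: set[str] = set()
    unique: dict[str, bytes] = {}
    for path, body in files.items():
        digest = hashlib.sha256(body).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[path] = body
    return unique


if __name__ == "__main__":
    downloads = {
        "repo-a/nginx.conf": b"server { listen 80; }\n",
        "repo-b/conf.d/default.conf": b"server { listen 80; }\n",  # same bytes
        "repo-c/nginx.conf": b"server { listen 8080; }\n",
    }
    print(sorted(dedupe_by_content(downloads)))
```

Hashing the body rather than comparing paths matters here because the same config is often vendored into many repositories under different names.
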
When the collector encounters a Dockerfile, it follows COPY sources back into the same repository to fetch the referenced config files. We made every part of the run resumable, because GitHub's rate limits mean you'll hit a wall eventually.

The raw downloads then pass through an extraction pipeline (https://github.com/califio/ngxray/blob/main/corpus_tools/extract_nginx_configs.py) that separates the nginx config from the wrapper content surrounding it, and strips out any unsupported features, like Jinja templates. What comes out the other end are clean .conf files that an nginx parser can actually tokenize.

The final corpus: 35,633 parseable nginx configurations from thousands of GitHub repositories.

Parsing with nginx's own tokenizer

The parser/ directory in ngxray contains a standalone C program that compiles nginx's actual tokenizer (`ngx_conf_read_token` and `ngx_conf_parse` from src/core/ngx_conf_file.c) against a patched handler. We patched `ngx_conf_handler` to log and output the parsed syntax tree:

    ngx_int_t
    conf_handler(ngx_conf_t *cf, ngx_int_t last)
    {
        // Records every directive into a JSON syntax tree
        // instead of dispatching to nginx modules
        node = conf_node_create(tree, cf);
        conf_node_append(tree->current, node);
        ...
    }

By reusing nginx's tokenizer, we avoid reinventing the wheel while ensuring our scanner's results match real-world observations.

The rule engine

The scanner loads vulnerability signatures from JSON rule files. Each rule specifies which directives to match, structural constraints, and semantic checks specific to the vulnerability.

For CVE-2026-42945, `max_args: 2` enforces the no-flag requirement. A flagged rewrite has 3 args (regex, replacement, flag), so any rewrite with more than 2 args is safe. `ordered: true` ensures the rewrite appears before the `set` in source order.

For CVE-2026-9256, the `overlapping_refs` check does actual PCRE parsing.
It maps each `$N` reference in the replacement back to its capture group's position in the regex, then checks whether any two referenced groups physically contain each other. `not_regex: "\\$[a-zA-Z]"` ensures no named variables appear, which would force the slow path.

We wrote rules covering both CVEs: three variants of CVE-2026-42945 (the set, if, and rewrite-chain cases) and CVE-2026-9256. Each rule carries embedded test cases that the scanner validates on every run with `python3 scan.py --test`.

Results

The scanner flagged configs across several dozen repositories. The majority turned out to be PoC reproductions, scanner test fixtures, and tutorial snippets. After triage, the hits fell into four buckets:

One real vulnerable config. point/cassea (https://github.com/point/cassea), a PHP MVC framework, ships an nginx vhost config with a language-routing rewrite chain. Here's the relevant section of the `location /` block:

    set $controller index;
    rewrite '^([^\.?&]*[^/])([?&].*)?$' $1/$2;
    rewrite '^/([a-z]{2})(/.*)$' $2?_lang=$1;      <-- sets is_args, no flag
    rewrite '^(.*)/([?&].*)?$' $1/index.xml$2;
    if ($uri ~ '^/([^/\.]{3,})(/.*)$') {
        set $controller $1;                        <-- $1 copied with stale is_args
    }

The language rewrite on line 3 strips a two-letter prefix like /en/... and appends `?_lang=en`. It has no flag, so the script engine continues with `e->is_args = 1`. The `if` block below it extracts a controller name from the rewritten URI. The `set $controller $1` inside that `if` runs through `complex_value_code` with the stale flag.

The question is whether `$1` inside the `if` can contain escapable characters. The `if` regex is `'^/([^/\.]{3,})(/.*)$'`, where the first capture group matches three or more characters that aren't `/` or `.`. That includes `+`. A request to /en/++++++++++++++++++++++++/whatever passes through the language rewrite (stripping /en), producing /++++++++++++++++++++++++/whatever?_lang=en. The `if` regex then matches, capturing ++++++++++++++++++++++++ into `$1`.
The `set` sizes the buffer at 24 raw bytes, but the copy pass escapes each `+` to `%2B`, writing 72 bytes. We built a minimal reproduction and ran it in Docker against nginx compiled with AddressSanitizer:

    ==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x511000001b48
    SUMMARY: AddressSanitizer: heap-buffer-overflow src/core/ngx_string.c:1689 in ngx_escape_uri

The project itself is abandoned: a PHP5 framework last updated in 2011, 3 stars, zero forks, homepage offline. As far as we can tell, nobody is running this specific config. But the pattern it uses, language prefix stripping via flagless rewrite with `?`, is a legitimate design that someone could independently arrive at.

Documentation and tutorials. A handful of repos contained the vulnerable pattern inside Markdown exercise files and blog posts. Anyone who copies these snippets into a real config inherits the bug. One recurring example is an image-processing tutorial:

    rewrite ^/images/([a-z]{2})/([a-z0-9]{5})/(.*)\.(png|jpg|gif)$ /data?file=$3.$4;
    set $image_file $3;

Two Chinese-language nginx tutorial repos had this pattern. We confirmed it crashes with a request to /images/en/ab12c/+++...+++.jpg, where `$3` captures the plus signs and the stale `is_args` does the rest.

PoC and lab environments. About a dozen repos were intentional CVE reproductions: nginx-rift-private-lab, CVE-2026-42945, cve-2026-42945-nginx32-lab, and so on. These all use the standard `/api/(.*)` trigger from the original advisory. They're doing exactly what they're supposed to do.

Scanner test fixtures. Four repos were test cases for other nginx linting tools, with files named vulnerable.conf and bad.conf.

The chain variant

The rewrite-chain variant deserves separate mention, because it shows how the triage pipeline works. The scanner produced 29 raw matches.
Then the filters kicked in:

| Stage                              | Count |
|------------------------------------|-------|
| Raw chain-rule matches             | 29    |
| After `$scheme://` redirect filter | 28    |
| After literal-prefix filter        | 7     |
| After manual review                | 0     |

The `$scheme://` filter catches rewrites where the replacement starts with `http://` or `$scheme`. These are implicit redirects, so nginx returns a 3xx and stops processing. No chaining occurs. The literal-prefix filter compares the first rewrite's output URI against the second rewrite's regex: if the first rewrites to /index.php and the second requires `^/admin/ads/edit/`, they can't chain.

The remaining 7 findings all had second regexes starting with a capture group, which the scanner can't rule out statically. Manual review killed all of them. One config rewrites to /journo but the second regex requires `^/([a-zA-Z0-9]+-...)/rss$`, and /journo has no `-` or `/rss` suffix. Another rewrites to /index.php but the second regex is `^/@(\w+)/(following|followers)`, and /index.php doesn't start with `/@`.

What this means

We are living through the first AI Bugmageddon, and it has produced a lot of noise alongside real findings. We've contributed to some of that noise ourselves, so we are not in a position to judge anyone. But that's exactly why this kind of triage matters: defenders need to know which CVEs apply to their infrastructure and which ones they can deprioritize.

In this instance, the bugs are real and exploitable, but their real-world impact is likely low. Both CVEs rely on config patterns that almost never appear in production: CVE-2026-42945 requires a flagless rewrite with `?` followed by `set` or `if` referencing positional captures; CVE-2026-9256 requires nested capture groups where the replacement references multiple overlapping groups. Out of 35,633 configs, we found one vulnerable config, in an abandoned project.

The caveat is that GitHub skews toward examples, tutorials, and small projects.
Complex rewrite chains for language routing or URL migration tend to live in private infrastructure repos and configuration management systems that never touch public GitHub. The point/cassea pattern, language prefix stripping via a flagless `?` rewrite, is a reasonable multilingual design that any organization could independently arrive at.

That said, these are still unauthenticated heap overflows. One vulnerable config in production is enough to cause denial of service or worse.

Try it

ngxray (https://github.com/califio/ngxray) is open source. Point it at your configs:

    git clone https://github.com/califio/ngxray && cd ngxray
    git submodule update --init && make
    python3 scan.py /etc/nginx/

If you're running nginx < 1.31.1, check your rewrite directives. Look for flagless rewrites with `?` in the replacement followed by `set` or `if` using `$1`-`$9`. Look for rewrite regexes with nested capture groups whose `$N` references overlap. Or just run the scanner.
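
For a quick first pass by hand, the first of those checks can be approximated with a crude line-based filter. The sketch below is not ngxray's rule engine and does not use nginx's tokenizer: `suspicious_lines` is a hypothetical helper that assumes one directive per line and unquoted regexes without spaces, ignores block boundaries and includes, and skips the `-f` / `-d` / `-e` file tests. Expect both false positives and false negatives.

```python
import re

# Hypothetical pre-filter, not part of ngxray. Assumes one directive per line.
# rewrite <regex> <replacement> [flag];
REWRITE = re.compile(r"^\s*rewrite\s+(\S+)\s+(\S+)\s*(\S*)\s*;")
# set $var <value containing $1-$9>
SET_USE = re.compile(r"^\s*set\s+\$\w+\s+.*\$[1-9]")
# if (<lhs> = or != <rhs containing $1-$9>); regex operators are left out
IF_USE = re.compile(r"^\s*if\s*\(\s*\S+\s*!?=\s*.*\$[1-9]")


def suspicious_lines(config_text: str) -> list[int]:
    """Return 1-based line numbers of set/if lines that use $1-$9 after a
    flagless rewrite whose replacement contains '?'."""
    hits = []
    stale = False  # set once a flagless rewrite with '?' has been seen
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        line = line.split("#", 1)[0]  # naive comment strip; breaks on '#' in regexes
        match = REWRITE.match(line)
        if match:
            _regex, replacement, flag = match.groups()
            if "?" in replacement and not flag:
                stale = True  # over-approximation: never cleared, even across blocks
            continue
        if stale and (SET_USE.search(line) or IF_USE.search(line)):
            hits.append(lineno)
    return hits


if __name__ == "__main__":
    sample = """\
location ~ ^/api/(.*)$ {
    rewrite ^/api/(.*)$ /internal?migrated=true;
    set $original_endpoint $1;
}
"""
    print(suspicious_lines(sample))
```

Anything this flags still needs a look in context, and anything it misses is exactly why the scanner parses configs with nginx's own tokenizer instead of regular expressions.
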