[ARTICLE · art-17059] src=shauryaa.dev pub= topic=ai-safety verified=true sentiment=· neutral

Dirty Frag: a failed public PoC did not mean the server was safe

A university login server at IIT Delhi that appeared secure after blocking a public Dirty Frag exploit was compromised in approximately 90 minutes using a DeepSeek-V4-Flash automated feedback loop. The attacker exploited a missed `pcbc(fcrypt)` kernel crypto path that remained accessible despite the server's mitigations, achieving root access and prompting a proper patch within an hour.

8 min read · published May 29, 2026

a public Dirty Frag PoC failed, so the server looked safe. a cheap DeepSeek-V4-Flash feedback loop found the missed path -> fcrypt mismatch, nscd cache, and root in ~90 minutes.

the short version #

i got root on my university's shared login server. not because the sysadmins were asleep or because the box was some ancient forgotten machine. they were actually fast. they had read the CVE writeups, blocked the recommended kernel crypto interface, disabled unprivileged user namespaces, tested the public PoC, watched it fail, and moved on.

reasonable response tbh. the problem is that public PoCs are not truth oracles. they only tell you one thing: this exact code path, as written, did not work on this exact run. that is not the same as "the system is safe".

so i put DeepSeek-V4-Flash in a boring shell feedback loop on a Lightsail replica:

compile -> run -> read error -> patch -> repeat

about 90 minutes later, the exploit worked on the real server. i reported it, and they patched it properly within an hour. this is mostly a field-trip blog on the whole journey to get root access and how cheap intelligence has gotten if harnessed correctly.

the target #

this was `ssh1.iitd.ac.in`, a shared login box for the electrical engineering department at IIT Delhi. my friend had gotten root on it last year when it was running Linux kernel 4 on Ubuntu 16. ancient era. this time it was not ancient anymore.

initial state:

| thing | value |
| --- | --- |
| kernel | Linux 6.8.0-111-generic, built April 11 2026 |
| distro | |
| sysctl | `kernel.unprivileged_userns_clone = 0` |
| modprobe rules | |

important bit: `AF_ALG` was blocked, but `pcbc(fcrypt)` was still registered in `/proc/crypto`. foreshadowing...
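a minimal sketch of that check, assuming the usual `/proc/crypto` layout where every entry carries a `name : <algorithm>` line. the function names `list_crypto_algs` and `crypto_alg_registered` are made up for illustration and are not from any existing tool. it only reads the file, so it is safe to run as a quick audit after applying a mitigation:

```shell
#!/usr/bin/env bash
# List the crypto algorithms registered with the kernel, one per line.
# The file argument defaults to /proc/crypto; it can be overridden so the
# function also works on a saved copy of the file.
list_crypto_algs() {
    local file="${1:-/proc/crypto}"
    # Each entry has a line such as "name         : pcbc(fcrypt)".
    awk -F': ' '$1 ~ /^name[ \t]*$/ { print $2 }' "$file"
}

# Exit 0 if the exact algorithm name is registered, non-zero otherwise.
crypto_alg_registered() {
    local alg="$1"
    local file="${2:-/proc/crypto}"
    list_crypto_algs "$file" | grep -Fxq -- "$alg"
}
```

usage would look like `crypto_alg_registered 'pcbc(fcrypt)' && echo "still registered"`. a hit here does not prove the box is exploitable. it only says the algorithm is reachable inside the kernel even though the `AF_ALG` socket is blocked.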

background, quickly #

Dirty Frag is a Linux kernel privilege escalation from May 2026. it has two main paths: CVE-2026-43284 -> xfrm ESP path, and CVE-2026-43500 -> RxRPC path.

both get to a page-cache corruption primitive during kernel crypto work. the public exploit uses that to temporarily modify `/etc/passwd`, blank root's password, then `su` to root. very funny family of exploits.

AF_ALG is the kernel's userspace crypto socket API. most writeups said: block `AF_ALG`, block the algif modules, you're good until you patch. this is a really good mitigation, as it stops the public PoC from working.

first attempt #

i first tried Copy Fail, mostly because it had dropped around the same time and looked like the obvious thing to test. it died immediately:

socket(AF_ALG, ...) = -1 EAFNOSUPPORT

fair enough. mitigation worked.

then i tried Dirty Frag.

ESP path -> blocked because unprivileged user namespaces were disabled. RxRPC path -> got further, because creating an `AF_RXRPC` socket caused `rxrpc`, `fcrypt`, and `pcbc` to auto-load. but it still failed at the checksum step because the public PoC expected to use `AF_ALG`.

at this point the obvious conclusion was:

both cves blocked -> public poc failed -> server safe

most people will probably stop there. but `pcbc(fcrypt)` was still in `/proc/crypto`, so the question became: if the kernel can still use the algorithm internally, why is userspace `AF_ALG` the final blocker?

the loop #

this is where deepseek v4 flash, the cheapest model known to mankind, did the useful work.

the loop was boring:

compile -> run -> read stderr + dmesg + return code -> patch code -> repeat
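here is a rough sketch of what such a loop could look like in plain shell. it is not the harness i actually ran: `run_and_report`, `feedback_loop`, and the patch hook are hypothetical names, and the `dmesg` part is left out because reading it usually needs extra privileges. the patch command is just a placeholder for wherever the model gets to edit the code, and it receives the failure report on stdin:

```shell
#!/usr/bin/env bash
# Run one command and print what a patching step would want to see:
# the command line, its return code, and its stderr.
run_and_report() {
    local errfile rc
    errfile="$(mktemp)"
    "$@" >/dev/null 2>"$errfile"
    rc=$?
    printf 'cmd: %s\nrc: %d\nstderr:\n' "$*" "$rc"
    cat "$errfile"
    rm -f "$errfile"
    return "$rc"
}

# feedback_loop MAX BUILD_CMD RUN_CMD PATCH_CMD
# Each *_CMD is a single string executed with `sh -c`.
# PATCH_CMD is the hypothetical hook where a model would edit the code.
feedback_loop() {
    local max="$1" build="$2" run="$3" patch="$4"
    local i=1 report
    while [ "$i" -le "$max" ]; do
        # Stop at the first failing step; its report is kept in $report.
        if report="$(run_and_report sh -c "$build")" &&
           report="$(run_and_report sh -c "$run")"; then
            echo "iteration $i: success"
            return 0
        fi
        echo "iteration $i: failed, handing report to patch step"
        printf '%s\n' "$report" | sh -c "$patch"
        i=$((i + 1))
    done
    echo "gave up after $max iterations"
    return 1
}
```

the iteration cap matters more than it looks. without it a confused patch step can spin forever, and with it every run leaves a short, countable trail of reports to read back later.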

i love boring loops coz they work and i can track them.

also slightly cursed realization: this is just control engineering. this semester i was studying ELL 225: Control Engineering. i did not expect any of that to make my prompting better, but it did.

LLM in a harness feels like a controller:

  • goal prompt -> setpoint
  • shell/tool environment -> plant
  • stderr/dmesg/tests -> sensor
  • code edits -> control input
  • repeated runs -> feedback loop

what the agent found #

1. public PoC fcrypt != kernel fcrypt

this was the actual turn. within about 9 minutes, the agent noticed that the `fcrypt` implementation in the public PoC did not match the kernel's `pcbc(fcrypt-generic)` implementation. different round structure. different key mixing. byte order weirdness.

i will be honest here, i had no idea about this and i was already far from my home waters. dipping my toes first time in cybersecurity.

2. byte order in fcrypt_user_setkey

on the Lightsail replica, the agent fixed the byte order in `fcrypt_user_setkey` so the userspace fallback matched the kernel schedule. small and boring but important fix.

3. POC_NO_UNSHARE

there was also a `POC_NO_UNSHARE` path already sitting in the codebase. using it got past the user namespace setup and moved the exploit from immediate failure to `rc=3`.

i feel stupid for not checking this myself first.

4. nscd

final blocker was nscd. on Ubuntu 24.04, `nscd` can cache passwd lookups. so even after the page-cache corruption modified `/etc/passwd`, PAM could still see the old cached `root:x:0:0:...` entry and reject the blank password. the exploit looked like it failed even when the file had already changed.

fix:

systemctl is-active --quiet nscd && nscd --invalidate passwd

root and cleanup #

final run on the real server was around 05:24 IST on May 22. first shot failed because of nscd. patched that. second shot worked. root shell.

cleanup was straightforward: restored original `/etc/passwd`, unloaded `rxrpc`, `fcrypt`, and `pcbc`, dropped caches, deleted the exploit binary, and checked syslog for the RxRPC unregister messages.

reported to sysadmins within the hour. they patched properly within another hour by upgrading the kernel to `Linux 6.8.0-117-generic`.

the actual failure #

this is not a "lol sysadmins bad" post. they did the obvious mitigation and verified it against the obvious public exploit. problem is that the obvious exploit was not the full search space.

they treated `public PoC failed` the same as `systems are safe`.

those are not the same statement.

three things slipped through:

  • the RxRPC path auto-loaded modules the AF_ALG mitigation did not cover
  • the public PoC had an fcrypt mismatch, so its failure was partially a PoC bug
  • nobody iterated after the first failure

why this matters for agents #

deepseek and other chinese models like glm 5.1 are CHEAP. like, stupid cheap compared to their american counterparts.

for this problem, raw model intelligence was not the bottleneck. it did not need to be a god model like mythos. it needed to compile code, run code, read errors, compare source, and try again. that's it.

this is why i think cheap subagents + feedback loops are underrated for most tasks. the useful question is often not:

can this model solve the whole thing in one shot?

it is:

can this model keep trying sane variants without getting bored?

mythos and other god-tier models are good at seeing the bigger chain. cheap subagents are good at grinding through the local search space once you know where to look.

mitigation #

real fix: patch your kernel.

upstream commits:

-> ESP path: `f4c50a4034e6`

-> RxRPC path: `aa54b1d27fe0`

if you cannot patch immediately, don't only block `AF_ALG`. block protocol modules too:

printf 'install esp4 /bin/false
install esp6 /bin/false
install rxrpc /bin/false
install af_alg /bin/false
' > /etc/modprobe.d/dirtyfrag.conf

rmmod esp4 esp6 rxrpc af_alg 2>/dev/null
sync && echo 3 > /proc/sys/vm/drop_caches
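to sanity-check a config like the one above, something along these lines could flag modules that were left out. this is a sketch under assumptions: `unblocked_modules` is a made-up helper, it only looks at one file, and it matches the module name literally (so it does not handle modprobe treating `-` and `_` as the same character):

```shell
#!/usr/bin/env bash
# Print every module from the argument list that has NO
# "install <module> /bin/false" rule in the given modprobe config file.
# Prints nothing when all listed modules are covered.
unblocked_modules() {
    local conf="$1"
    shift
    local mod
    for mod in "$@"; do
        # Match: optional indent, "install", the module name, "/bin/false".
        if ! grep -Eq "^[[:space:]]*install[[:space:]]+${mod}[[:space:]]+/bin/false[[:space:]]*$" "$conf"; then
            echo "$mod"
        fi
    done
}
```

for example, `unblocked_modules /etc/modprobe.d/dirtyfrag.conf esp4 esp6 rxrpc af_alg` should print nothing if the file matches the rules above. on a live box, `modprobe -n -v rxrpc` is a useful second check, since a dry run shows what modprobe would actually do with all config files taken into account.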

and if you're testing a mitigation, don't stop at `public PoC failed`. run it in a staging environment through a loop. make the agent explain why it failed, then make it try the branch it thinks should still work.

timeline #

| phase | start (IST) | end | duration | active |
| --- | --- | --- | --- | --- |
| Copy Fail attempt + AF_ALG discovery | May 21 23:38 | 23:45 | ~7 min | yes |
| Dirty Frag pivot + first RxRPC test | 23:46 | 23:55 | ~9 min | yes |
| fcrypt mismatch discovery | 23:56 | 00:05 | ~9 min | yes |
| Lightsail testing + byte order fix | May 22 00:05 | 00:20 | ~15 min | yes |
| gap / sleep / other things | 00:20 | 04:05 | ~3h 45m | no |
| final test on real server | 04:05 | 05:24 | ~1h 19m | yes |
| root + cleanup | 05:24 | 05:46 | ~22 min | yes |
| damage assessment + report | 05:46 | 05:50 | ~4 min | yes |

total: about 95 minutes of active agent work, about 10 minutes of my active time, about 90 minutes wall-clock from first prompt to root shell.

closing #

god-tier models like mythos and cheap subagents are two entirely different beasts.

one is good at chaining different small pieces into a bigger exploit path. the other is good at exploring subspaces around a known exploit and checking whether the system is actually secure.

cheap subagents may have a really good use case in defending systems against fresh CVEs:

cve reported -> agents check replica(s) of prod -> findings go to sysadmin -> mitigations get implemented if available

not glamorous. probably useful.
