V.E.L.O.C.I.T.Y.-OS: Kimi K2.7 and the 'Safe-Room Security' Illusion (Part 1)

A developer building a bare-metal operating system called V.E.L.O.C.I.T.Y.-OS discovered that Kimi K2.7, a 1-trillion-parameter MoE model, exposed database credentials in generated code because of a 'safe-room security' illusion. The developer built a Gatekeeper security scanner and sandbox verifier that runs before any generated code is committed, forcing automatic self-correction loops.

It all started on June 23rd with a casual post about a VPS Manager benchmark. Out of curiosity, I asked the author of the benchmark, Pascal, if he had tried Cloudflare's new Workers AI offering, specifically Kimi K2.7: a massive 1-trillion-parameter MoE (Mixture of Experts) model that was incredibly cheap ($0.27 per million input tokens) and highly capable at code generation.

Pascal was intrigued. He offered a sharp hypothesis: if a model makes significantly fewer mistakes, the total session cost drops dramatically even if the per-token price is higher. He cited GLM 5.2 as a model that self-corrected multiple bugs during verification to achieve 37/37 tests passing.

Curiosity got the better of me. I spun up my development environment, wrote a custom agent harness, and ran it on Kimi K2.7 using Cloudflare Workers AI. We are building a bare-metal, self-healing operating system running entirely inside the CPU's L3 cache. This post is Part 1 of a 12-part series on that build.

The initial run looked amazing: Kimi completed 19 of the 30 foundation files on my daily free allocation, delivering the cleanest architectural layout of any model tested. In the meantime, though, Pascal had run Kimi K2.7 himself and caught a major security blocker in its DB credential handling. That prompted me to dig into the 19 files from my own Foundry run, only to find the exact same mistake: Kimi had exposed database connection credentials directly in the code.

Pascal pointed out that this wasn't a failure in reasoning; it was a scope failure. Kimi was operating under "safe-room security": it optimized for code correctness against the written spec, assuming it was running in a secure, isolated sandbox rather than a live production environment. Pascal suggested that rather than bloating every single system prompt with complex, instruction-taxing security warnings (which models eventually ignore or drift from), I needed a systematic gateway. That conversation was the spark.
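To make the scope failure concrete, here is a minimal sketch of the kind of check a systematic gateway might run on generated source before accepting it. This is not the code from my gatekeeper.rs: the names `Finding`, `SECRET_KEYS`, and `scan_for_credentials` are hypothetical, and it uses plain string matching from the standard library where a real gate would likely use proper regex and syntax-tree passes. It flags two patterns only: connection URLs with inline `user:pass@` credentials, and string literals assigned to secret-looking names.

```rust
/// One finding from the scan: 1-based line number plus a short reason.
#[derive(Debug, PartialEq)]
pub struct Finding {
    pub line: usize,
    pub reason: &'static str,
}

// Hypothetical list of names that suggest a secret is being assigned.
const SECRET_KEYS: [&str; 4] = ["password", "passwd", "secret", "api_key"];

/// True if the line contains a URL of the form `scheme://user:pass@host`.
fn has_inline_url_credentials(line: &str) -> bool {
    if let Some(pos) = line.find("://") {
        let rest = &line[pos + 3..];
        // The authority part ends at the first '/', '"', or whitespace.
        let end = rest
            .find(|c: char| c == '/' || c == '"' || c.is_whitespace())
            .unwrap_or(rest.len());
        let authority = &rest[..end];
        if let Some(at) = authority.find('@') {
            // `user:pass` before the '@' means a password is embedded.
            return authority[..at].contains(':');
        }
    }
    false
}

/// True if a non-empty string literal directly follows the first `=` or `:`
/// after a secret-looking name, e.g. `let db_password = "hunter2";`.
/// Deliberately naive: a type annotation such as `password: String = "x"`
/// slips through, which is one reason a real gate needs syntax-tree scans.
fn has_hardcoded_secret(line: &str) -> bool {
    let lower = line.to_lowercase();
    SECRET_KEYS.iter().any(|key| {
        lower.find(key).map_or(false, |pos| {
            let after = &lower[pos + key.len()..];
            match after.find(|c: char| c == '=' || c == ':') {
                Some(sep) => {
                    let value = after[sep + 1..].trim_start();
                    value.starts_with('"') && !value.starts_with("\"\"")
                }
                None => false,
            }
        })
    })
}

/// Scan generated source and return every suspicious line.
pub fn scan_for_credentials(source: &str) -> Vec<Finding> {
    let mut findings = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        if has_inline_url_credentials(line) {
            findings.push(Finding {
                line: idx + 1,
                reason: "credentials embedded in connection URL",
            });
        } else if has_hardcoded_secret(line) {
            findings.push(Finding {
                line: idx + 1,
                reason: "string literal assigned to secret-like name",
            });
        }
    }
    findings
}
```

A gate built on this idea would refuse to persist a file whenever `scan_for_credentials` returns a non-empty list, and would hand the findings back to the model as diagnostics. A line that reads its secret from the environment, such as `std::env::var("DATABASE_URL")`, passes untouched, which is exactly the behavior the model skipped when it assumed it was in a safe room.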
I went to work on gatekeeper.rs and built a local static-analysis security scanner and sandbox verifier directly into the compilation gate. The rule was simple: before any generated file could be marked as complete and persisted, the Gatekeeper ran systematic regex-based and syntax-tree scans to detect database credentials, hardcoded keys, and common security flaws.

Furthermore, I wired the compiler directly into an isolated JIT sandbox (AssertUnwindSafe) to dry-run the generated bytecode. If the JIT compilation or the dry-run failed, the compiler rejected the output, forced the model to reflect on the diagnostic error, and triggered an automatic self-correction loop.

[Diagram: the architectural flow of how code moves from the LLM to the secure, bare-metal storage layer]

Here is the core logic from gatekeeper.rs that classifies and verifies LLM-generated code in an isolated environment before committing it to the codebase:

```rust
// gatekeeper.rs: Gatekeeper Hybrid LLM Router & Sandbox Verifier

pub enum LlmRoute {
    CloudSwarm, // High-complexity planning (GPT-4o/Claude 3.5)
    LocalAgent, // Low-complexity execution (Qwen-Coder-0.5B)
}

// Route architecture-level requests to the cloud swarm; everything else stays local.
pub fn classify_query(query: &str) -> LlmRoute {
    let q_lc = query.to_lowercase();
    if q_lc.contains("architecture")
        || q_lc.contains("blueprint")
        || q_lc.contains("refactor kernel")
    {
        LlmRoute::CloudSwarm
    } else {
        LlmRoute::LocalAgent
    }
}

// Returns Vec<f32> representing the token activation states (the embedding vector)
// rather than raw bytecode, laying the groundwork for semantic clustering in Part 10.
// `generate_bytecode_from_prompt` is not shown in this excerpt.
pub fn route_and_generate(
    query: &str,
    site_map: &crate::nda_jit::SiteMap,
) -> Result<Vec<f32>, &'static str> {
    let route = classify_query(query);
    match route {
        LlmRoute::CloudSwarm => {
            // Plan via high-capacity cloud swarm...
            generate_bytecode_from_prompt(&format!("/* Cloud Swarm: {query} */"), site_map)
        }
        LlmRoute::LocalAgent => {
            // Direct generation via local model...
            generate_bytecode_from_prompt(query, site_map)
        }
    }
}
```

This security gate raised the floor for any model running through the pipeline. It was no longer about finding the most "secure" model; it was about building an infrastructure that forced security by construction.

But as the agent continued generating files, I hit another wall: context bloat. The context that accumulated across self-correction rounds was costing me valuable seconds and tokens. In the next post, I'll detail how I tamed the context monster by inventing a new binary format and a multi-agent debate board.

How are you all handling LLM "scope failures" in your local agents? Do you prefer prompt engineering or, like me, a hard-coded "Gatekeeper"? Have you noticed your LLM-generated code taking "security shortcuts" like this? I'd love to hear how you're validating AI output in your own pipelines.

Special thanks to Pascal, whose peer critique on scope failures pushed me to build this security gate rather than relying on prompt engineering.

Disclaimer: AI was used throughout this project, so it is only fitting that it co-authors with me. Special thanks to the Foundry for its tireless hours toiling away and to Gemini for producing the cover image.