{"slug": "we-built-a-coding-harness-that-beats-frontier-models-using-open-ones-it-s-in", "title": "We built a coding harness that beats frontier models using open ones. It's in open beta.", "summary": "Backboard has launched Backboard Development Studio, an open beta coding harness that uses a memory-first architecture to outperform frontier AI models while relying on open-source models. The system, which ranks #1 on the LoCoMo and LongMemEval benchmarks, provides stateful persistence, routing across 17,000+ models, and recursive tool-calling through a single unified API.", "body_md": "Here is the bet we made: build software **memory-first, not model-first**, and it will outperform.\n\nEveryone else is racing to wrap the next model. We did the opposite. We built the memory layer and the routing first, then tool-calling, and now the recursive engine, and let the model be a swappable part.\n\nToday that bet has a name: **Backboard Development Studio**. It starts with the R-CLI.\n\nThe headline result? It beats frontier models using open ones. Keep reading: the numbers are below, and there is a promo code at the bottom.\n\nThe beta is open. Two lines and you are running.\n\n```\n# macOS / Linux\ncurl -fsSL https://app.backboard.io/api/cli | bash\n\n# Windows (PowerShell)\nirm https://app.backboard.io/api/cli/windows | iex\n```\n\nGet your API key: [https://app.backboard.io](https://app.backboard.io)\n\nPromo code: **DEVTOCLI** for credit toward inference while you put it through its paces. Submit the promo code in the top right corner of the billing page.\n\nModel-first thinking says: pick the smartest model, prompt it well, hope it remembers.\n\nMemory-first thinking says: give the system real persistence, real routing, real recall, and a \"smaller\" model will outwork a \"smarter\" one that forgets everything between turns.\n\nWe believed the second one. So we built it. 
The R-CLI is powered by our memory algorithms (the same ones that rank **#1 on LoCoMo and LongMemEval**) and runs on Backboard's unified API: memory, routing across **17,000+ models**, RAG, and stateful threads behind one key.\n\nThen we tested it in public. That part did not go quietly.\n\nRead that second line again. An open model, inside our harness, posting numbers that go toe to toe with Claude Code, at a fraction of the cost.\n\nAnd to be clear: we are **not** the cheap open-source alternative. We run the full frontier lineup too. We just happen to beat frontier results with open models like GLM 5.1 and DeepSeek V4. Same harness, your choice of brain.\n\nYou do not have to pick one model. You can use two in a single task.\n\nTry **/expert mode**.\n\nThe expensive model architects. The fast, cheap one ships. The harness orchestrates the handoff. Frontier reasoning where it counts, frontier-beating cost where it does not. One command.\n\nNobody else is selling that, because nobody else built memory and routing first.\n\nWe launched. A serious builder showed up in the comments and pushed back hard.\n\nWell-tooled local repo. His own RAG, skills, memory, a knowledge graph he had clearly invested months in. He ran the CLI and came back with a fair verdict: \"kind of specific, not super helpful for a setup like mine.\"\n\nSerious builder. Serious objection. The strongest one a developer can make: **\"I already hand-built the thing you are selling.\"**\n\nThen one fact flipped the whole conversation.\n\n**The R-CLI is stateful by default.**\n\nThe persistence he was hand-building? The session-priming file he writes and re-reads every time? The weekly cron jobs auditing how often his agents drift? The pre-commit hooks keeping them on the rails?\n\nNative on our side. Not a layer you bolt on. The default behavior. 
That is what memory-first actually means in your terminal.\n\nSo for him it was never \"adopt a whole new ecosystem.\" It was a harness swap: keep your own RAG, memory, and graph; drop the maintenance tax.\n\nThe thread went from \"not for me\" to \"let me talk to your CLI lead.\" A demo call got booked. The objection did not get argued away. It got dissolved by a capability he did not know was there.\n\nThe lesson we took: the pitch was never \"we are better.\" It was \"you are doing by hand what we do by default.\" A developer handed us that line for free.\n\n**Best in the world.** Performance is the bar, not a tagline. We ran benchmarks internally because we expect to be measured.\n\n**Easiest to use.** One key. The same key that runs your R-CLI unlocks memory, routing, multi-agent, and parallel tool calls, all behind one integrated surface. No stitching eight services together and praying the glue holds.\n\n**Most accessible.** Frontier coding quality, your choice of model to get there. Closed, open, or mixed in one workflow. GLM 5.1 and DeepSeek V4 are the proof, not the promise.\n\n**People stay by choice.** Any model, your own embeddings, modular layers, your data exportable through real endpoints. No lock-in, no theatrics, no fear-mongering. If you stay, it is because the flexibility is unrivaled.\n\nThe R-CLI is the first surface of Backboard Development Studio. The IDE is close.\n\nSame engine, same performance, plus multi-agent sessions, Pi extension integrations, and pre-built coding-theme skills. The CLI is the foundation. We nail the harness with the community first. Then the IDE lands on something already proven.\n\nThe best feedback we have gotten so far came from someone telling us we were wrong. He pushed, we answered, he booked a call, his team switched.\n\nSo: paste the command, claim your key, redeem **DEVTOCLI**, and try to break it. 
Then drop a comment with what held up, what did not, and what your current setup still does better.\n\nMemory-first or model-first. We made our bet. Come test it.\n\n*Backboard.io is full-stack, model-agnostic AI infrastructure. Backboard Development Studio is our recursive coding environment, stateful by default, built on the unified API.*", "url": "https://wpnews.pro/news/we-built-a-coding-harness-that-beats-frontier-models-using-open-ones-it-s-in", "canonical_source": "https://dev.to/jon_at_backboardio/we-built-a-coding-harness-that-beats-frontier-models-using-open-ones-its-in-open-beta-15g3", "published_at": "2026-06-06 21:42:46+00:00", "updated_at": "2026-06-06 22:12:18.847029+00:00", "lang": "en", "topics": ["ai-tools", "ai-products", "ai-startups", "ai-agents", "ai-infrastructure"], "entities": ["Backboard Development Studio", "Backboard", "R-CLI", "LoCoMo", "LongMemEval"], "alternates": {"html": "https://wpnews.pro/news/we-built-a-coding-harness-that-beats-frontier-models-using-open-ones-it-s-in", "markdown": "https://wpnews.pro/news/we-built-a-coding-harness-that-beats-frontier-models-using-open-ones-it-s-in.md", "text": "https://wpnews.pro/news/we-built-a-coding-harness-that-beats-frontier-models-using-open-ones-it-s-in.txt", "jsonld": "https://wpnews.pro/news/we-built-a-coding-harness-that-beats-frontier-models-using-open-ones-it-s-in.jsonld"}}