{"slug": "why-the-inline-harness-matters-your-agent-control-plane-just-got-lighter", "title": "Why the Inline Harness Matters: Your Agent Control Plane Just Got Lighter", "summary": "LiteLLM Agent Platform shipped the inline harness, a shared OpenCode harness that eliminates the need for per-agent pods, reducing infrastructure overhead for teams running 5–15 agents. This feature allows production teams to deploy agent control planes with session persistence, budget enforcement, and audit trails without per-pod costs, lowering the barrier to adoption for engineering teams.", "body_md": "Production teams are running agent control planes now. But many hit a wall: they looked at per-pod sandbox requirements and decided the infrastructure overhead wasn't worth it.\n\nLast month, LiteLLM Agent Platform shipped the **inline harness**, and it changes that math entirely. This isn't a small feature; it's the difference between \"we'll try this later\" and \"we can run this in production today.\"\n\nLet me back up. When you run a coding agent (Claude Code, OpenCode, Cursor), you need:\n\nThe standard approach: one pod per agent or per team. Clean isolation. Obvious deployment model. But for teams with 5–20 agents across an engineering org, that's 5–20 pods. Each pod carries:\n\nMany teams looked at this and said: \"Great control plane, but we're not running 20 pods for agents. We'll stick with direct Anthropic console access.\"\n\nThe inline harness is a shared OpenCode harness that ships as a first-class option in the harness picker, with no per-agent pod required. 
Skills, MCP tools, system prompts, and memory all carry over.\n\nThis means:\n\nYou get the control plane benefits (session persistence, budget enforcement, audit trails, team access, credential vault) without the per-pod cost.\n\nThe inline harness is the inflection point where production teams move from \"we'll manage agents in the console\" to \"we'll run them on our control plane.\"\n\n**For teams with 5–15 agents:**\n\nYou can now run all of them on a single OpenCode harness, shared across the team. Infrastructure cost drops from \"5 pods × baseline overhead × regions\" to \"one shared harness.\" Agents still have session isolation, memory, scheduled execution, and full LiteLLM governance.\n\n**For teams starting with agents:**\n\nYou're not forced to choose between \"lightweight (no control plane)\" and \"heavy (per-pod infrastructure).\" You start with the inline harness, get instant control plane benefits, and upgrade to per-pod if you hit the scaling ceiling.\n\n**For teams evaluating LiteLLM Agent Platform:**\n\nThe objection \"per-pod is too heavy for our org\" is now off the table. You can deploy a lightweight, shared harness within 24 hours and gain visibility into all your agents immediately.\n\nLet's be concrete. Suppose you're an engineering team with three coding agents:\n\n**Without the inline harness (old model):**\n\n**With the inline harness:**\n\nThe inline harness is not a replacement for per-pod isolation when you need it. If you have:\n\nYou still have the per-pod option. The inline harness is the pragmatic default for teams with standard isolation needs.\n\nThis is how production infrastructure matures:\n\nLiteLLM Agent Platform is at step two. The inline harness removes the infrastructure objection, leaving only \"is this the right control plane for my team?\"\n\nIf you're evaluating agent control planes, test the inline harness first. Spend two days deploying it on your infrastructure. 
See what it means to have one place to create, run, and observe agents. Then decide if the per-pod model matters for your workload.\n\nBecause for most teams, it won't.\n\nThe inline harness looks like a small feature, but it unlocks a big change: production teams can now run agent control planes at scale without per-pod infrastructure overhead.\n\n**Paul Twist**: European AI engineer & technical writer. I turn messy AI infrastructure into practical guides developers can actually use. Berlin-based, focused on production agent systems and open infrastructure.\n\n**Tag your agent control plane evaluation** in the comments: what's holding you back from running agents on your infrastructure?", "url": "https://wpnews.pro/news/why-the-inline-harness-matters-your-agent-control-plane-just-got-lighter", "canonical_source": "https://dev.to/paultwist/why-the-inline-harness-matters-your-agent-control-plane-just-got-lighter-1732", "published_at": "2026-06-30 16:02:02+00:00", "updated_at": "2026-06-30 16:18:57.107510+00:00", "lang": "en", "topics": ["ai-agents", "ai-infrastructure", "developer-tools", "ai-products", "mlops"], "entities": ["LiteLLM Agent Platform", "OpenCode", "Claude Code", "Cursor", "Anthropic", "Paul Twist"], "alternates": {"html": "https://wpnews.pro/news/why-the-inline-harness-matters-your-agent-control-plane-just-got-lighter", "markdown": "https://wpnews.pro/news/why-the-inline-harness-matters-your-agent-control-plane-just-got-lighter.md", "text": "https://wpnews.pro/news/why-the-inline-harness-matters-your-agent-control-plane-just-got-lighter.txt", "jsonld": "https://wpnews.pro/news/why-the-inline-harness-matters-your-agent-control-plane-just-got-lighter.jsonld"}}