{"slug": "lite-harness-sdk", "title": "Lite-Harness SDK", "summary": "LiteLLM launched the Lite-Harness SDK, enabling developers to swap between AI agent harnesses such as Claude Code and Codex without rewriting application code. The SDK provides a unified query interface, supports cost controls and logging via the LiteLLM AI Gateway, and allowed one team to save three weeks of integration work while A/B testing different harnesses in production.", "body_md": "AI harnesses are the new vendor lock-in. To swap across harnesses easily without rewriting your app, LiteLLM launched the **Lite-Harness SDK**.\n\nRun your prompt across different harnesses:\n\n``` python\nfrom lite_harness import query, AgentOptions\n\nprompt = \"Fix the failing test\"\n\n# Claude Code harness\nasync for message in query(\n    prompt=prompt,\n    options=AgentOptions(harness=\"claude-code\", model=\"claude-opus-4-8\"),\n):\n    print(message)\n\n# Codex harness\nasync for message in query(\n    prompt=prompt,\n    options=AgentOptions(harness=\"codex\", model=\"gpt-5.5\"),\n):\n    print(message)\n```\n\nTo enable cost controls, fallbacks, and logging, point it to your LiteLLM AI Gateway:\n\n```\nexport LITELLM_API_BASE=https://litellm.your-company.com/v1\nexport LITELLM_API_KEY=sk-litellm-...\n```\n\n**Engineer's Takeaway:**\n\nThis SDK unifies how you *invoke* the agents, not how they run internally. Each harness keeps its native loop and tool-calling semantics. It is perfect for A/B testing agent performance and centralizing costs, but remember it is in public beta, so custom tool injection might require extra work!\n\nMy team was building an internal bot to fix failing CI/CD tests. We had three engineers advocating for three different harnesses: one wanted Claude Code, another Codex, and another Pi AI. Without an abstraction layer, we would have had to maintain **three forks of the same bot**, with three different SDKs, three logging systems, and three ways to track costs. 
It would have been an impossible maintenance burden.\n\nThe SDK solved that exact pain point in **three concrete dimensions**:\n\nInstead of maintaining three separate implementations, I had **a single query()** that routed to whichever harness I wanted. Switching from Claude Code to Codex was literally just changing a string in the options. This allowed us to do real A/B testing in production for two weeks without rewriting any core logic.\n\nBy connecting it to the LiteLLM AI Gateway, I could suddenly see on a single dashboard:\n\nWithout the gateway, tracking the real cost of an agent (which makes multiple sequential tool calls) is a nightmare of scattered logs.\n\nWhen Anthropic released new capabilities in Claude Opus 4.8, I just updated the model string. I didn't have to touch the bot's underlying code. That's the real promise of LiteLLM: **decoupling your application from the provider**.\n\nWithout `max_iterations`, an agent can burn $5 in tokens if it gets stuck in an infinite loop. I had to wrap the `query()` call in an `asyncio.wait_for` with a strict timeout to protect our budget.\n\n**Lite-Harness probably saved me 3 weeks of integration work** and gave me hard data to make an informed architecture decision. We ended up choosing Claude Code as our primary harness and Codex as a fallback for simpler, cost-sensitive tasks.", "url": "https://wpnews.pro/news/lite-harness-sdk", "canonical_source": "https://dev.to/jeancarlosn/lite-harness-sdk-3f28", "published_at": "2026-06-25 12:37:20+00:00", "updated_at": "2026-06-25 12:43:17.323295+00:00", "lang": "en", "topics": ["ai-tools", "developer-tools", "large-language-models", "ai-agents"], "entities": ["LiteLLM", "Claude Code", "Codex", "Anthropic", "Pi AI"], "alternates": {"html": "https://wpnews.pro/news/lite-harness-sdk", "markdown": "https://wpnews.pro/news/lite-harness-sdk.md", "text": "https://wpnews.pro/news/lite-harness-sdk.txt", "jsonld": "https://wpnews.pro/news/lite-harness-sdk.jsonld"}}