{"slug": "how-i-built-a-simple-ai-router-to-avoid-vendor-lock-in-and-costs", "title": "How I built a simple AI router to avoid vendor lock-in and costs", "summary": "A developer built a simple AI router to avoid vendor lock-in and reduce costs by routing different tasks to the most appropriate AI model. The router uses a YAML config file to map tasks like question answering, image captioning, and summarization to specific providers and models, handling different SDKs and authentication. The solution eliminates manual API key swapping and reduces costs by using cheaper models for simpler tasks.", "body_md": "I've been working on a side project that needs AI for a few different tasks: answering user questions, generating image captions, and summarizing chat threads. At first, I just picked one provider (OpenAI) and called it a day. But after a month, two things became painfully clear: first, not every model is great at every task, and second, the bill was climbing fast because I was using GPT-4 for everything.\n\nSo I did what any reasonable developer would do: I started swapping API keys by hand. I'd comment out one import and uncomment another, deploy, test, get frustrated, rinse and repeat. That worked for about a week before I decided I needed a proper solution.\n\nMy project had three distinct AI needs: question answering, image captioning, and chat-thread summarization. I was using one provider for all three, which meant I was either overpaying for simple tasks or getting low-quality results for complex ones.\n\nFirst, I tried a simple `if-elif` chain in every endpoint. That turned into spaghetti within hours. Then I tried a config file with model names, but I still had to handle different SDKs, authentication, and response formats manually. It was brittle and ugly.\n\nI also looked at some API aggregation services. They promised unified access but often introduced latency, added cost per call, or required me to trust their infrastructure with my keys. 
Not ideal for a small project where I wanted full control.\n\nI built a tiny Python class that acts as a router. It takes a task name, picks a provider and model from a config file, and handles the request. The key insight: I didn't need a full proxy, just a configurable dispatcher that I could plug into my existing code with minimal changes.\n\nHere's the core of it. First, the config file (`config/ai_router.yaml`):\n\n```\n# config/ai_router.yaml\nrouting:\n  qa:\n    provider: openai\n    model: gpt-4\n    max_tokens: 500\n    temperature: 0.2\n  captions:\n    provider: anthropic\n    model: claude-3-haiku-20240307\n    max_tokens: 200\n    temperature: 0.7\n  summarize:\n    provider: openai\n    model: gpt-3.5-turbo\n    max_tokens: 1000\n    temperature: 0.3\n```\n\nNow the router class (`router.py`):\n\n``` python\nimport os\nimport yaml\n\nclass AIRouter:\n    def __init__(self, config_path=\"config/ai_router.yaml\"):\n        with open(config_path) as f:\n            self.config = yaml.safe_load(f)['routing']\n        self._init_providers()\n\n    def _init_providers(self):\n        # Import each SDK only if the config references it\n        self.providers = {}\n\n        if any(cfg['provider'] == 'openai' for cfg in self.config.values()):\n            from openai import OpenAI\n            self.providers['openai'] = OpenAI(api_key=os.environ['OPENAI_API_KEY'])\n\n        if any(cfg['provider'] == 'anthropic' for cfg in self.config.values()):\n            from anthropic import Anthropic\n            self.providers['anthropic'] = Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])\n\n    def complete(self, task: str, prompt: str):\n        cfg = self.config.get(task)\n        if not cfg:\n            raise ValueError(f\"Unknown task: {task}\")\n\n        provider = self.providers[cfg['provider']]\n        model = cfg['model']\n\n        if cfg['provider'] == 'openai':\n            response = 
provider.chat.completions.create(\n                model=model,\n                messages=[{\"role\": \"user\", \"content\": prompt}],\n                max_tokens=cfg['max_tokens'],\n                temperature=cfg['temperature']\n            )\n            return response.choices[0].message.content\n\n        elif cfg['provider'] == 'anthropic':\n            response = provider.messages.create(\n                model=model,\n                max_tokens=cfg['max_tokens'],\n                temperature=cfg['temperature'],\n                messages=[{\"role\": \"user\", \"content\": prompt}]\n            )\n            return response.content[0].text\n\n        else:\n            raise NotImplementedError(f\"Provider {cfg['provider']} not implemented\")\n```\n\nUsage in my app is dead simple:\n\n``` python\nfrom router import AIRouter\nrouter = AIRouter()\n\n# In one endpoint:\nanswer = router.complete('qa', \"What's the capital of France?\")\n# In another:\ncaption = router.complete('captions', \"Describe this image: [base64 data]\")\n```\n\nI'll be honest: this isn't production-grade. Error handling is minimal. If a provider is down, the whole request fails. There's no retry logic or fallback. Also, the config is static: if I want to switch models mid-request, I'd need a different approach.\n\nBut for my project, it solved the immediate pain: I can now route tasks to the most cost-effective model by editing the config instead of the code. I saved about 40% on API costs in the first month by sending captions to cheaper models.\n\nIf I were starting over, I'd add a fallback mechanism. For example, if `gpt-4` fails, try `gpt-3.5-turbo` before erroring out. Also, I'd make the router async: most providers support async now, and it would fit better in a web framework like FastAPI.\n\nAnother improvement: dynamic routing based on prompt length or complexity. 
For instance, if a Q&A prompt is short and simple, route it to a cheaper model automatically.\n\nIf you don't want to build this yourself, there are services that do something similar. For instance, `ai.interwestinfo.com` offers a unified API with smart routing. But for my small project, rolling my own taught me a lot about each provider's quirks. It also gave me full control over the routing logic.\n\nI'm still iterating on this. Next up: adding streaming support and a simple latency monitor.\n\nWhat does your AI infrastructure look like? Are you using a single provider or something more flexible? I'd love to hear how others handle this.", "url": "https://wpnews.pro/news/how-i-built-a-simple-ai-router-to-avoid-vendor-lock-in-and-costs", "canonical_source": "https://dev.to/__c1b9e06dc90a7e0a676b/how-i-built-a-simple-ai-router-to-avoid-vendor-lock-in-and-costs-2mbo", "published_at": "2026-06-27 08:00:53+00:00", "updated_at": "2026-06-27 08:03:48.798574+00:00", "lang": "en", "topics": ["artificial-intelligence", "developer-tools", "ai-products"], "entities": ["OpenAI", "Anthropic", "GPT-4", "Claude 3 Haiku", "GPT-3.5 Turbo"], "alternates": {"html": "https://wpnews.pro/news/how-i-built-a-simple-ai-router-to-avoid-vendor-lock-in-and-costs", "markdown": "https://wpnews.pro/news/how-i-built-a-simple-ai-router-to-avoid-vendor-lock-in-and-costs.md", "text": "https://wpnews.pro/news/how-i-built-a-simple-ai-router-to-avoid-vendor-lock-in-and-costs.txt", "jsonld": "https://wpnews.pro/news/how-i-built-a-simple-ai-router-to-avoid-vendor-lock-in-and-costs.jsonld"}}