How I built a simple AI router to avoid vendor lock-in and costs


[ARTICLE · art-41645] src=dev.to ↗ pub=2026-06-27T08:00Z topic=artificial-intelligence verified=true sentiment=↑ positive

A developer built a simple AI router to avoid vendor lock-in and reduce costs by routing different tasks to the most appropriate AI model. The router uses a YAML config file to map tasks like question answering, image captioning, and summarization to specific providers and models, handling different SDKs and authentication. The solution eliminates manual API key swapping and reduces costs by using cheaper models for simpler tasks.

4 min read · published Jun 27, 2026

I've been working on a side project that needs AI for a few different tasks: answering user questions, generating image captions, and summarizing chat threads. At first, I just picked one provider (OpenAI) and called it a day. But after a month, two things became painfully clear: first, not every model is great at every task, and second, the bill was climbing fast because I was using GPT-4 for everything.

So I did what any reasonable developer would do: I started swapping API keys by hand. I'd comment out one import and uncomment another, deploy, test, get frustrated, rinse and repeat. That worked for about a week before I decided I needed a proper solution.

My project had three distinct AI needs: answering user questions, generating image captions, and summarizing chat threads. I was using one provider for all three, which meant I was either overpaying for simple tasks or getting low-quality results for complex ones.

First, I tried a simple if-elif chain in every endpoint. That turned into spaghetti within hours. Then I tried a config file with model names, but I still had to handle different SDKs, authentication, and response formats manually. It was brittle and ugly.

I also looked at some API aggregation services. They promised unified access but often introduced latency, added cost per call, or required me to trust their infrastructure with my keys. Not ideal for a small project where I wanted full control.

I built a tiny Python class that acts as a router. It takes a task name, picks a provider and model from a config file, and handles the request. The key insight: I didn't need a full proxy — just a configurable dispatcher that I could plug into my existing code with minimal changes.

Here's the core of it. First, the config file (config/ai_router.yaml):

routing:
  qa:
    provider: openai
    model: gpt-4
    max_tokens: 500
    temperature: 0.2
  captions:
    provider: anthropic
    model: claude-3-haiku-20240307
    max_tokens: 200
    temperature: 0.7
  summarize:
    provider: openai
    model: gpt-3.5-turbo
    max_tokens: 1000
    temperature: 0.3
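
With a config like this, a typo in a task entry would only surface when a request hits that task, so a startup check may be worth having. The following is a minimal sketch, assuming the parsed `routing` mapping has the shape shown above. The names `validate_routing`, `REQUIRED_FIELDS`, and `KNOWN_PROVIDERS` are hypothetical and not part of the original router.

```python
# Hypothetical helper, not part of the original router.
REQUIRED_FIELDS = {
    "provider": str,
    "model": str,
    "max_tokens": int,
    "temperature": (int, float),
}
KNOWN_PROVIDERS = {"openai", "anthropic"}


def validate_routing(routing: dict) -> list:
    """Return a list of readable problems; an empty list means no issue was found."""
    problems = []
    for task, cfg in routing.items():
        if not isinstance(cfg, dict):
            problems.append(f"{task}: entry must be a mapping")
            continue
        for field, expected in REQUIRED_FIELDS.items():
            if field not in cfg:
                problems.append(f"{task}: missing '{field}'")
            # bool is a subclass of int in Python, so reject it explicitly
            elif isinstance(cfg[field], bool) or not isinstance(cfg[field], expected):
                problems.append(f"{task}: '{field}' has the wrong type")
        provider = cfg.get("provider")
        if isinstance(provider, str) and provider not in KNOWN_PROVIDERS:
            problems.append(f"{task}: unknown provider '{provider}'")
    return problems


# Same structure that yaml.safe_load(...)["routing"] would return for the file above.
routing = {
    "qa": {"provider": "openai", "model": "gpt-4",
           "max_tokens": 500, "temperature": 0.2},
    "captions": {"provider": "anthropic", "model": "claude-3-haiku-20240307",
                 "max_tokens": 200, "temperature": 0.7},
    "summarize": {"provider": "openai", "model": "gpt-3.5-turbo",
                  "max_tokens": 1000, "temperature": 0.3},
}
print(validate_routing(routing))  # []
```

Returning a list of problems rather than raising on the first one means a single run reports every broken entry.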

Now the router class (router.py):

import os
import yaml

class AIRouter:
    def __init__(self, config_path="config/ai_router.yaml"):
        with open(config_path) as f:
            self.config = yaml.safe_load(f)['routing']
        self._init_providers()

    def _init_providers(self):
        self.providers = {}

        if any(cfg['provider'] == 'openai' for cfg in self.config.values()):
            from openai import OpenAI
            self.providers['openai'] = OpenAI(api_key=os.environ['OPENAI_API_KEY'])

        if any(cfg['provider'] == 'anthropic' for cfg in self.config.values()):
            from anthropic import Anthropic
            self.providers['anthropic'] = Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])

    def complete(self, task: str, prompt: str):
        cfg = self.config.get(task)
        if not cfg:
            raise ValueError(f"Unknown task: {task}")

        provider = self.providers[cfg['provider']]
        model = cfg['model']

        if cfg['provider'] == 'openai':
            response = provider.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                max_tokens=cfg['max_tokens'],
                temperature=cfg['temperature']
            )
            return response.choices[0].message.content

        elif cfg['provider'] == 'anthropic':
            response = provider.messages.create(
                model=model,
                max_tokens=cfg['max_tokens'],
                temperature=cfg['temperature'],
                messages=[{"role": "user", "content": prompt}]
            )
            return response.content[0].text

        else:
            raise NotImplementedError(f"Provider {cfg['provider']} not implemented")

Usage in my app is dead simple:

from router import AIRouter
router = AIRouter()

answer = router.complete('qa', "What's the capital of France?")
caption = router.complete('captions', "Describe this image: [base64 data]")
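
The if/elif inside `complete` grows by one branch per provider. One possible alternative, sketched here under assumptions, is a dispatch table that maps each provider name to a handler function. Everything below is hypothetical: `HANDLERS`, `register`, and the two fake handlers are illustrative stand-ins that return strings instead of calling a real SDK, so the sketch runs offline.

```python
from typing import Callable, Dict

# Hypothetical registry: provider name -> function that calls that provider
# and unwraps its response into plain text.
Handler = Callable[[dict, str], str]
HANDLERS: Dict[str, Handler] = {}


def register(provider: str) -> Callable[[Handler], Handler]:
    """Decorator that files a handler under its provider name."""
    def wrap(fn: Handler) -> Handler:
        HANDLERS[provider] = fn
        return fn
    return wrap


@register("openai")
def _fake_openai(cfg: dict, prompt: str) -> str:
    # Stand-in for the real SDK call and response unwrapping.
    return f"[openai/{cfg['model']}] {prompt}"


@register("anthropic")
def _fake_anthropic(cfg: dict, prompt: str) -> str:
    # Stand-in for the real SDK call and response unwrapping.
    return f"[anthropic/{cfg['model']}] {prompt}"


def complete(routing: dict, task: str, prompt: str) -> str:
    """Look up the task, then delegate to whichever handler owns its provider."""
    cfg = routing.get(task)
    if not cfg:
        raise ValueError(f"Unknown task: {task}")
    handler = HANDLERS.get(cfg["provider"])
    if handler is None:
        raise NotImplementedError(f"Provider {cfg['provider']} not implemented")
    return handler(cfg, prompt)


routing = {
    "qa": {"provider": "openai", "model": "gpt-4"},
    "captions": {"provider": "anthropic", "model": "claude-3-haiku-20240307"},
}
print(complete(routing, "qa", "ping"))  # [openai/gpt-4] ping
```

With this shape, supporting another provider would mean adding one decorated function, and `complete` itself stays unchanged.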

I'll be honest: this isn't production-grade. Error handling is minimal. If a provider is down, the whole request fails. There's no retry logic or fallback. Also, the config is static — if I want to switch models mid-request, I'd need a different approach.
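
The missing retry logic could be addressed with a small wrapper. This is a minimal sketch of retrying with exponential backoff, assuming transient failures show up as exceptions. `with_retries` and `flaky` are hypothetical names, and `flaky` stands in for a call such as `router.complete` so the example runs without network access.

```python
import time


def with_retries(call, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run call(); on an exception, wait and retry with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:  # real code would catch only the SDK's transient errors
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1.0s, 2.0s, ...


# Stand-in for a provider call that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("provider unavailable")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = with_retries(flaky, sleep=delays.append)
print(result, delays)  # ok [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the demo instant. In the app, the default `time.sleep` would apply.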

But for my project, it solved the immediate pain: I can now route tasks to the most cost-effective model without touching code. I saved about 40% on API costs in the first month by sending captions to cheaper models.

I'd add a fallback mechanism. For example, if gpt-4 fails, try gpt-3.5-turbo before erroring out. Also, I'd make the router async. Most providers support async now, and it would fit better in a web framework like FastAPI.
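
The fallback idea could look roughly like the following. It is a sketch under assumptions, not the router's actual behavior: `complete_with_fallback` is a hypothetical function, the ordered model list is an assumed input (it might come from an extra config key), and `fake_call` simulates a provider call in which gpt-4 is unavailable.

```python
def complete_with_fallback(call_model, models, prompt):
    """Try each model in order; return (model, text) from the first that succeeds."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # real code would catch the SDK's error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All models failed: " + "; ".join(errors))


# Stand-in for a provider call in which gpt-4 is down.
def fake_call(model, prompt):
    if model == "gpt-4":
        raise TimeoutError("timed out")
    return f"{model} answered: {prompt}"


used, text = complete_with_fallback(fake_call, ["gpt-4", "gpt-3.5-turbo"], "hello")
print(used)  # gpt-3.5-turbo
```

Returning the model name alongside the text makes it possible to log how often the fallback path is taken.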

Another improvement: dynamic routing based on prompt length or complexity. For instance, if a Q&A prompt is short and simple, route it to a cheaper model automatically.
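
As a rough illustration of length-based routing, the sketch below picks a task name before the router is called. The heuristic, the 200-character threshold, and the `qa_cheap` task (which would need its own entry in the config) are all assumptions made for the example.

```python
def pick_task(prompt: str, short_limit: int = 200) -> str:
    """Crude complexity heuristic: short, single-line prompts take the cheaper route."""
    is_short = len(prompt) <= short_limit
    is_single_line = "\n" not in prompt.strip()
    # "qa_cheap" is a hypothetical routing entry pointing at a cheaper model.
    return "qa_cheap" if is_short and is_single_line else "qa"


print(pick_task("What's the capital of France?"))  # qa_cheap
print(pick_task("Compare these two contracts:\n" + "clause " * 100))  # qa
```

Prompt length is only a proxy for difficulty, so a threshold like this would likely need tuning against real traffic.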

If you don't want to build this yourself, there are services that do something similar. For instance, ai.interwestinfo.com offers a unified API with smart routing. But for my small project, rolling my own taught me a lot about each provider's quirks. It also gave me full control over the routing logic.

I'm still iterating on this. Next up: adding streaming support and a simple latency monitor.
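
A simple latency monitor might be as small as the sketch below, which times a call and stores the duration per task. `LatencyMonitor` and its methods are hypothetical names, and the lambda stands in for a real router call so the example runs offline.

```python
import time
from collections import defaultdict
from statistics import mean


class LatencyMonitor:
    """Record wall-clock duration per task name."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.samples = defaultdict(list)

    def timed(self, task, fn, *args, **kwargs):
        """Call fn and record how long it took, even if it raises."""
        start = self._clock()
        try:
            return fn(*args, **kwargs)
        finally:
            self.samples[task].append(self._clock() - start)

    def average(self, task):
        """Mean duration in seconds, or None if the task has no samples."""
        recorded = self.samples.get(task)
        return mean(recorded) if recorded else None


monitor = LatencyMonitor()
# Stand-in for something like router.complete("qa", "hello").
reply = monitor.timed("qa", lambda prompt: prompt.upper(), "hello")
print(reply, len(monitor.samples["qa"]))  # HELLO 1
```

Because the recording happens in a `finally` block, failed calls are timed too, which may matter when a slow provider is also the one timing out.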

What does your AI infrastructure look like? Are you using a single provider or something more flexible? I'd love to hear how others handle this.


Read original on dev.to → dev.to/__c1b9e06dc90a7e0a676b/how-i-built-a-simp…
