# "I Stopped Pretending Every AI Provider Was the Same"

> Source: <https://dev.to/codekingai/i-stopped-pretending-every-ai-provider-was-the-same-18k8>
> Published: 2026-06-20 12:10:47+00:00

The easiest way to make an AI gateway feel flaky is to pretend every upstream model works the same way.

On paper, a lot of tools look compatible.

They all take a prompt. They all return text. Some of them even share an OpenAI-shaped API.

In practice, the differences show up exactly where users stop forgiving you:

That was one of the most useful lessons while building [CliGate](https://github.com/codeking-ai/cligate), my local control plane for Claude Code, Codex CLI, Gemini CLI, OpenClaw, a resident assistant, and multiple model/account sources behind one localhost entrypoint.

The bug was subtler than that.

Routing often *did* succeed. A request got sent somewhere. A response came back. Nothing obviously crashed.

But that did not mean the gateway was correct.

If you route different tools and providers as if they were interchangeable, you get a class of failures that are hard to spot from logs alone:

That is not just transport routing.

That is capability routing.

At first, it is tempting to think routing is just:

``` text
pick provider -> send request
```

That model is too small.

What actually mattered in CliGate was closer to this:

``` text
identify caller/tool
-> identify protocol shape
-> resolve provider/model source
-> apply capability profile
-> translate or degrade fields safely
-> send upstream
```

A provider being reachable is not enough.

It also needs to be treated according to the features it really supports.
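To make the longer pipeline concrete, here is a minimal sketch of everything up to the upstream send. It is not CliGate's actual code: `plan_route`, `RoutePlan`, the `source-a`/`source-b` names, and the field sets in `PROFILES` are all hypothetical. Only the two endpoint paths are real public API paths.

```python
from dataclasses import dataclass, field

# Real endpoint paths for two common API shapes; the protocol labels are made up.
PROTOCOL_BY_PATH = {
    "/v1/chat/completions": "openai-chat",
    "/v1/messages": "anthropic-messages",
}

# Hypothetical capability profiles: top-level fields each source is known to honor.
PROFILES = {
    "source-a": {"model", "messages", "max_tokens", "metadata"},
    "source-b": {"model", "messages", "max_tokens"},
}


@dataclass
class RoutePlan:
    caller: str      # which tool sent the request
    protocol: str    # which request shape it used
    source: str      # which provider/model source will serve it
    payload: dict    # the request after the capability profile was applied
    dropped: list = field(default_factory=list)  # fields removed on purpose


def plan_route(caller: str, path: str, source: str, payload: dict) -> RoutePlan:
    """Run every routing stage except the final upstream send."""
    protocol = PROTOCOL_BY_PATH.get(path)
    if protocol is None:
        raise ValueError(f"unknown protocol shape for path {path!r}")
    if source not in PROFILES:
        # A reachable source without a profile is treated as unroutable.
        raise ValueError(f"no capability profile for source {source!r}")

    supported = PROFILES[source]
    kept = {key: value for key, value in payload.items() if key in supported}
    dropped = sorted(key for key in payload if key not in supported)
    return RoutePlan(caller, protocol, source, kept, dropped)
```

Calling `plan_route("codex", "/v1/chat/completions", "source-b", {...})` with a payload that carries `metadata` returns a plan whose `dropped` list names that field, so the removal is visible to the operator instead of happening silently upstream.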

One of the more useful internal lessons in this project is that protocol translation is not a separate cleanup step after routing.

It *is* part of routing.

Some paths can accept a richer request shape. Some need fields normalized or stripped before the request becomes a silent bug.

That changed the safe mental model from:

“upstream did not complain, so the route must be fine.”

to:

“this route supports a specific capability profile, so normalize on purpose.”

That sounds small, but it prevents a lot of “works sometimes” behavior.
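One way to picture "normalize on purpose" is a rule that maps a requested reasoning level onto what the route supports and records what it did. This is a sketch under assumptions: the `reasoning_effort` field name, the three levels, and the "never upgrade, only degrade or strip" policy are illustrative choices, not a description of any particular provider or of CliGate's rules.

```python
from typing import Optional

LEVELS = ("low", "medium", "high")  # weakest to strongest


def normalize_effort(requested: str, supported: set) -> tuple[Optional[str], str]:
    """Return (value to send, note). A value of None means the field is stripped."""
    if requested not in LEVELS:
        raise ValueError(f"unknown effort level {requested!r}")
    if requested in supported:
        return requested, "passed through"
    # Walk downward from the requested level; never silently upgrade.
    rank = LEVELS.index(requested)
    for level in reversed(LEVELS[:rank]):
        if level in supported:
            return level, f"degraded {requested} -> {level}"
    return None, f"stripped {requested}: no supported level at or below it"


def apply_effort(payload: dict, supported: set) -> tuple[dict, list]:
    """Return a normalized copy of the payload plus notes for the operator."""
    out = dict(payload)
    notes = []
    if "reasoning_effort" in out:
        value, note = normalize_effort(out["reasoning_effort"], supported)
        if value is None:
            del out["reasoning_effort"]
        else:
            out["reasoning_effort"] = value
        notes.append(note)
    return out, notes
```

The notes are the important part: the request may still be weaker than what the caller asked for, but the gateway can say so instead of relying on the upstream not complaining.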

This is the trap.

Lots of systems advertise compatibility because they accept a familiar endpoint shape.

But compatibility at the HTTP layer is only the beginning.

If one tool expects richer reasoning or metadata semantics and another backend treats those fields differently, the gateway has three bad choices:

Only the third one scales.

That is why I now prefer capability-aware routing over a universal passthrough design.

`claude-code`, `codex`, `gemini-cli`, `openclaw`, and generic OpenAI/Anthropic-compatible clients may hit similar-looking routes, but they are not interchangeable from an operator’s perspective.

The user is often really asking for one of these:

That is why app-aware routing and capability-aware translation ended up being complementary, not separate concerns.

One decides **who this request is for**.

The other decides **how to make it truthful on the way through**.

The worst failures are the accidental ones.

If a gateway quietly forwards a field that the destination ignores, the user may never know why results became inconsistent.

So I started preferring explicit degradation rules.

If a route cannot honor a field, normalize it on purpose.

If a provider cannot match a capability, map it honestly.

If a model source is rate-limited or invalid, skip it instead of pretending all active-looking credentials are equal.
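The last rule, skipping sources that only look active, can be sketched as a small selection function. The `Source` fields and the `pick_source` name are hypothetical, and a real gateway would likely track more states than these two. The point is that every skip carries a reason.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    name: str
    credential_valid: bool = True
    rate_limited_until: float = 0.0  # Unix timestamp; 0.0 means not limited


def usable(source: Source, now: float) -> bool:
    """A source is usable only if its credential is valid and its limit has expired."""
    return source.credential_valid and source.rate_limited_until <= now


def pick_source(sources: list, now: Optional[float] = None) -> tuple:
    """Return (first usable source or None, list of (name, reason) for skipped ones)."""
    now = time.time() if now is None else now
    skipped = []
    for source in sources:
        if usable(source, now):
            return source, skipped
        reason = "invalid credential" if not source.credential_valid else "rate-limited"
        skipped.append((source.name, reason))
    return None, skipped
```

Returning the skipped list alongside the choice keeps the decision explainable: the operator can see which credentials were passed over and why, rather than guessing from an upstream error.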

That gives me a much better operator story:

A good gateway should hide repetitive setup work.

It should not lie about capability differences.

Once I accepted that, the architecture became cleaner:

That is less magical, but much more dependable.

If I were designing another AI gateway tomorrow, I would keep these rules:

That is the direction I have been pushing with CliGate.

The project still aims to give me one local place for model routing, accounts, API keys, local runtimes, channels, runtime sessions, and an assistant layer.

But the system became much more trustworthy once I stopped pretending every upstream provider was the same.

If you run multiple AI tools through one gateway, are you doing plain endpoint routing, or routing by actual capability too?
