How to Use T3 Code With Claude Code and an Open-Source LLM Gateway

A developer describes setting up T3 Code with Claude Code and the open-source LLM gateway Lynkr to create a layered stack for coding agents. The architecture separates UX, coding behavior, and model traffic control, addressing cost, reliability, and flexibility issues that arise when agents are wired directly to a single provider. Lynkr sits under Claude Code to handle routing, caching, and provider switching for tool-heavy workflows.

If I were setting up T3 Code for serious daily use, the stack I would want looks like this:

T3 Code
↓
Claude Code
↓
Lynkr
↓
Anthropic / OpenAI / Ollama / OpenRouter / Bedrock / Azure / Databricks

That flow is interesting because each layer is doing a different job. That separation is the whole point. T3 Code gives me the UX I want. Claude Code gives me the coding behavior I want. Lynkr gives me control over how model traffic actually gets handled. That is a much better stack than treating the model layer as an afterthought.

I also recorded a short walkthrough of this setup in action. If you want the faster visual version before reading the rest, start there. The architecture is the same:

T3 Code
↓
Claude Code
↓
Lynkr
↓
Your actual model/provider

T3 Code is interesting because it is not trying to become a new model or a new lab-specific harness. It is building a better way to work with coding agents people already use. That is a smarter product decision than trying to replace everything at once.

Its current support includes:

That means the value of T3 Code is not “one more coding assistant.” It is more like:

That makes a lot of sense. But once you pick Claude Code as the coding agent inside that stack, the next problem becomes obvious: the model layer under Claude Code matters just as much as the top-level UX. Because once the agent is doing real work, cost and reliability stop being invisible plumbing. They become part of the product experience.

Claude Code is a good example because it exposes the problem very clearly. A real Claude Code session does not look like a single “generate code” call. It looks more like:

That creates a traffic pattern that is very different from plain chat. This is exactly why coding-agent workflows need a stronger model layer than “just point it directly at one provider.” Once Claude Code is being used as an actual coding agent, the model path underneath it becomes infrastructure. And infrastructure decisions compound.

Direct setup is fine for testing. But it gets worse as the workflow becomes more serious. If Claude Code is always wired straight to one provider path, you get a few problems.

That is usually false. Some steps are lightweight. Some steps are genuinely expensive. If those all hit the same expensive path, you overpay.

Coding agents retry all the time. That is not a bug. That is how they work. But retries mean the same or almost-the-same context gets resent over and over. Without a caching layer or routing control, you keep paying full price for repeated work.

The expensive part is often not the user’s prompt. It is everything around it. That is where a lot of token waste hides.

Maybe today you want Claude for everything. Later maybe you want:

If the setup is too tightly wired, those changes become more painful than they should be.

Latency spikes, rate limits, auth weirdness, provider outages, degraded outputs: eventually you hit all of them. If there is no gateway layer, every one of those issues becomes a client-side problem. That is exactly the kind of thing I would rather solve once in the model layer.

This is the mental model that makes sense to me. That is a clean stack. The interface stays separate from the agent. The agent stays separate from the gateway. The gateway stays separate from the providers. That separation is valuable because it lets each layer evolve independently.

Lynkr is an open-source LLM gateway built for coding assistants, MCP-heavy workflows, and tool-heavy traffic. That last part matters. A lot of model-routing products talk about general-purpose requests. But coding traffic is different. It is noisier, more repetitive, and much more likely to carry large tool payloads. That is why the fit is real here.

The role of Lynkr in this stack is not to replace Claude Code. It is to sit under Claude Code and decide how model traffic should actually be handled. That gives you a few levers that matter a lot in coding workflows.

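To make the idea of solving provider failures once in the model layer more concrete, here is a minimal sketch of ordered failover. It is not Lynkr's implementation or API: the `Provider` type, `ProviderError`, `complete_with_failover`, and the two stand-in backends are all hypothetical names invented for illustration, and the backends are plain functions rather than real provider calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class ProviderError(Exception):
    """Hypothetical error a backend raises on rate limits, outages, auth failures, etc."""


@dataclass
class Provider:
    name: str                    # illustrative label, not a real provider identifier
    call: Callable[[str], str]   # takes a prompt, returns a completion


def complete_with_failover(prompt: str, providers: List[Provider]) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, completion).

    The client calls this one function; which backend actually answered
    is decided here, not in the client.
    """
    errors = []
    for provider in providers:
        try:
            return provider.name, provider.call(prompt)
        except ProviderError as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{provider.name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


# Stand-in backends for the demo (assumptions, not real network calls).
def flaky_backend(prompt: str) -> str:
    raise ProviderError("rate limited")


def local_backend(prompt: str) -> str:
    return f"[local] {prompt}"


CHAIN = [
    Provider("primary-cloud", flaky_backend),
    Provider("local-fallback", local_backend),
]

if __name__ == "__main__":
    name, text = complete_with_failover("fix the failing test", CHAIN)
    print(name, "->", text)
```

In this sketch the caller never sees the rate-limit error from the first backend. A real gateway would likely add timeouts, backoff, and health tracking on top of this, but the shape is the same: the failure handling lives in one place underneath the agent.
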
The biggest mistake people make with coding agents is asking the wrong question. They ask: “Which is the best coding model?” The more useful question is: “Which parts of my coding workflow actually deserve the expensive model?” That is what a gateway lets you answer. For example:

That is a much better economic model than treating every Claude Code turn as if it deserves maximum spend. And once that logic sits in the gateway, you do not need to keep rebuilding it at the app layer.

Coding agents repeat themselves constantly. The same instructions, the same repo background, similar prompts, similar recovery steps, and similar tool outputs come up again and again. That means a caching layer is not a “nice optimization.” It is one of the biggest obvious wins in the stack. Lynkr’s current benchmark claims are the part that stands out here:

That is exactly the kind of traffic Claude Code creates during real multi-step work. The point is not just lower cost. The point is lower cost and lower latency on repeated work. That compounds very quickly.

This is one of the most under-discussed parts of coding-agent economics. People spend a lot of time comparing model prices, but a huge amount of waste comes from the payload shape itself. In coding workflows, the model is often seeing:

That means reducing payload size is often just as important as picking the right provider. This is why gateway-level optimization makes sense. It is solving a real problem in the actual traffic pattern, not just shuffling providers around.

This is maybe the biggest architectural reason I like this stack. If T3 Code points to Claude Code, and Claude Code points to Lynkr, then the top-level workflow can remain stable while the backend policy changes underneath. That means I can change:

…without having to rethink the interface and workflow every time. That is a better long-term design. The UI layer should not be where I want model policy to live.

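The following minimal sketch shows how routing policy and caching can live below the client. It assumes a toy in-process `Gateway` class, and it is not Lynkr's actual configuration format or API. The step kinds (`summarize`, `plan`, and so on), the tier names, and the exact-match cache are all assumptions chosen for illustration; a real gateway may cache and route quite differently.

```python
import hashlib
import json
from typing import Callable, Dict

# Hypothetical policy: which kind of step goes to which model tier.
POLICY: Dict[str, str] = {
    "summarize": "cheap",
    "rename": "cheap",
    "plan": "strong",
    "debug": "strong",
}
DEFAULT_TIER = "strong"  # unknown step kinds fall back to the stronger tier


class Gateway:
    """Toy gateway: routes by step kind and caches exact repeats."""

    def __init__(self, policy: Dict[str, str], backends: Dict[str, Callable[[str], str]]):
        self.policy = policy
        self.backends = backends
        self.cache: Dict[str, str] = {}
        self.backend_calls = 0  # counts requests that actually reached a backend

    def handle(self, step_kind: str, prompt: str) -> str:
        tier = self.policy.get(step_kind, DEFAULT_TIER)
        # Exact-match cache key over (tier, prompt); a simple stand-in for real caching.
        key = hashlib.sha256(json.dumps([tier, prompt]).encode("utf-8")).hexdigest()
        if key not in self.cache:
            self.backend_calls += 1
            self.cache[key] = self.backends[tier](prompt)
        return self.cache[key]


# Stand-in backends (assumptions, not real model calls).
BACKENDS: Dict[str, Callable[[str], str]] = {
    "cheap": lambda prompt: f"cheap:{prompt}",
    "strong": lambda prompt: f"strong:{prompt}",
}

if __name__ == "__main__":
    gateway = Gateway(dict(POLICY), BACKENDS)
    print(gateway.handle("summarize", "summarize utils.py"))  # routed to the cheap tier
    print(gateway.handle("summarize", "summarize utils.py"))  # served from the cache
    # Change the policy underneath; the client call stays the same.
    gateway.policy["summarize"] = "strong"
    print(gateway.handle("summarize", "summarize utils.py"))
    print("backend calls:", gateway.backend_calls)
```

The client-side call, `gateway.handle(step_kind, prompt)`, does not change when the policy does. That is the property the layered stack is after: the interface and agent stay put while routing and caching decisions are adjusted underneath.
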
There are plenty of steps in a coding workflow that can be handled locally or by a cheaper model path. There are also plenty of steps where I want a stronger cloud model. A gateway makes that hybrid model much easier. For example:

That kind of setup is a lot harder to maintain cleanly when every client is wired directly.

The point is not that T3 Code itself becomes the gateway. The point is that the stack stays layered:

T3 Code
↓
Claude Code
↓
Lynkr
↓
Anthropic / OpenAI / Ollama / OpenRouter / Bedrock / Azure / Databricks

That gives you:

That is the shape I would trust more over time.

If you are trying T3 Code casually, none of this matters much. But if you are actually using it for repeated coding workflows, then it starts to matter fast. Because daily coding-agent usage means:

That is when the gateway stops being optional architecture theory and starts becoming the practical layer that controls cost and reliability.

If I were using T3 Code with Claude Code, I would not want Claude Code wired directly to one backend forever. I would want:

That feels like the right stack for where coding tools are going. Better UX at the top. Better agent behavior in the middle. Better economics and control underneath.

If you want to check the projects: