I Built an LLM Gateway That Extends Claude Pro/Max Users with Azure AI Foundry, Amazon Bedrock, Local Models

A developer built Lynkr, an open-source LLM gateway that extends Claude Pro/Max usage by routing different coding tasks to appropriate models across Azure AI Foundry, Amazon Bedrock, and local models. The gateway aims to solve inefficiencies in AI coding tools by separating premium reasoning tasks from cheap, frequent requests, enabling cost optimization and provider fallback.

AI coding tools have gotten very good. But the infrastructure behind them is still weirdly inefficient. Most tools assume one provider, one lane, one billing path. That means the same expensive model or subscription ends up handling everything: That is the wrong abstraction. A coding workflow is not one type of problem. So it should not be forced through one type of model path. That idea is what pushed me to build Lynkr . Lynkr is an open-source LLM gateway for AI coding tools that lets me combine: behind one routing layer. If you use a premium coding assistant every day, you have probably seen this already. A lot of the workload is not actually premium reasoning work. For example: These are useful requests, but they are not the same as: Yet most tools send both classes of work through the same expensive path. That creates three problems: If a subscription-backed or premium model handles every tiny prompt, you burn good capacity on low-value tasks. Even if you already have access to Azure, AWS, or local models, your coding workflow is often tied to one vendor path. If one provider is rate-limited, degraded, or just not the best fit for a task, you have no routing layer to adjust. Lynkr sits between AI coding tools and model providers. It works as an LLM gateway , which means the coding tool talks to Lynkr, and Lynkr decides what to do next. That lets the gateway: The part I am most excited about is hybrid routing across: The simplest version looks like this: So instead of replacing Claude, Azure, or Bedrock, the gateway combines them. This is the key idea: extend your Claude Pro/Max usage instead of burning it on everything . Imagine a coding session that looks like this: "Read the auth middleware and summarize it." Route to a cheap local model. "Search all routes that call this helper." Still cheap/local. "Refactor this auth flow to support tenant isolation." Route to Claude Pro/Max. "Generate an enterprise-safe variant for our internal stack." Route to Azure AI Foundry. "Azure is unavailable or rate-limited." Fallback to Bedrock. That is a much more natural way to run coding agents than pretending every prompt deserves the same model path. This combination matters because each lane solves a different problem. Great for high-quality coding and reasoning tasks where you already have subscription value. Useful when a team wants enterprise-hosted models, internal approvals, or Azure-aligned infrastructure. Useful for AWS-native orgs, alternate model access, or fallback when you want another enterprise provider path. Useful for cheap, frequent, low-stakes tasks that should not consume premium capacity at all. Putting these together in one gateway gives you a better operational model than any one of them alone. I think coding is one of the best use cases for an LLM gateway because coding workflows are: That means a gateway can add value in several ways. Not every prompt deserves the same model. Cheap requests stay cheap. Premium capacity gets reserved for tasks that actually need it. Teams can use Azure AI Foundry or Bedrock where policy or procurement matters. If one provider path fails, the workflow can continue. Another reason this matters is MCP and agentic tooling. As coding tools become more agentic, they use more: That creates a lot of overhead and a lot of repeated context. A gateway is the right place to optimize that. That is also why I think the future is not just better models. It is better routing, caching, tool handling, and workload separation around those models. I did not want just another OpenAI-compatible endpoint. I wanted a gateway that could actually help with real coding economics and workflow design. For me, that means: I think this is especially useful for: I do not think the next big improvement in AI coding comes only from stronger base models. A lot of value will come from better infrastructure around them: That is the direction I am building toward with Lynkr . GitHub: https://github.com/Fast-Editor/Lynkr https://github.com/Fast-Editor/Lynkr Ps:- This is fully following Anthropic TOS because lynkr wraps around your existing claude code