Fallback logic is easier to reason about when the application has one request shape and the model choice stays configurable. That is especially useful for teams testing different AI providers, latency profiles, or cost envelopes.
RouterBase gives developers an OpenAI-compatible API surface at https://routerbase.com/v1, which makes it a good place to prototype fallback behavior before changing a larger production system.

    // Full chat-completions endpoint on RouterBase's OpenAI-compatible API.
    const baseUrl = "https://routerbase.com/v1/chat/completions";

    // Send one prompt to one model; throw on any non-2xx response.
    async function runPrompt(model, prompt) {
      const response = await fetch(baseUrl, {
        method: "POST",
        headers: {
          Authorization: `Bearer ${process.env.ROUTERBASE_API_KEY}`,
          "Content-Type": "application/json"
        },
        body: JSON.stringify({
          model,
          messages: [{ role: "user", content: prompt }]
        })
      });

      if (!response.ok) {
        throw new Error(`Model ${model} failed with ${response.status}`);
      }

      return response.json();
    }

    // Try the primary model first and fall back once if it fails.
    async function runWithFallback(prompt) {
      // Both model names can be overridden through environment variables.
      const primary = process.env.ROUTERBASE_PRIMARY_MODEL || "google/gemini-2.5-flash";
      const fallback = process.env.ROUTERBASE_FALLBACK_MODEL || "openai/gpt-4.1-mini";

      try {
        return await runPrompt(primary, prompt);
      } catch (primaryError) {
        // Network errors and non-2xx statuses both land here.
        console.warn(primaryError.message);
        return runPrompt(fallback, prompt);
      }
    }

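To read the reply text out of whatever runWithFallback returns, a small helper along these lines may be enough. It is a sketch that assumes the response follows the usual OpenAI-style chat completion shape (choices[0].message.content), which the OpenAI-compatible surface suggests but this article does not spell out. The name extractText is hypothetical.

```javascript
// Hypothetical helper, not part of any RouterBase SDK.
// Assumes the OpenAI-style chat completion shape:
//   { choices: [{ message: { role, content } }] }
function extractText(completion) {
  const content = completion?.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    // Fail loudly rather than passing undefined downstream.
    throw new Error("Unexpected response shape: no choices[0].message.content");
  }
  return content;
}

// Usage with runWithFallback from the snippet above
// (requires ROUTERBASE_API_KEY and network access):
//   const completion = await runWithFallback("Summarize this ticket in one line.");
//   console.log(extractText(completion));
```

Failing on an unexpected shape keeps a malformed response from being mistaken for an empty answer while you compare models.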
Start with a workflow where a fallback is useful but not risky.
For each run, record which model responded, whether fallback was used, total latency, and whether the output needed manual correction. That turns a routing experiment into something the team can evaluate with data instead of a vague debate about model preferences.
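One way to capture those fields is to wrap the primary/fallback attempt and return a small record next to the result. The sketch below is illustrative only: runWithFallbackLogged and the record field names are hypothetical, and the request function is passed in as a parameter (so the runPrompt function shown earlier can be supplied, or a stub when testing without network access).

```javascript
// Hypothetical wrapper: same primary-then-fallback flow as before,
// plus a per-run record of the fields worth tracking.
async function runWithFallbackLogged(prompt, { primary, fallback, runPrompt, now = Date.now }) {
  const startedAt = now();
  let modelUsed = primary;
  let fallbackUsed = false;
  let result;

  try {
    result = await runPrompt(primary, prompt);
  } catch (primaryError) {
    // Primary failed: note it and try the fallback once.
    fallbackUsed = true;
    modelUsed = fallback;
    result = await runPrompt(fallback, prompt);
  }

  const record = {
    modelUsed,
    fallbackUsed,
    latencyMs: now() - startedAt, // total time, including a failed primary attempt
    neededManualCorrection: null  // set later by a reviewer; null means "not reviewed yet"
  };

  return { result, record };
}
```

If the fallback also fails, the error propagates and no record is returned, so a real experiment would probably want to log failed runs as well. The records can be appended to a file or table and reviewed after a batch of runs.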