An OpenAI-compatible base URL is supposed to make model switching boring: change the endpoint, keep the SDK, and move on. In real projects, the first run often fails with a 401, 404, 429, or a model-not-found error.
Here is the checklist I use before blaming the SDK.
Most OpenAI-compatible gateways expect a /v1 path prefix:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RELAY_KEY",
    base_url="https://api.wappkit.com/v1",
)
If you use only the domain, some SDK calls may resolve to the wrong path. Check the provider's docs and copy the exact base URL format.
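A small guard can catch the domain-only mistake before any request goes out. This is a minimal sketch: `ensure_v1` is a hypothetical helper, not part of the OpenAI SDK, and it assumes your provider really does use a `/v1` suffix. Some gateways use a different path, so the provider's docs stay the source of truth.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_v1(base_url: str) -> str:
    """Return base_url with a single trailing /v1 path segment.

    Hypothetical helper: assumes the provider expects /v1.
    """
    parts = urlsplit(base_url.strip())
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {base_url!r}")
    path = parts.path.rstrip("/")
    if not path.endswith("/v1"):
        path += "/v1"
    # Drop any query string or fragment; a base URL should not carry them.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(ensure_v1("https://api.wappkit.com"))      # https://api.wappkit.com/v1
print(ensure_v1("https://api.wappkit.com/v1/"))  # https://api.wappkit.com/v1
```

Passing the result as `base_url` means a trailing slash or a bare domain no longer changes which path the SDK calls.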
A common mistake is mixing keys: the key has to come from the same provider as the base URL.
When you see 401 Unauthorized, print the first and last few characters of the key locally and compare them with the dashboard. Do not log the full key.
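Here is one way to do that comparison without leaking the secret. `mask_key` is a hypothetical helper name, and the four-character window is an arbitrary choice; the sample key is made up.

```python
def mask_key(key: str, visible: int = 4) -> str:
    """Show only the first and last `visible` characters of a secret."""
    key = key.strip()  # stray whitespace from copy-paste is a common culprit
    if len(key) <= visible * 2:
        # Too short to reveal anything safely; hide it completely.
        return "*" * len(key)
    return f"{key[:visible]}...{key[-visible:]}"


print(mask_key("sk-relay-0123456789abcdef"))  # prints sk-r...cdef
```

If the masked value does not match what the dashboard shows, the app is loading a different key than you think, often from a stale environment variable.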
Do not guess model names from memory. Gateway model names can change as upstream availability changes.
Before using gpt-5.5, gpt-5.4, or a Claude Code model, check the current model list. Copy the model ID exactly.
resp = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
If the model name is wrong, you usually get 404, model_not_found, or a gateway-specific validation error.
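You can turn the "check the model list" step into code. The sketch below assumes the gateway implements the standard models endpoint, which the SDK exposes as `client.models.list()`; not every relay does, so fall back to the dashboard if this call fails. `check_model` and `list_model_ids` are hypothetical helper names.

```python
import difflib


def check_model(wanted: str, available: list[str]) -> str:
    """Return `wanted` if it is listed, else raise with close matches."""
    if wanted in available:
        return wanted
    close = difflib.get_close_matches(wanted, available, n=3)
    hint = f" Did you mean: {', '.join(close)}?" if close else ""
    raise ValueError(f"model {wanted!r} is not in the model list.{hint}")


def list_model_ids(client) -> list[str]:
    """Fetch model IDs through the SDK's models endpoint.

    Assumption: the gateway serves GET /v1/models.
    """
    return [m.id for m in client.models.list()]
```

With a configured client, `check_model("gpt-5.5", list_model_ids(client))` either returns the ID unchanged or fails early with suggestions, before you spend time on a chat request.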
Before debugging your whole app, run one tiny request:
resp = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=20,
)
print(resp.choices[0].message.content)
If this works, the base URL, key, and model are probably fine. Your bug is likely in the app layer: streaming, tool calling, message format, proxy settings, or retry logic.
401 usually means a problem with the key or the account state.
429 usually means a rate limit, an exhausted balance, or temporary traffic control.
If you get 429, check the billing page and wait before retrying. A tight retry loop can make the problem worse.
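If you do retry, space the attempts out. The sketch below is one common approach, exponential backoff with jitter, and not something the gateway prescribes. It assumes the raised exception carries a `status_code` attribute, as the OpenAI SDK's status errors do; `call_with_backoff` and its defaults are my own choices.

```python
import random
import time


def is_rate_limited(exc: Exception) -> bool:
    """True for errors that carry an HTTP 429 status code."""
    return getattr(exc, "status_code", None) == 429


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying only on 429 with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_rate_limited(exc) or last_attempt:
                raise  # not a 429, or out of attempts: surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random amount up to the computed delay.
            sleep(random.uniform(0, delay))
```

Wrap the request in a lambda, for example `call_with_backoff(lambda: client.chat.completions.create(...))`. The SDK client also has its own `max_retries` setting, so consider lowering it to avoid stacking two retry layers. No amount of backoff fixes an empty balance, which is why the billing check comes first.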
When the same request worked yesterday and fails today, do not rewrite the integration first. Check the status page. If there is an upstream incident, your code may be fine.
This is especially useful with relay services because there is one more layer between your app and the model provider.
Save a minimal curl command in your project docs:
curl https://api.wappkit.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_RELAY_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [{"role": "user", "content": "ping"}],
    "max_tokens": 20
  }'
When the app breaks, run the curl command first. If curl fails, debug account, gateway, model, or network. If curl works, debug your app.
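The same check can live next to your code as a standard-library script, so it bypasses the SDK just like curl does. This is a sketch under assumptions: `build_ping`, `ping`, and the `HINTS` table are hypothetical names, the 401 and 429 hints restate the checklist above, and tying 404 to the base URL path is my own inference. Your gateway may return different codes.

```python
import json
import urllib.error
import urllib.request

# Hints based on the checklist above; gateways may use other codes too.
HINTS = {
    401: "key or account state",
    404: "model name or base URL path",
    429: "rate limit, balance, or temporary traffic control",
}


def build_ping(base_url: str, api_key: str, model: str) -> urllib.request.Request:
    """Build the same request as the curl command, without sending it."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 20,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ping(base_url: str, api_key: str, model: str, timeout: float = 30.0) -> str:
    """Send the ping and return a one-line verdict."""
    request = build_ping(base_url, api_key, model)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return f"OK ({resp.status}): debug your app"
    except urllib.error.HTTPError as err:
        hint = HINTS.get(err.code, "the gateway's error body")
        return f"HTTP {err.code}: check {hint}"
    except OSError as err:
        # URLError and socket timeouts are both OSError subclasses.
        return f"network error: {err}"
```

Calling `print(ping("https://api.wappkit.com/v1", "YOUR_RELAY_KEY", "gpt-5.5"))` gives a single line that tells you which side of the fence to debug.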
OpenAI-compatible base URLs are simple once the basics are clean: exact /v1 endpoint, matching API key, live model name, small test request, billing check, status check, and one known-good curl command.