BoxAgnts Introduction (7) — OpenAI API and Anthropic API

wpnews.pro

The 2025 AI model market is in full bloom. But each provider has its own API format, authentication method, and streaming protocol. BoxAgnts' design goal: users switch models by changing just one parameter, with all internal logic remaining unchanged.

This article dissects this abstraction across four levels:

LlmProvider trait defines a "model provider"

Everything starts with the interface definition:

// boxagnts-api/src/provider.rs
#[async_trait]
pub trait LlmProvider: Send + Sync {
    fn id(&self) -> &ProviderId;                              // Unique identifier
    fn name(&self) -> &str;                                   // Human-readable name

    async fn create_message(                                  // Non-streaming request
        &self,
        request: ProviderRequest,
    ) -> Result<ProviderResponse, ProviderError>;

    async fn create_message_stream(                           // Streaming request
        &self,
        request: ProviderRequest,
    ) -> Result<
        Pin<Box<dyn Stream<Item = Result<StreamEvent, ProviderError>> + Send>>,
        ProviderError,
    >;

    async fn list_models(&self) -> Result<Vec<ModelInfo>, ProviderError>;  // Model list
    async fn check_connectivity(&self) -> Result<ProviderStatus, ProviderError>; // Health check
    fn capabilities(&self) -> ProviderCapabilities;           // Capability declaration
}

Both input and output use provider-agnostic unified types:

pub struct ProviderRequest {
    pub model: String,
    pub messages: Vec<Message>,          // Unified conversation format
    pub system_prompt: Option<SystemPrompt>,
    pub tools: Vec<ToolDefinition>,      // Unified tool definitions
    pub max_tokens: u32,
    pub temperature: Option<f64>,
    pub thinking: Option<ThinkingConfig>, // Deep thinking configuration
    pub provider_options: Value,          // Provider-specific parameters
}

pub struct ProviderResponse {
    pub id: String,
    pub content: Vec<ContentBlock>,      // Unified content blocks
    pub stop_reason: StopReason,         // Unified stop reason
    pub usage: UsageInfo,                // Token usage
    pub model: String,
}

The core value of the normalization layer: whether the underlying model is Claude, GPT, or Gemini, upper-layer code only sees ProviderRequest and ProviderResponse.
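To make the idea concrete, here is a minimal sketch of upper-layer code that depends only on a provider trait and unified request/response types. It is deliberately synchronous and much smaller than the real LlmProvider trait; `SimpleProvider`, `SimpleRequest`, `SimpleResponse`, `EchoProvider`, and `ask` are hypothetical names invented for this illustration, not part of BoxAgnts.

```rust
// Simplified, synchronous stand-ins for the article's unified types (hypothetical shapes).
#[derive(Debug, Clone, PartialEq)]
pub struct SimpleRequest {
    pub model: String,
    pub user_text: String,
    pub max_tokens: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct SimpleResponse {
    pub model: String,
    pub text: String,
}

pub trait SimpleProvider {
    fn name(&self) -> &str;
    fn create_message(&self, request: SimpleRequest) -> Result<SimpleResponse, String>;
}

/// A fake provider that echoes the prompt; it stands in for any real backend.
pub struct EchoProvider {
    pub label: &'static str,
}

impl SimpleProvider for EchoProvider {
    fn name(&self) -> &str {
        self.label
    }

    fn create_message(&self, request: SimpleRequest) -> Result<SimpleResponse, String> {
        if request.max_tokens == 0 {
            return Err("max_tokens must be greater than zero".to_string());
        }
        Ok(SimpleResponse {
            model: request.model,
            text: format!("[{}] {}", self.label, request.user_text),
        })
    }
}

/// Upper-layer code: it sees the trait and the unified types, never a wire format.
pub fn ask(provider: &dyn SimpleProvider, model: &str, prompt: &str) -> Result<String, String> {
    let request = SimpleRequest {
        model: model.to_string(),
        user_text: prompt.to_string(),
        max_tokens: 256,
    };
    provider.create_message(request).map(|response| response.text)
}
```

Swapping the backend means passing a different `&dyn SimpleProvider` to `ask`; the body of `ask` does not change.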

// boxagnts-api/src/registry.rs
pub struct ProviderRegistry {
    providers: HashMap<ProviderId, Arc<dyn LlmProvider>>,
    default_provider_id: ProviderId,
}

fn provider_from_key(provider_id: &str, key: String) -> Option<Arc<dyn LlmProvider>> {
    match provider_id {
        // Native implementations — each with its own API format
        "anthropic" => Some(Arc::new(AnthropicProvider::from_config(...))),
        "openai"    => Some(Arc::new(OpenAiProvider::new(key))),
        "google"    => Some(Arc::new(GoogleProvider::new(key))),
        "github-copilot" => Some(Arc::new(CopilotProvider::new(key))),
        "cohere"    => Some(Arc::new(CohereProvider::new(key))),

        // OpenAI-compatible providers — share the same conversion logic, only change base_url
        "deepseek", "groq", "ollama", "mistral", "xai",
        "perplexity", "openrouter", "siliconflow", "moonshot",
        "zhipu", "stepfun", "fireworks", "llamacpp",
        "sambanova", "huggingface", "nvidia", "cerebras",
        // ... 30+ OpenAI-compatible providers in total
        _ => None,
    }
}
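The "only change base_url" idea for OpenAI-compatible vendors can be sketched as follows. This is an illustration under stated assumptions, not the BoxAgnts implementation: `Provider`, `OpenAiCompatProvider`, `compat_base_url`, and `Registry` are hypothetical, and the base URLs are example values that should be checked against each vendor's documentation.

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub trait Provider: Send + Sync {
    fn base_url(&self) -> &str;
}

/// One struct serves every OpenAI-compatible vendor; only the base URL differs.
pub struct OpenAiCompatProvider {
    base_url: String,
    #[allow(dead_code)]
    api_key: String,
}

impl Provider for OpenAiCompatProvider {
    fn base_url(&self) -> &str {
        &self.base_url
    }
}

/// Illustrative base URLs (assumed values for this sketch).
fn compat_base_url(provider_id: &str) -> Option<&'static str> {
    match provider_id {
        "deepseek" => Some("https://api.deepseek.com"),
        "groq" => Some("https://api.groq.com/openai/v1"),
        "ollama" => Some("http://localhost:11434/v1"),
        _ => None,
    }
}

pub fn provider_from_key(provider_id: &str, key: String) -> Option<Arc<dyn Provider>> {
    let base_url = compat_base_url(provider_id)?;
    Some(Arc::new(OpenAiCompatProvider {
        base_url: base_url.to_string(),
        api_key: key,
    }))
}

pub struct Registry {
    providers: HashMap<String, Arc<dyn Provider>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { providers: HashMap::new() }
    }

    /// Registers a provider if the id is known; returns whether it was added.
    pub fn register(&mut self, provider_id: &str, key: String) -> bool {
        match provider_from_key(provider_id, key) {
            Some(provider) => {
                self.providers.insert(provider_id.to_string(), provider);
                true
            }
            None => false,
        }
    }

    pub fn get(&self, provider_id: &str) -> Option<Arc<dyn Provider>> {
        self.providers.get(provider_id).cloned()
    }
}
```

Adding another compatible vendor in this sketch is one new arm in `compat_base_url`; no conversion code is touched.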

Three implementation strategies:

Type	Representative	Conversion Strategy	Count
Native Anthropic	claude-sonnet-4-5	Near-zero conversion (internal format = Anthropic format)	1
Native OpenAI	gpt-4o, o3	ProviderRequest → Chat Completions	1
Native Google	gemini-2.5-flash	ProviderRequest → generateContent	1
OpenAI Compatible	deepseek, groq, ollama, etc.	Same logic as OpenAI, only URL changes	30+
Other Native	github-copilot, cohere	Independent format conversion	3+

Anthropic, OpenAI, Google Gemini — three APIs with vast differences in message format. Understanding these differences is essential to understanding the value of the conversion layer.

Feature	Anthropic	OpenAI	Google Gemini
Location	Top-level `"system"` field	messages[0], `role:"system"`	Top-level `"systemInstruction"` field
Type	string or ContentBlock array	string only	content parts array only

// Anthropic — top-level standalone field
{"model": "claude-sonnet-4-5", "system": "You are helpful.", "messages": [...]}

// OpenAI — embedded in messages array
{"model": "gpt-4o", "messages": [{"role":"system","content":"You are helpful."}, ...]}

// Google — uses systemInstruction field, structure differs from messages
{
  "systemInstruction": {"parts": [{"text": "You are helpful."}]},
  "contents": [{"role": "user", "parts": [{"text": "Hello"}]}]
}

Feature	Anthropic	OpenAI	Google Gemini
Field	`"tools": [{name, description, input_schema}]`	`"tools": [{type:"function", function:{...}}]`	`"tools": [{functionDeclarations: [{name, description, parameters}]}]`
Wrapping Layers	0	1	1, with different nesting names

// Anthropic — native block in content array
{"content": [{"type":"tool_use", "id":"toolu_01A", "name":"read", "input": {...}}]}

// OpenAI — standalone tool_calls array, arguments is JSON string
{"tool_calls": [{"id":"call_abc", "function": {"name":"read", "arguments": "{\"path\":\"...\"}"}}]}

// Google — functionCall embedded in parts, args is JSON object
{"candidates": [{"content": {"parts": [{"functionCall": {"name":"read", "args": {...}}}]}}]}
// Anthropic — tool_result is a block in the user message content array
{"role":"user", "content": [{"type":"tool_result", "tool_use_id":"toolu_01A", "content":"..."}]}

// OpenAI — requires a separate role: "tool" message
{"role":"tool", "tool_call_id":"call_abc", "content":"..."}

// Google — functionResponse embedded in user content parts
{"role":"user", "parts": [{"functionResponse": {"name":"read", "response": {...}}}]}

Role	Anthropic	OpenAI	Google Gemini
User	`user`	`user`	`user`
Assistant	`assistant`	`assistant`	`model`

Google uses `model` instead of `assistant`. This is the most easily overlooked and most error-prone difference.
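The role difference is small enough to capture in a single function. The sketch below is an illustration only; `Role`, `Wire`, and `wire_role` are hypothetical names, not BoxAgnts identifiers.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    User,
    Assistant,
}

/// Which wire format a request is being serialized for.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Wire {
    Anthropic,
    OpenAi,
    Google,
}

/// Maps a unified role to the string each API expects on the wire.
pub fn wire_role(wire: Wire, role: Role) -> &'static str {
    match (wire, role) {
        (_, Role::User) => "user",
        // Google calls the assistant side "model".
        (Wire::Google, Role::Assistant) => "model",
        (Wire::Anthropic, Role::Assistant) | (Wire::OpenAi, Role::Assistant) => "assistant",
    }
}
```

Keeping the mapping in one exhaustive `match` means a newly added wire format fails to compile until its role names are decided.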

OpenAiProvider is the most complete example of the conversion layer:

// boxagnts-api/src/providers/openai.rs
impl OpenAiProvider {
    fn to_openai_messages(
        messages: &[Message],
        system_prompt: Option<&SystemPrompt>,
    ) -> Vec<Value> {
        let mut result: Vec<Value> = Vec::new();

        // Step 1: system prompt → role: "system" message
        if let Some(sys) = system_prompt {
            // `sys_text` is `sys` flattened to a plain string; the article elides that step
            result.push(json!({"role": "system", "content": sys_text}));
        }

        for msg in messages {
            match msg.role {
                Role::User => {
                    // User messages may mix text and tool_result blocks
                    // tool_result needs to be split into separate role: "tool" messages
                    Self::append_user_messages(&mut result, &msg.content);
                }
                Role::Assistant => {
                    let (text, tool_calls) = Self::assistant_content_to_openai(&msg.content);
                    result.push(json!({
                        "role": "assistant",
                        "content": text,
                        "tool_calls": tool_calls
                    }));
                }
            }
        }
        result
    }

    fn to_openai_tools(tools: &[ToolDefinition]) -> Vec<Value> {
        tools.iter().map(|td| {
            json!({
                "type": "function",
                "function": {
                    "name": td.name,
                    "description": td.description,
                    "parameters": td.input_schema
                }
            })
        }).collect()
    }
}
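The user-message step is the least obvious part of the conversion above, so here is a minimal sketch of how a mixed user message might be split. It assumes simplified `ContentBlock` and `OpenAiMessage` enums invented for this example, and it assumes tool results are emitted before the remaining user text so that they directly follow the assistant message that requested them; the real `append_user_messages` may differ.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ContentBlock {
    Text(String),
    ToolResult { tool_use_id: String, content: String },
}

#[derive(Debug, Clone, PartialEq)]
pub enum OpenAiMessage {
    User { content: String },
    Tool { tool_call_id: String, content: String },
}

/// Splits one unified user message into OpenAI-style messages:
/// each tool_result becomes its own role "tool" message, and the
/// remaining text blocks are joined into a single role "user" message.
pub fn append_user_messages(out: &mut Vec<OpenAiMessage>, blocks: &[ContentBlock]) {
    let mut text_parts: Vec<&str> = Vec::new();
    for block in blocks {
        match block {
            ContentBlock::Text(text) => text_parts.push(text.as_str()),
            ContentBlock::ToolResult { tool_use_id, content } => {
                out.push(OpenAiMessage::Tool {
                    tool_call_id: tool_use_id.clone(),
                    content: content.clone(),
                });
            }
        }
    }
    // Emit the user text only if there was any; a pure tool_result message adds none.
    if !text_parts.is_empty() {
        out.push(OpenAiMessage::User { content: text_parts.join("\n") });
    }
}
```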

The most complex part is tool_use_id sanitization: Anthropic's tool IDs (e.g., toolu_01Bx...) may contain characters that OpenAI does not accept.
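The article does not show the sanitization rule itself, so the following is only a sketch of one plausible approach. It assumes that letters, digits, `_`, and `-` are safe and that IDs should be capped at 40 characters; both the character set and the `MAX_TOOL_ID_LEN` limit are assumptions made for this example, and `sanitize_tool_id` is a hypothetical name.

```rust
/// Assumed length cap for this sketch; not taken from the article.
const MAX_TOOL_ID_LEN: usize = 40;

/// Replaces every character outside [A-Za-z0-9_-] with '_' and truncates the result.
pub fn sanitize_tool_id(raw: &str) -> String {
    raw.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '_' || c == '-' {
                c
            } else {
                '_'
            }
        })
        .take(MAX_TOOL_ID_LEN)
        .collect()
}
```

Whatever rule is used, it has to be applied identically to the `id` in an assistant `tool_calls` entry and to the `tool_call_id` in the matching `role: "tool"` message, otherwise the two no longer pair up.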

GoogleProvider shows how to handle an API format that is different from both Anthropic and OpenAI:

// boxagnts-api/src/providers/google.rs
// URL pattern completely different from OpenAI's /v1/chat/completions
fn generate_url(&self, model: &str) -> String {
    format!(
        "{}/v1beta/models/{}:generateContent?key={}",
        self.base_url, model, self.api_key  // API Key in URL query parameters!
    )
}

Key differences from OpenAI:

Difference	Google Gemini	OpenAI
API Key Location	URL query parameter `?key=`	HTTP Header `Authorization: Bearer`
Endpoint Format	`/v1beta/models/{model}:generateContent`	`/v1/chat/completions`
Streaming Endpoint	`/v1beta/models/{model}:streamGenerateContent?alt=sse`	`/v1/chat/completions` + `stream:true`
Message Roles	`user` / `model` (not assistant)	`user` / `assistant`
Tool Results	`functionResponse` in parts	Separate `role: tool` message
Image Input	`inlineData` base64	`image_url` or content parts
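The endpoint and authentication rows of the table can be sketched as three small helpers. This is an illustration rather than the BoxAgnts code: `google_endpoint`, `openai_endpoint`, and `openai_auth_header` are hypothetical names, and the sketch does no URL-encoding of the model name or key.

```rust
/// Builds the Gemini endpoint; the API key travels in the query string.
pub fn google_endpoint(base_url: &str, model: &str, api_key: &str, stream: bool) -> String {
    if stream {
        format!(
            "{}/v1beta/models/{}:streamGenerateContent?alt=sse&key={}",
            base_url, model, api_key
        )
    } else {
        format!("{}/v1beta/models/{}:generateContent?key={}", base_url, model, api_key)
    }
}

/// Builds the Chat Completions endpoint; streaming uses the same URL
/// and is selected with "stream": true in the request body instead.
pub fn openai_endpoint(base_url: &str) -> String {
    format!("{}/v1/chat/completions", base_url.trim_end_matches('/'))
}

/// The OpenAI-style key travels in a header, never in the URL.
pub fn openai_auth_header(api_key: &str) -> (String, String) {
    ("Authorization".to_string(), format!("Bearer {}", api_key))
}
```

Because the Gemini key is part of the URL, any code that logs request URLs for this provider should redact the `key` parameter first.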

ThinkingConfig is the normalized deep thinking configuration, but different providers handle it completely differently:

// Normalized configuration
pub struct ThinkingConfig {
    pub budget_tokens: u32,   // Thinking token budget
}

// When building ProviderRequest, decides whether to pass based on provider capabilities
let provider_request = ProviderRequest {
    // ...
    thinking: if caps.thinking {
        effective_thinking_budget
            .map(|b| ThinkingConfig::enabled(b))
    } else {
        None  // This provider doesn't support thinking, don't pass
    },
};

Provider	Thinking Support	How It's Passed
Anthropic (Claude 3.7+)	✓	`"thinking": {"type": "enabled", "budget_tokens": N}`
Google (Gemini 2.5+)	✓	`"thinkingConfig": {"thinkingBudget": N}`
OpenAI (o1/o3 series)	Partial	Via `reasoning_effort` parameter
Other OpenAI Compatible	Mostly unsupported	Not passed
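The table can be read as a function from the normalized configuration to a provider-specific request fragment. The sketch below illustrates that mapping with plain strings. `ProviderKind` and `thinking_fragment` are hypothetical, and the thresholds that turn a token budget into a `reasoning_effort` level are invented for this example; the article does not say how BoxAgnts chooses the level.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ThinkingConfig {
    pub budget_tokens: u32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderKind {
    Anthropic,
    Google,
    OpenAiReasoning,
    OpenAiCompatible,
}

/// Returns the JSON fragment to merge into the request body, or None to omit it.
pub fn thinking_fragment(kind: ProviderKind, thinking: Option<ThinkingConfig>) -> Option<String> {
    let budget = thinking?.budget_tokens;
    match kind {
        ProviderKind::Anthropic => Some(format!(
            r#""thinking": {{"type": "enabled", "budget_tokens": {}}}"#,
            budget
        )),
        ProviderKind::Google => Some(format!(
            r#""thinkingConfig": {{"thinkingBudget": {}}}"#,
            budget
        )),
        ProviderKind::OpenAiReasoning => {
            // Hypothetical thresholds: the budget is bucketed into an effort level.
            let effort = if budget < 2_000 {
                "low"
            } else if budget < 10_000 {
                "medium"
            } else {
                "high"
            };
            Some(format!(r#""reasoning_effort": "{}""#, effort))
        }
        // Mostly unsupported: the field is simply not sent.
        ProviderKind::OpenAiCompatible => None,
    }
}
```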

At request construction time, ProviderCapabilities declares each provider's capabilities:

pub struct ProviderCapabilities {
    pub thinking: bool,              // Whether deep thinking is supported
    pub prompt_caching: bool,        // Whether prompt caching is supported
    pub image_input: bool,           // Whether image input is supported
    pub native_tool_use: bool,       // Whether native tool calling exists
    pub supports_streaming: bool,    // Whether streaming responses are supported
    // ...
}

OpenAI-compatible providers' APIs are roughly compatible, but all have subtle differences. ProviderQuirks handles these:

pub struct ProviderQuirks {
    /// Specific error message patterns for context overflow
    pub overflow_patterns: Vec<String>,
    /// Local services that don't require API Keys (e.g., Ollama, LM Studio)
    pub no_api_key_required: bool,
    /// Whether streaming responses include usage info
    pub include_usage_in_stream: bool,
    /// Providers like DeepSeek need the reasoning_content field
    pub reasoning_field: Option<String>,
}

For example, DeepSeek's streaming response returns reasoning content with a field name different from OpenAI's, which is adapted via reasoning_field. Ollama's context overflow error message is "exceeds the available context size", while LM Studio's is "greater than the context length"; both are adapted via overflow_patterns.
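Matching against overflow_patterns could look like the sketch below. It assumes a case-insensitive substring match, which the article does not specify, and it uses a trimmed-down `ProviderQuirks` with only the one field; `is_context_overflow` is a hypothetical helper.

```rust
/// Trimmed-down version of the quirks struct, keeping only the field used here.
pub struct ProviderQuirks {
    pub overflow_patterns: Vec<String>,
}

/// Returns true if the provider's error text matches any known overflow pattern.
/// The comparison is a case-insensitive substring match (an assumption of this sketch).
pub fn is_context_overflow(quirks: &ProviderQuirks, error_message: &str) -> bool {
    let haystack = error_message.to_lowercase();
    quirks
        .overflow_patterns
        .iter()
        .any(|pattern| haystack.contains(pattern.to_lowercase().as_str()))
}
```

A match would then be reported as `ProviderError::ContextOverflow` rather than as a generic invalid-request error, so the caller can react to it specifically.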

Streaming responses are also completely different across the three APIs:

Feature	Anthropic (SSE)	OpenAI (SSE)	Google (SSE)
Event Granularity	High: 6 event types (start/delta/stop × 2)	Low: each chunk is a complete delta	Medium: pushed by chunk, but structure is flat
Tool Call Increment	Fragmented `input_json_delta` events	Fragments of the `arguments` string, keyed by `index`	Complete `functionCall` in a single part
Termination Signal	`message_stop` event	`data: [DONE]` marker	Stream ends naturally
Need to Reassemble by Index	Yes (reassemble by index for multiple tool_use)	Yes	Yes

All three formats are normalized to the same StreamEvent enum:

pub enum StreamEvent {
    MessageStart { id, model, usage },
    ContentBlockStart { index, content_block },
    TextDelta { text },
    ThinkingDelta { thinking },
    InputJsonDelta { index, partial_json },
    ContentBlockStop { index },
    MessageDelta { stop_reason, usage },
    MessageStop,
}
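Reassembling tool-call arguments by index is the part of stream handling that most often goes wrong, so here is a minimal sketch. It uses a reduced `StreamEvent` with concrete field types chosen for this example (the article's enum omits them), and `ToolCall` and `collect_tool_calls` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Reduced event set with assumed field types, enough to show reassembly.
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    ContentBlockStart { index: usize, tool_name: String },
    InputJsonDelta { index: usize, partial_json: String },
    ContentBlockStop { index: usize },
    MessageStop,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub name: String,
    pub arguments_json: String,
}

/// Accumulates partial JSON per block index and emits each call when its block stops.
pub fn collect_tool_calls(events: &[StreamEvent]) -> Vec<ToolCall> {
    let mut open: BTreeMap<usize, ToolCall> = BTreeMap::new();
    let mut done: Vec<ToolCall> = Vec::new();
    for event in events {
        match event {
            StreamEvent::ContentBlockStart { index, tool_name } => {
                open.insert(
                    *index,
                    ToolCall { name: tool_name.clone(), arguments_json: String::new() },
                );
            }
            StreamEvent::InputJsonDelta { index, partial_json } => {
                // Deltas for an unknown index are ignored in this sketch.
                if let Some(call) = open.get_mut(index) {
                    call.arguments_json.push_str(partial_json);
                }
            }
            StreamEvent::ContentBlockStop { index } => {
                if let Some(call) = open.remove(index) {
                    done.push(call);
                }
            }
            StreamEvent::MessageStop => break,
        }
    }
    done
}
```

The accumulated string is only valid JSON once the block has stopped, so parsing should wait for `ContentBlockStop` rather than being attempted on each delta.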

Each provider's error format is also different:

// Unified error types
pub enum ProviderError {
    Auth { ... },             // Authentication failure
    RateLimited { ... },      // Rate limiting
    ContextOverflow { ... },  // Context exceeds window (matched via ProviderQuirks)
    InvalidRequest { ... },   // Invalid request parameters
    ServerError { ... },      // Server error
    StreamError { ... },      // Stream interruption
    Other { ... },            // Unknown error
}

In the query loop, specific errors trigger specific recovery strategies:

RateLimited / Overloaded → Switch to fallback_model
ContextOverflow → Trigger auto_compact
StreamError (stall) → Retry (max 2 times, 45s timeout)
Auth → Unrecoverable, return error
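The recovery rules above amount to a small decision function. The sketch below encodes them with simplified, payload-free error variants; `Recovery` and `recovery_for` are hypothetical names, the 45s stall timeout is left out, and treating `Other` as unrecoverable is an assumption the article does not state.

```rust
/// Simplified error variants without the payloads the real enum carries.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProviderError {
    Auth,
    RateLimited,
    ContextOverflow,
    StreamError,
    Other(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Recovery {
    SwitchToFallbackModel,
    AutoCompact,
    Retry { attempt: u32 },
    GiveUp,
}

/// Retry cap for stalled streams, as stated in the recovery list.
const MAX_STREAM_RETRIES: u32 = 2;

/// Picks a recovery action for an error, given how many retries were already made.
pub fn recovery_for(error: &ProviderError, retries_so_far: u32) -> Recovery {
    match error {
        ProviderError::RateLimited => Recovery::SwitchToFallbackModel,
        ProviderError::ContextOverflow => Recovery::AutoCompact,
        ProviderError::StreamError if retries_so_far < MAX_STREAM_RETRIES => {
            Recovery::Retry { attempt: retries_so_far + 1 }
        }
        ProviderError::StreamError => Recovery::GiveUp,
        // Assumption: anything unclassified is treated like an auth failure.
        ProviderError::Auth | ProviderError::Other(_) => Recovery::GiveUp,
    }
}
```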

BoxAgnts defines environment variable name mappings for each provider:

// boxagnts-workspace/src/config.rs
pub fn api_key_env_vars_for_provider(provider_id: &str) -> &'static [&'static str] {
    match provider_id {
        "anthropic" => &["ANTHROPIC_API_KEY"],
        "openai" => &["OPENAI_API_KEY"],
        "google" => &["GOOGLE_API_KEY", "GOOGLE_GENERATIVE_AI_API_KEY"],
        "deepseek" => &["DEEPSEEK_API_KEY"],
        "mistral" => &["MISTRAL_API_KEY"],
        "xai" => &["XAI_API_KEY"],
        "zhipu" => &["ZHIPU_API_KEY"],
        // ... 40+ provider environment variables
        _ => &[],  // unknown provider: no environment variable to consult
    }
}

Three-tier priority: Environment Variables > User Config JSON > No Default. This design supports different scenarios such as multi-tenancy, CI/CD, and local development.
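The priority order can be sketched as a single lookup function. To keep the example testable without touching the real process environment, the environment is passed in as a closure; `resolve_api_key` and its parameters are hypothetical, and skipping empty values is an assumption of this sketch.

```rust
/// Resolves an API key with the priority: environment variables, then user config, then none.
/// `lookup_env` abstracts the environment; in real code it could be
/// `|name| std::env::var(name).ok()`.
pub fn resolve_api_key<F>(
    env_var_names: &[&str],
    lookup_env: F,
    config_key: Option<&str>,
) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    env_var_names
        .iter()
        // Try each candidate variable in the declared order.
        .filter_map(|&name| lookup_env(name))
        // Assumption: an empty variable counts as unset.
        .find(|value| !value.is_empty())
        // Fall back to the key from the user's config, if any.
        .or_else(|| config_key.filter(|key| !key.is_empty()).map(str::to_string))
}
```

Returning `None` when nothing is configured matches the "No Default" tier: the caller decides whether a missing key is an error or, for a local service, acceptable.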

BoxAgnts' model abstraction layer solves the essential problem of "one set of code adapting to all APIs":

┌─────────────────────────────────────────────────────────┐
│  boxagnts-query (Agent reasoning loop)                  │
│  Only uses ProviderRequest / ProviderResponse           │
└───────────────────────────┬─────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────┐
│  LlmProvider trait                                      │
│  + ProviderRegistry (40+ providers)                     │
├─────────────┬─────────────┬─────────────┬───────────────┤
│ Anthropic   │ OpenAI      │ Google      │ OpenAiCompat  │
│ Provider    │ Provider    │ Provider    │ (30+ vendors) │
│ (Near-zero  │ (Full format│ (Independent│ (Shares OpenAI│
│ conversion) │ conversion) │ format      │ conversion    │
│             │             │ conversion) │ + Quirks)     │
└─────────────┴─────────────┴─────────────┴───────────────┘

Three key capabilities:

Models are switched with the --model parameter alone.
run_query_loop() has no idea what's underneath.

This is not a simple "adapter pattern"; it is a production-grade abstraction validated against 40+ real-world APIs.

Source: dev.to (original article)