# BoxAgnts Introduction (7) — OpenAI API and Anthropic API

> Source: <https://dev.to/guyoung/boxagnts-introduction-7-openai-api-and-anthropic-api-11o7>
> Published: 2026-05-31 08:03:58+00:00

The 2025 AI model market is in full bloom. But each provider has its own API format, authentication method, and streaming protocol. BoxAgnts' design goal: **users switch models by changing just one parameter, with all internal logic remaining unchanged**.

This article dissects this abstraction across four levels.

The `LlmProvider` trait defines a "model provider". Everything starts with the interface definition:

```
// boxagnts-api/src/provider.rs
#[async_trait]
pub trait LlmProvider: Send + Sync {
    fn id(&self) -> &ProviderId;                              // Unique identifier
    fn name(&self) -> &str;                                   // Human-readable name

    async fn create_message(                                  // Non-streaming request
        &self,
        request: ProviderRequest,
    ) -> Result<ProviderResponse, ProviderError>;

    async fn create_message_stream(                           // Streaming request
        &self,
        request: ProviderRequest,
    ) -> Result<
        Pin<Box<dyn Stream<Item = Result<StreamEvent, ProviderError>> + Send>>,
        ProviderError,
    >;

    async fn list_models(&self) -> Result<Vec<ModelInfo>, ProviderError>;  // Model list
    async fn check_connectivity(&self) -> Result<ProviderStatus, ProviderError>; // Health check
    fn capabilities(&self) -> ProviderCapabilities;           // Capability declaration
}
```

Both input and output use provider-agnostic unified types:

```
pub struct ProviderRequest {
    pub model: String,
    pub messages: Vec<Message>,          // Unified conversation format
    pub system_prompt: Option<SystemPrompt>,
    pub tools: Vec<ToolDefinition>,      // Unified tool definitions
    pub max_tokens: u32,
    pub temperature: Option<f64>,
    pub thinking: Option<ThinkingConfig>, // Deep thinking configuration
    pub provider_options: Value,          // Provider-specific parameters
}

pub struct ProviderResponse {
    pub id: String,
    pub content: Vec<ContentBlock>,      // Unified content blocks
    pub stop_reason: StopReason,         // Unified stop reason
    pub usage: UsageInfo,                // Token usage
    pub model: String,
}
```

The core value of the normalization layer: **whether the underlying model is Claude, GPT, or Gemini, upper-layer code only sees ProviderRequest and ProviderResponse**.

```
// boxagnts-api/src/registry.rs
pub struct ProviderRegistry {
    providers: HashMap<ProviderId, Arc<dyn LlmProvider>>,
    default_provider_id: ProviderId,
}

fn provider_from_key(provider_id: &str, key: String) -> Option<Arc<dyn LlmProvider>> {
    match provider_id {
        // Native implementations — each with its own API format
        "anthropic" => Some(Arc::new(AnthropicProvider::from_config(...))),
        "openai"    => Some(Arc::new(OpenAiProvider::new(key))),
        "google"    => Some(Arc::new(GoogleProvider::new(key))),
        "github-copilot" => Some(Arc::new(CopilotProvider::new(key))),
        "cohere"    => Some(Arc::new(CohereProvider::new(key))),

        // OpenAI-compatible providers — share the same conversion logic, only change base_url
        "deepseek", "groq", "ollama", "mistral", "xai",
        "perplexity", "openrouter", "siliconflow", "moonshot",
        "zhipu", "stepfun", "fireworks", "llamacpp",
        "sambanova", "huggingface", "nvidia", "cerebras",
        // ... 30+ OpenAI-compatible providers in total
        _ => None,
    }
}
```

Three implementation strategies:

| Type | Representative | Conversion Strategy | Count |
|---|---|---|---|
| Native Anthropic | claude-sonnet-4-5 | Near-zero conversion (internal format = Anthropic format) | 1 |
| Native OpenAI | gpt-4o, o3 | ProviderRequest → Chat Completions | 1 |
| Native Google | gemini-2.5-flash | ProviderRequest → generateContent | 1 |
| OpenAI Compatible | deepseek, groq, ollama, etc. | Same logic as OpenAI, only URL changes | 30+ |
| Other Native | github-copilot, cohere | Independent format conversion | 3+ |
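To make the "same logic, only the URL changes" strategy concrete, here is a minimal sketch of a registry that stores every OpenAI-compatible vendor as the same struct with a different base URL. The `Provider` trait, `OpenAiCompat`, `Registry`, the fallback-to-default behavior, and the example URLs are all hypothetical simplifications for illustration, not BoxAgnts' actual code.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, synchronous stand-in for the article's `LlmProvider` trait.
trait Provider: Send + Sync {
    fn id(&self) -> &str;
    fn base_url(&self) -> &str;
}

/// One struct serves every OpenAI-compatible vendor; only the URL differs.
struct OpenAiCompat {
    id: String,
    base_url: String,
}

impl Provider for OpenAiCompat {
    fn id(&self) -> &str {
        &self.id
    }
    fn base_url(&self) -> &str {
        &self.base_url
    }
}

struct Registry {
    providers: HashMap<String, Arc<dyn Provider>>,
    default_id: String,
}

impl Registry {
    fn new(default_id: &str) -> Self {
        Registry {
            providers: HashMap::new(),
            default_id: default_id.to_string(),
        }
    }

    fn register(&mut self, provider: Arc<dyn Provider>) {
        self.providers.insert(provider.id().to_string(), provider);
    }

    /// Look up a provider by id; fall back to the default when the id is
    /// unknown (this fallback is an assumption of the sketch).
    fn resolve(&self, id: &str) -> Option<Arc<dyn Provider>> {
        self.providers
            .get(id)
            .or_else(|| self.providers.get(&self.default_id))
            .cloned()
    }
}

/// Build a registry with two compatible vendors (placeholder ids and URLs).
fn example_registry() -> Registry {
    let mut reg = Registry::new("vendor-a");
    for (id, url) in [
        ("vendor-a", "https://a.example/v1"),
        ("vendor-b", "https://b.example/v1"),
    ] {
        reg.register(Arc::new(OpenAiCompat {
            id: id.to_string(),
            base_url: url.to_string(),
        }));
    }
    reg
}
```

Because callers only hold an `Arc<dyn Provider>`, adding another compatible vendor is one more `(id, url)` pair rather than a new implementation.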

Anthropic, OpenAI, Google Gemini — three APIs with vast differences in message format. Understanding these differences is essential to understanding the value of the conversion layer.

| Feature | Anthropic | OpenAI | Google Gemini |
|---|---|---|---|
| Location | Top-level `"system"` field | `messages[0]`, `role:"system"` | Top-level `"systemInstruction"` field |
| Type | string or ContentBlock array | string only | content parts array only |

```
// Anthropic — top-level standalone field
{"model": "claude-sonnet-4-5", "system": "You are helpful.", "messages": [...]}

// OpenAI — embedded in messages array
{"model": "gpt-4o", "messages": [{"role":"system","content":"You are helpful."}, ...]}

// Google — uses systemInstruction field, structure differs from messages
{
  "systemInstruction": {"parts": [{"text": "You are helpful."}]},
  "contents": [{"role": "user", "parts": [{"text": "Hello"}]}]
}
```

| Feature | Anthropic | OpenAI | Google Gemini |
|---|---|---|---|
| Field | `"tools": [{name, description, input_schema}]` | `"tools": [{type:"function", function:{...}}]` | `"tools": [{functionDeclarations: [{name, description, parameters}]}]` |
| Wrapping Layers | 0 | 1 | 1, with different nesting names |

```
// Anthropic — native block in content array
{"content": [{"type":"tool_use", "id":"toolu_01A", "name":"read", "input": {...}}]}

// OpenAI — standalone tool_calls array, arguments is JSON string
{"tool_calls": [{"id":"call_abc", "function": {"name":"read", "arguments": "{\"path\":\"...\"}"}}]}

// Google — functionCall embedded in parts, args is JSON object
{"candidates": [{"content": {"parts": [{"functionCall": {"name":"read", "args": {...}}}]}}]}
// Anthropic — tool_result is a block in the user message content array
{"role":"user", "content": [{"type":"tool_result", "tool_use_id":"toolu_01A", "content":"..."}]}

// OpenAI — requires a separate role: "tool" message
{"role":"tool", "tool_call_id":"call_abc", "content":"..."}

// Google — functionResponse embedded in user content parts
{"role":"user", "parts": [{"functionResponse": {"name":"read", "response": {...}}}]}
```

| Anthropic | OpenAI | Google Gemini |
|---|---|---|
| `user` | `user` | `user` |
| `assistant` | `assistant` | `model` |

Google uses `model` instead of `assistant`: this is the most easily overlooked and most error-prone difference.
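The role difference is small enough to capture in one function. The sketch below is illustrative only: `Role`, `WireFormat`, and `wire_role` are hypothetical names, and the mapping covers just the two roles shown in the table.

```rust
/// Normalized roles, as used by the provider-agnostic message type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Role {
    User,
    Assistant,
}

/// Which wire format the request is being serialized for.
#[derive(Clone, Copy, Debug, PartialEq)]
enum WireFormat {
    Anthropic,
    OpenAi,
    Google,
}

/// Map a normalized role to the string each API expects.
fn wire_role(format: WireFormat, role: Role) -> &'static str {
    match (format, role) {
        (_, Role::User) => "user",
        // Google is the only one of the three that says "model".
        (WireFormat::Google, Role::Assistant) => "model",
        (_, Role::Assistant) => "assistant",
    }
}
```

Keeping the mapping in a single place means a forgotten `model` conversion shows up as one failing test instead of a rejected request at runtime.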

`OpenAiProvider` is the most complete example of the conversion layer:

```
// boxagnts-api/src/providers/openai.rs
impl OpenAiProvider {
    fn to_openai_messages(
        messages: &[Message],
        system_prompt: Option<&SystemPrompt>,
    ) -> Vec<Value> {
        let mut result: Vec<Value> = Vec::new();

        // Step 1: system prompt → role: "system" message
        if let Some(sys) = system_prompt {
            result.push(json!({"role": "system", "content": sys_text}));
        }

        for msg in messages {
            match msg.role {
                Role::User => {
                    // User messages may mix text and tool_result blocks
                    // tool_result needs to be split into separate role: "tool" messages
                    Self::append_user_messages(&mut result, &msg.content);
                }
                Role::Assistant => {
                    let (text, tool_calls) = Self::assistant_content_to_openai(&msg.content);
                    result.push(json!({
                        "role": "assistant",
                        "content": text,
                        "tool_calls": tool_calls
                    }));
                }
            }
        }
        result
    }

    fn to_openai_tools(tools: &[ToolDefinition]) -> Vec<Value> {
        tools.iter().map(|td| {
            json!({
                "type": "function",
                "function": {
                    "name": td.name,
                    "description": td.description,
                    "parameters": td.input_schema
                }
            })
        }).collect()
    }
}
```
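The body of `append_user_messages` is not shown above. As a rough sketch of the splitting it describes, the following assumes simplified stand-ins for the content and message types (`ContentBlock` with only two variants, and a hypothetical `OpenAiMessage` enum instead of JSON values). Emitting tool results before the user text is a choice made here so that `role: "tool"` messages directly follow the assistant message that issued the calls.

```rust
/// Simplified stand-in for the normalized content block type.
#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text(String),
    ToolResult { tool_use_id: String, content: String },
}

/// Hypothetical typed view of the OpenAI messages this step produces.
#[derive(Debug, Clone, PartialEq)]
enum OpenAiMessage {
    User { content: String },
    Tool { tool_call_id: String, content: String },
}

/// Split one normalized user message into OpenAI-style messages:
/// each tool_result becomes its own tool message, remaining text is
/// joined into a single user message.
fn split_user_blocks(blocks: &[ContentBlock]) -> Vec<OpenAiMessage> {
    let mut out = Vec::new();
    let mut text_parts: Vec<&str> = Vec::new();

    for block in blocks {
        match block {
            ContentBlock::ToolResult { tool_use_id, content } => {
                out.push(OpenAiMessage::Tool {
                    tool_call_id: tool_use_id.clone(),
                    content: content.clone(),
                });
            }
            ContentBlock::Text(text) => text_parts.push(text.as_str()),
        }
    }

    // Only emit a user message when there is text left over.
    if !text_parts.is_empty() {
        out.push(OpenAiMessage::User {
            content: text_parts.join("\n"),
        });
    }
    out
}
```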

The most complex part is tool_use_id sanitization: Anthropic's tool IDs (e.g., `toolu_01Bx...`) may contain characters that OpenAI does not accept.
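The article does not show the sanitization rules, so the following is only a sketch of one plausible approach. The allowed character set (ASCII letters, digits, `_`, `-`) and the length cap are assumptions, and `sanitize_tool_id` is a hypothetical name.

```rust
/// Replace every character outside an assumed safe set with '_' and
/// cap the result at `max_len` characters.
fn sanitize_tool_id(id: &str, max_len: usize) -> String {
    id.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '_' || c == '-' {
                c
            } else {
                '_'
            }
        })
        .take(max_len)
        .collect()
}
```

Whatever rule is used, it has to be applied identically to the ID in the assistant's `tool_calls` entry and to the `tool_call_id` of the matching tool message, otherwise the two no longer pair up. Note that a lossy mapping like this one can in principle make two distinct IDs collide.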

`GoogleProvider` shows how to handle an API format that is different from both Anthropic and OpenAI:

```
// boxagnts-api/src/providers/google.rs
// URL pattern completely different from OpenAI's /v1/chat/completions
fn generate_url(&self, model: &str) -> String {
    format!(
        "{}/v1beta/models/{}:generateContent?key={}",
        self.base_url, model, self.api_key  // API Key in URL query parameters!
    )
}
```

Key differences from OpenAI:

| Difference | Google Gemini | OpenAI |
|---|---|---|
| API Key Location | URL query parameter `?key=` | HTTP Header `Authorization: Bearer` |
| Endpoint Format | `/v1beta/models/{model}:generateContent` | `/v1/chat/completions` |
| Streaming Endpoint | `/v1beta/models/{model}:streamGenerateContent?alt=sse` | `/v1/chat/completions` + `stream:true` |
| Message Roles | `user` / `model` (not assistant) | `user` / `assistant` |
| Tool Results | `functionResponse` in parts | Separate `role: tool` message |
| Image Input | `inlineData` base64 | `image_url` or content parts |
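The two Gemini endpoint shapes from the table can be produced by one helper. This is a sketch under assumptions: `gemini_url` is a hypothetical free function rather than the `GoogleProvider` method, and it does no percent-encoding of the model name or key.

```rust
/// Build the Gemini request URL for either the plain or the streaming
/// endpoint, with the API key passed as a query parameter.
fn gemini_url(base_url: &str, model: &str, api_key: &str, stream: bool) -> String {
    // Tolerate a trailing slash on the configured base URL.
    let base = base_url.trim_end_matches('/');
    if stream {
        format!(
            "{}/v1beta/models/{}:streamGenerateContent?alt=sse&key={}",
            base, model, api_key
        )
    } else {
        format!(
            "{}/v1beta/models/{}:generateContent?key={}",
            base, model, api_key
        )
    }
}
```

Since the key ends up inside the URL, any code that logs request URLs would need to redact it, which is not a concern with header-based authentication.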

`ThinkingConfig` is the normalized deep thinking configuration, but providers handle it in completely different ways:

```
// Normalized configuration
pub struct ThinkingConfig {
    pub budget_tokens: u32,   // Thinking token budget
}

// When building ProviderRequest, decides whether to pass based on provider capabilities
let provider_request = ProviderRequest {
    // ...
    thinking: if caps.thinking {
        effective_thinking_budget
            .map(|b| ThinkingConfig::enabled(b))
    } else {
        None  // This provider doesn't support thinking, don't pass
    },
};
```

| Provider | Thinking Support | How It's Passed |
|---|---|---|
| Anthropic (Claude 3.7+) | ✓ | `"thinking": {"type": "enabled", "budget_tokens": N}` |
| Google (Gemini 2.5+) | ✓ | `"thinkingConfig": {"thinkingBudget": N}` |
| OpenAI (o1/o3 series) | Partial | Via `reasoning_effort` parameter |
| Other OpenAI Compatible | Mostly unsupported | Not passed |
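The table can be read as a single mapping from one normalized budget to per-provider request fragments. The sketch below builds those fragments as plain strings to stay dependency-free. `Family` and `thinking_fragment` are hypothetical names, and the thresholds that translate a token budget into a `reasoning_effort` level are invented for illustration, since the article does not say how that translation is done.

```rust
/// Coarse provider families, matching the rows of the table.
#[derive(Clone, Copy)]
enum Family {
    Anthropic,
    Google,
    OpenAiReasoning,
    OtherCompat,
}

/// Turn a normalized thinking budget into the JSON fragment a provider
/// family expects, or `None` when nothing should be sent.
fn thinking_fragment(family: Family, budget_tokens: u32) -> Option<String> {
    match family {
        Family::Anthropic => Some(format!(
            "\"thinking\": {{\"type\": \"enabled\", \"budget_tokens\": {}}}",
            budget_tokens
        )),
        Family::Google => Some(format!(
            "\"thinkingConfig\": {{\"thinkingBudget\": {}}}",
            budget_tokens
        )),
        Family::OpenAiReasoning => {
            // Assumed thresholds: the API takes a level, not a token count.
            let effort = if budget_tokens < 4_096 {
                "low"
            } else if budget_tokens < 16_384 {
                "medium"
            } else {
                "high"
            };
            Some(format!("\"reasoning_effort\": \"{}\"", effort))
        }
        // Mostly unsupported: send nothing rather than risk a 400.
        Family::OtherCompat => None,
    }
}
```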

At request construction time, `ProviderCapabilities` declares each provider's capabilities:

```
pub struct ProviderCapabilities {
    pub thinking: bool,              // Whether deep thinking is supported
    pub prompt_caching: bool,        // Whether prompt caching is supported
    pub image_input: bool,           // Whether image input is supported
    pub native_tool_use: bool,       // Whether native tool calling exists
    pub supports_streaming: bool,    // Whether streaming responses are supported
    // ...
}
```

The APIs of OpenAI-compatible providers are roughly compatible, but each has subtle differences. `ProviderQuirks` handles these:

```
pub struct ProviderQuirks {
    /// Specific error message patterns for context overflow
    pub overflow_patterns: Vec<String>,
    /// Local services that don't require API Keys (e.g., Ollama, LM Studio)
    pub no_api_key_required: bool,
    /// Whether streaming responses include usage info
    pub include_usage_in_stream: bool,
    /// Providers like DeepSeek need the reasoning_content field
    pub reasoning_field: Option<String>,
}
```

For example, DeepSeek's streaming response returns reasoning content under a field name different from OpenAI's, which is adapted via `reasoning_field`. Ollama's context overflow error message is `"exceeds the available context size"`, while LM Studio's is `"greater than the context length"`; both are adapted via `overflow_patterns`.
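A minimal sketch of how `overflow_patterns` might be consulted when classifying an error body follows. The `Quirks` struct is a trimmed stand-in for `ProviderQuirks`, `is_context_overflow` is a hypothetical helper, and case-insensitive substring matching is an assumption.

```rust
/// Trimmed stand-in for `ProviderQuirks`, keeping only the field used here.
struct Quirks {
    overflow_patterns: Vec<String>,
}

/// Return true when the provider's error text matches any known
/// context-overflow pattern (case-insensitive substring match).
fn is_context_overflow(quirks: &Quirks, error_message: &str) -> bool {
    let haystack = error_message.to_lowercase();
    quirks
        .overflow_patterns
        .iter()
        .any(|pattern| haystack.contains(&pattern.to_lowercase()))
}
```

A match would then be surfaced as `ProviderError::ContextOverflow` instead of a generic invalid-request error, which is what lets the caller react to it specifically.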

Streaming responses are also completely different across the three APIs:

| Feature | Anthropic (SSE) | OpenAI (SSE) | Google (SSE) |
|---|---|---|---|
| Event Granularity | High: 6 event types (start/delta/stop × 2) | Low: each chunk is a complete delta | Medium: pushed by chunk, but structure is flat |
| Tool Call Increment | Fragmented send of `input_json_delta` | Fragmented `arguments` string, keyed by tool call `index` | Single send of complete `functionCall` |
| Termination Signal | `message_stop` event | `data: [DONE]` marker | Stream ends naturally |
| Need to Reassemble by index | Yes (reassemble by index for multiple tool_use) | Yes | Yes |

All three formats are normalized to the same `StreamEvent` enum:

```
pub enum StreamEvent {
    MessageStart { id, model, usage },
    ContentBlockStart { index, content_block },
    TextDelta { text },
    ThinkingDelta { thinking },
    InputJsonDelta { index, partial_json },
    ContentBlockStop { index },
    MessageDelta { stop_reason, usage },
    MessageStop,
}
```
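On the consuming side, the normalized events still need to be reassembled by index. The sketch below uses a reduced `StreamEvent` with concrete field types (the article's enum elides them) and a hypothetical `Accumulator`. It treats the tool input as an opaque string and does not parse the JSON.

```rust
use std::collections::BTreeMap;

/// Reduced version of the normalized event enum, with assumed field types.
enum StreamEvent {
    TextDelta { text: String },
    InputJsonDelta { index: usize, partial_json: String },
    ContentBlockStop { index: usize },
    MessageStop,
}

/// Collects streamed text and per-index tool input fragments.
#[derive(Default)]
struct Accumulator {
    text: String,
    pending: BTreeMap<usize, String>,
    finished: Vec<(usize, String)>,
}

impl Accumulator {
    /// Feed one event; returns true once the message is complete.
    fn push(&mut self, event: StreamEvent) -> bool {
        match event {
            StreamEvent::TextDelta { text } => self.text.push_str(&text),
            StreamEvent::InputJsonDelta { index, partial_json } => {
                // Fragments for different tool calls may interleave,
                // so each index gets its own buffer.
                self.pending.entry(index).or_default().push_str(&partial_json);
            }
            StreamEvent::ContentBlockStop { index } => {
                // A stop for a text block has no pending buffer and is ignored.
                if let Some(json) = self.pending.remove(&index) {
                    self.finished.push((index, json));
                }
            }
            StreamEvent::MessageStop => return true,
        }
        false
    }
}
```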

Each provider's error format is also different:

```
// Unified error types
pub enum ProviderError {
    Auth { ... },             // Authentication failure
    RateLimited { ... },      // Rate limiting
    ContextOverflow { ... },  // Context exceeds window (matched via ProviderQuirks)
    InvalidRequest { ... },   // Invalid request parameters
    ServerError { ... },      // Server error
    StreamError { ... },      // Stream interruption
    Other { ... },            // Unknown error
}
```

In the query loop, specific errors trigger specific recovery strategies:

```
RateLimited / Overloaded → Switch to fallback_model
ContextOverflow → Trigger auto_compact
StreamError (stall) → Retry (max 2 times, 45s timeout)
Auth → Unrecoverable, return error
```
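The recovery table above maps naturally onto a single `match`. This sketch uses a field-less `ProviderError` and a hypothetical `Recovery` enum. Treating `Other` as unrecoverable is an assumption, and the 45s stall timeout is left out because it belongs to the stream reader rather than to this decision.

```rust
/// Field-less stand-in for the unified error type.
#[derive(Debug, PartialEq)]
enum ProviderError {
    Auth,
    RateLimited,
    ContextOverflow,
    StreamError,
    Other,
}

/// What the query loop should do next (hypothetical names).
#[derive(Debug, PartialEq)]
enum Recovery {
    SwitchToFallbackModel,
    AutoCompact,
    Retry,
    GiveUp,
}

/// Retry cap for stalled streams, taken from the table above.
const MAX_STREAM_RETRIES: u32 = 2;

/// Pick a recovery strategy for an error, given how many stream
/// retries have already been spent on this request.
fn plan_recovery(error: &ProviderError, retries_so_far: u32) -> Recovery {
    match error {
        ProviderError::RateLimited => Recovery::SwitchToFallbackModel,
        ProviderError::ContextOverflow => Recovery::AutoCompact,
        ProviderError::StreamError if retries_so_far < MAX_STREAM_RETRIES => Recovery::Retry,
        // Auth is unrecoverable; exhausted retries and unknown errors
        // are also surfaced to the caller in this sketch.
        ProviderError::Auth | ProviderError::StreamError | ProviderError::Other => {
            Recovery::GiveUp
        }
    }
}
```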

BoxAgnts defines environment variable name mappings for each provider:

```
// boxagnts-workspace/src/config.rs
pub fn api_key_env_vars_for_provider(provider_id: &str) -> &'static [&'static str] {
    match provider_id {
        "anthropic" => &["ANTHROPIC_API_KEY"],
        "openai" => &["OPENAI_API_KEY"],
        "google" => &["GOOGLE_API_KEY", "GOOGLE_GENERATIVE_AI_API_KEY"],
        "deepseek" => &["DEEPSEEK_API_KEY"],
        "mistral" => &["MISTRAL_API_KEY"],
        "xai" => &["XAI_API_KEY"],
        "zhipu" => &["ZHIPU_API_KEY"],
        // ... 40+ provider environment variables
    }
}
```

Three-tier priority: `Environment Variables > User Config JSON > No Default`. This design supports different scenarios such as multi-tenancy, CI/CD, and local development.
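The priority order can be sketched as a small resolver. `resolve_api_key` is a hypothetical helper, not the function in `config.rs`. The environment lookup is passed in as a closure so the logic can be exercised without touching the real process environment, and skipping empty values is an assumption.

```rust
/// Resolve an API key: the first non-empty environment variable wins,
/// then the user config value, otherwise there is no key at all.
fn resolve_api_key<F>(env_vars: &[&str], get_env: F, config_key: Option<&str>) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    env_vars
        .iter()
        .filter_map(|name| get_env(*name))
        .find(|value| !value.trim().is_empty())
        .or_else(|| {
            config_key
                .filter(|key| !key.trim().is_empty())
                .map(|key| key.to_string())
        })
}
```

In real use the closure would typically be `|name| std::env::var(name).ok()`, with `env_vars` coming from `api_key_env_vars_for_provider`.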

BoxAgnts' model abstraction layer solves the essential problem of "one set of code adapting to all APIs":

```
┌──────────────────────────────────────────────┐
│  boxagnts-query (Agent reasoning loop)        │
│  Only uses ProviderRequest / ProviderResponse │
└────────────────────┬─────────────────────────┘
                     │
┌────────────────────▼─────────────────────────┐
│  LlmProvider trait                            │
│  + ProviderRegistry (40+ providers)           │
├──────────┬──────────┬──────────┬─────────────┤
│Anthropic │ OpenAI   │ Google   │ OpenAiCompat │
│Provider  │ Provider │ Provider │ (30+ vendors)│
│(Near-zero│ (Full    │ (Independent│ (Shares    │
│ conversion)│ format  │ format    │ OpenAI      │
│          │ conversion)│ conversion)│ conversion  │
│          │          │          │ +Quirks)     │
└──────────┴──────────┴──────────┴─────────────┘
```

Key capabilities:

- Models are switched with just the `--model` parameter.
- `run_query_loop()` has no idea what's underneath.

This is not a simple "adapter pattern"; it is a production-grade abstraction validated against 40+ real-world APIs.
