
BoxAgnts Introduction (7) — OpenAI API and Anthropic API

BoxAgnts has built a model-agnostic abstraction layer that lets users switch between AI providers—including OpenAI, Anthropic, and Google Gemini—by changing a single parameter. The system normalizes each provider's unique API format, authentication, and streaming protocol into unified `ProviderRequest` and `ProviderResponse` types, supporting over 30 providers through native and OpenAI-compatible implementations. The `LlmProvider` trait and `ProviderRegistry` handle all internal conversion logic, enabling seamless model swapping without altering upper-layer code.

9 min read · Published May 31, 2026

The 2025 AI model market is in full bloom. But each provider has its own API format, authentication method, and streaming protocol. BoxAgnts' design goal: users switch models by changing just one parameter, with all internal logic remaining unchanged.

This article dissects this abstraction across four levels.

The LlmProvider trait defines a "model provider". Everything starts with the interface definition:

// boxagnts-api/src/provider.rs
#[async_trait]
pub trait LlmProvider: Send + Sync {
    fn id(&self) -> &ProviderId;                              // Unique identifier
    fn name(&self) -> &str;                                   // Human-readable name

    async fn create_message(                                  // Non-streaming request
        &self,
        request: ProviderRequest,
    ) -> Result<ProviderResponse, ProviderError>;

    async fn create_message_stream(                           // Streaming request
        &self,
        request: ProviderRequest,
    ) -> Result<
        Pin<Box<dyn Stream<Item = Result<StreamEvent, ProviderError>> + Send>>,
        ProviderError,
    >;

    async fn list_models(&self) -> Result<Vec<ModelInfo>, ProviderError>;  // Model list
    async fn check_connectivity(&self) -> Result<ProviderStatus, ProviderError>; // Health check
    fn capabilities(&self) -> ProviderCapabilities;           // Capability declaration
}

Both input and output use provider-agnostic unified types:

pub struct ProviderRequest {
    pub model: String,
    pub messages: Vec<Message>,          // Unified conversation format
    pub system_prompt: Option<SystemPrompt>,
    pub tools: Vec<ToolDefinition>,      // Unified tool definitions
    pub max_tokens: u32,
    pub temperature: Option<f64>,
    pub thinking: Option<ThinkingConfig>, // Deep thinking configuration
    pub provider_options: Value,          // Provider-specific parameters
}

pub struct ProviderResponse {
    pub id: String,
    pub content: Vec<ContentBlock>,      // Unified content blocks
    pub stop_reason: StopReason,         // Unified stop reason
    pub usage: UsageInfo,                // Token usage
    pub model: String,
}

The core value of the normalization layer: whether the underlying model is Claude, GPT, or Gemini, upper-layer code only sees ProviderRequest and ProviderResponse.

// boxagnts-api/src/registry.rs
pub struct ProviderRegistry {
    providers: HashMap<ProviderId, Arc<dyn LlmProvider>>,
    default_provider_id: ProviderId,
}

fn provider_from_key(provider_id: &str, key: String) -> Option<Arc<dyn LlmProvider>> {
    match provider_id {
        // Native implementations — each with its own API format
        "anthropic" => Some(Arc::new(AnthropicProvider::from_config(...))),
        "openai"    => Some(Arc::new(OpenAiProvider::new(key))),
        "google"    => Some(Arc::new(GoogleProvider::new(key))),
        "github-copilot" => Some(Arc::new(CopilotProvider::new(key))),
        "cohere"    => Some(Arc::new(CohereProvider::new(key))),

        // OpenAI-compatible providers — share the same conversion logic, only change base_url
        "deepseek", "groq", "ollama", "mistral", "xai",
        "perplexity", "openrouter", "siliconflow", "moonshot",
        "zhipu", "stepfun", "fireworks", "llamacpp",
        "sambanova", "huggingface", "nvidia", "cerebras",
        // ... 30+ OpenAI-compatible providers in total
        _ => None,
    }
}
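
The lookup side of such a registry can be sketched with the standard library alone. This is a simplified, synchronous stand-in, not the BoxAgnts code: `Provider`, `EchoProvider`, `Registry`, `register`, and `resolve` are hypothetical names, and falling back to the default provider for an unknown id is an assumption made for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified, synchronous stand-in for `LlmProvider` (hypothetical).
pub trait Provider: Send + Sync {
    fn id(&self) -> &str;
    fn complete(&self, prompt: &str) -> String;
}

/// Toy provider that tags the prompt with its own id (illustration only).
pub struct EchoProvider {
    id: String,
}

impl EchoProvider {
    pub fn new(id: &str) -> Self {
        Self { id: id.to_string() }
    }
}

impl Provider for EchoProvider {
    fn id(&self) -> &str {
        &self.id
    }

    fn complete(&self, prompt: &str) -> String {
        format!("[{}] {}", self.id, prompt)
    }
}

/// Registry keyed by provider id, with a default used as the fallback.
pub struct Registry {
    providers: HashMap<String, Arc<dyn Provider>>,
    default_provider_id: String,
}

impl Registry {
    pub fn new(default_provider_id: &str) -> Self {
        Self {
            providers: HashMap::new(),
            default_provider_id: default_provider_id.to_string(),
        }
    }

    pub fn register(&mut self, provider: Arc<dyn Provider>) {
        self.providers.insert(provider.id().to_string(), provider);
    }

    /// Returns the requested provider, else the default, else `None`.
    /// The fallback to the default is an assumption of this sketch.
    pub fn resolve(&self, id: &str) -> Option<Arc<dyn Provider>> {
        self.providers
            .get(id)
            .or_else(|| self.providers.get(&self.default_provider_id))
            .cloned()
    }
}
```

Callers hold only an `Arc<dyn Provider>`, so swapping "openai" for "anthropic" changes the id passed to `resolve` and nothing else.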

Three implementation strategies:

| Type | Representative | Conversion Strategy | Count |
|---|---|---|---|
| Native Anthropic | claude-sonnet-4-5 | Near-zero conversion (internal format = Anthropic format) | 1 |
| Native OpenAI | gpt-4o, o3 | ProviderRequest → Chat Completions | 1 |
| Native Google | gemini-2.5-flash | ProviderRequest → generateContent | 1 |
| OpenAI Compatible | deepseek, groq, ollama, etc. | Same logic as OpenAI, only URL changes | 30+ |
| Other Native | github-copilot, cohere | Independent format conversion | 3+ |

The Anthropic, OpenAI, and Google Gemini APIs differ widely in message format. Understanding these differences is essential to understanding the value of the conversion layer.

| Feature | Anthropic | OpenAI | Google Gemini |
|---|---|---|---|
| Location | Top-level "system" field | messages[0], role:"system" | Top-level "systemInstruction" field |
| Type | string or ContentBlock array | string only | content parts array only |

// Anthropic — top-level standalone field
{"model": "claude-sonnet-4-5", "system": "You are helpful.", "messages": [...]}

// OpenAI — embedded in messages array
{"model": "gpt-4o", "messages": [{"role":"system","content":"You are helpful."}, ...]}

// Google — uses systemInstruction field, structure differs from messages
{
  "systemInstruction": {"parts": [{"text": "You are helpful."}]},
  "contents": [{"role": "user", "parts": [{"text": "Hello"}]}]
}

| Feature | Anthropic | OpenAI | Google Gemini |
|---|---|---|---|
| Field | "tools": [{name, description, input_schema}] | "tools": [{type:"function", function:{...}}] | "tools": [{functionDeclarations: [{name, description, parameters}]}] |
| Wrapping Layers | 0 | 1 | 1, with different nesting names |

// Anthropic — native block in content array
{"content": [{"type":"tool_use", "id":"toolu_01A", "name":"read", "input": {...}}]}

// OpenAI — standalone tool_calls array, arguments is JSON string
{"tool_calls": [{"id":"call_abc", "function": {"name":"read", "arguments": "{\"path\":\"...\"}"}}]}

// Google — functionCall embedded in parts, args is JSON object
{"candidates": [{"content": {"parts": [{"functionCall": {"name":"read", "args": {...}}}]}}]}
// Anthropic — tool_result is a block in the user message content array
{"role":"user", "content": [{"type":"tool_result", "tool_use_id":"toolu_01A", "content":"..."}]}

// OpenAI — requires a separate role: "tool" message
{"role":"tool", "tool_call_id":"call_abc", "content":"..."}

// Google — functionResponse embedded in user content parts
{"role":"user", "parts": [{"functionResponse": {"name":"read", "response": {...}}}]}

| Anthropic | OpenAI | Google Gemini |
|---|---|---|
| user | user | user |
| assistant | assistant | model |

Google uses `model` instead of `assistant`; this is the most easily overlooked but most error-prone difference.
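
The role difference is small enough to capture in a single function. The sketch below is illustrative only: `Role`, `WireFormat`, and `wire_role` are hypothetical names, not BoxAgnts types.

```rust
/// Unified conversation roles (hypothetical, mirroring the article's table).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    User,
    Assistant,
}

/// Which wire format the request is being serialized for (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireFormat {
    Anthropic,
    OpenAi,
    Google,
}

/// Maps a unified role to the string each API expects.
pub fn wire_role(role: Role, format: WireFormat) -> &'static str {
    match (role, format) {
        (Role::User, _) => "user",
        // Google is the only one of the three that says "model".
        (Role::Assistant, WireFormat::Google) => "model",
        (Role::Assistant, _) => "assistant",
    }
}
```

Keeping the mapping in one place means a forgotten `model` rename shows up as a single failing match arm rather than a rejected request at runtime.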

OpenAiProvider is the most complete example of the conversion layer:

// boxagnts-api/src/providers/openai.rs
impl OpenAiProvider {
    fn to_openai_messages(
        messages: &[Message],
        system_prompt: Option<&SystemPrompt>,
    ) -> Vec<Value> {
        let mut result: Vec<Value> = Vec::new();

        // Step 1: system prompt → role: "system" message
        if let Some(sys) = system_prompt {
            // `sys_text` is the plain-text rendering of `sys`; how it is derived is elided in this excerpt
            result.push(json!({"role": "system", "content": sys_text}));
        }

        for msg in messages {
            match msg.role {
                Role::User => {
                    // User messages may mix text and tool_result blocks
                    // tool_result needs to be split into separate role: "tool" messages
                    Self::append_user_messages(&mut result, &msg.content);
                }
                Role::Assistant => {
                    let (text, tool_calls) = Self::assistant_content_to_openai(&msg.content);
                    result.push(json!({
                        "role": "assistant",
                        "content": text,
                        "tool_calls": tool_calls
                    }));
                }
            }
        }
        result
    }

    fn to_openai_tools(tools: &[ToolDefinition]) -> Vec<Value> {
        tools.iter().map(|td| {
            json!({
                "type": "function",
                "function": {
                    "name": td.name,
                    "description": td.description,
                    "parameters": td.input_schema
                }
            })
        }).collect()
    }
}
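
The splitting step mentioned in the comments (a user message that mixes text and tool_result blocks becomes several OpenAI messages) might look roughly like this. It is a sketch under assumptions, not the body of `append_user_messages`: `Block`, `OpenAiMessage`, and `split_user_blocks` are hypothetical, and emitting the tool messages before the user text is a choice made here because OpenAI expects tool results to follow the assistant's tool_calls directly.

```rust
/// Simplified unified content block (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Text(String),
    ToolResult { tool_use_id: String, content: String },
}

/// Simplified OpenAI-side message (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum OpenAiMessage {
    User { content: String },
    Tool { tool_call_id: String, content: String },
}

/// Splits one unified user message into OpenAI messages:
/// each tool_result becomes its own `role: "tool"` message,
/// and any remaining text is joined into a single user message.
pub fn split_user_blocks(blocks: &[Block]) -> Vec<OpenAiMessage> {
    let mut out = Vec::new();
    let mut text_parts: Vec<&str> = Vec::new();

    for block in blocks {
        match block {
            Block::ToolResult { tool_use_id, content } => out.push(OpenAiMessage::Tool {
                tool_call_id: tool_use_id.clone(),
                content: content.clone(),
            }),
            Block::Text(text) => text_parts.push(text.as_str()),
        }
    }

    // Assumption of this sketch: user text goes after the tool messages.
    if !text_parts.is_empty() {
        out.push(OpenAiMessage::User { content: text_parts.join("\n") });
    }
    out
}
```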

The most complex part is tool_use_id sanitization: Anthropic's tool IDs (e.g., toolu_01Bx...) may contain characters that OpenAI does not accept.
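
A sanitizer of this kind could be as simple as the sketch below. The article does not show the real rules, so both the allowed character set (ASCII letters, digits, `_`, `-`) and the 40-character cap are assumptions, and `sanitize_tool_id` is a hypothetical name.

```rust
/// Assumed maximum length for a tool call id (not taken from the article).
pub const MAX_TOOL_ID_LEN: usize = 40;

/// Replaces every character outside `[A-Za-z0-9_-]` with `_`
/// and truncates the result to `MAX_TOOL_ID_LEN` characters.
pub fn sanitize_tool_id(id: &str) -> String {
    id.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '_' || c == '-' {
                c
            } else {
                '_'
            }
        })
        .take(MAX_TOOL_ID_LEN)
        .collect()
}
```

Because replacement and truncation can map two different ids to the same string, the same function has to be applied to both the tool call and its matching tool result, and a real implementation would likely need a collision check as well.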

GoogleProvider shows how to handle an API format that is different from both Anthropic and OpenAI:

// boxagnts-api/src/providers/google.rs
// URL pattern completely different from OpenAI's /v1/chat/completions
fn generate_url(&self, model: &str) -> String {
    format!(
        "{}/v1beta/models/{}:generateContent?key={}",
        self.base_url, model, self.api_key  // API Key in URL query parameters!
    )
}

Key differences from OpenAI:

| Difference | Google Gemini | OpenAI |
|---|---|---|
| API Key Location | URL query parameter ?key= | HTTP Header Authorization: Bearer |
| Endpoint Format | /v1beta/models/{model}:generateContent | /v1/chat/completions |
| Streaming Endpoint | /v1beta/models/{model}:streamGenerateContent?alt=sse | /v1/chat/completions + stream:true |
| Message Roles | user / model (not assistant) | user / assistant |
| Tool Results | functionResponse in parts | Separate role: tool message |
| Image Input | inlineData base64 | image_url or content parts |
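
The endpoint rows of the table translate into URL builders along these lines. This is a sketch, not the BoxAgnts code: `google_endpoint` and `openai_endpoint` are hypothetical helpers, and trimming a trailing slash from the base URL is a convenience added here.

```rust
/// Builds the Gemini endpoint; the model and the API key are both part of the URL.
pub fn google_endpoint(base_url: &str, model: &str, api_key: &str, stream: bool) -> String {
    let base = base_url.trim_end_matches('/');
    if stream {
        // Streaming uses a different method name plus `alt=sse`.
        format!("{base}/v1beta/models/{model}:streamGenerateContent?alt=sse&key={api_key}")
    } else {
        format!("{base}/v1beta/models/{model}:generateContent?key={api_key}")
    }
}

/// Builds the OpenAI-style endpoint; the model and `stream: true` go in the
/// request body and the key goes in an `Authorization: Bearer` header instead.
pub fn openai_endpoint(base_url: &str) -> String {
    format!("{}/v1/chat/completions", base_url.trim_end_matches('/'))
}
```

Since the Google key ends up in the URL, request logging for that provider needs to redact query strings, which is not a concern for header-based authentication.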

ThinkingConfig is the normalized deep thinking configuration, but different providers handle it completely differently:

// Normalized configuration
pub struct ThinkingConfig {
    pub budget_tokens: u32,   // Thinking token budget
}

// When building ProviderRequest, decides whether to pass based on provider capabilities
let provider_request = ProviderRequest {
    // ...
    thinking: if caps.thinking {
        effective_thinking_budget
            .map(|b| ThinkingConfig::enabled(b))
    } else {
        None  // This provider doesn't support thinking, don't pass
    },
};

| Provider | Thinking Support | How It's Passed |
|---|---|---|
| Anthropic (Claude 3.7+) | Yes | "thinking": {"type": "enabled", "budget_tokens": N} |
| Google (Gemini 2.5+) | Yes | "thinkingConfig": {"thinkingBudget": N} |
| OpenAI (o1/o3 series) | Partial | Via reasoning_effort parameter |
| Other OpenAI Compatible | Mostly unsupported | Not passed |
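
To make the table concrete, the sketch below renders one normalized budget into the JSON fragment each family expects. `ThinkingWire` and `thinking_fragment` are hypothetical names, and the budget thresholds used to pick a `reasoning_effort` value are invented for illustration; the article only states that the parameter is used.

```rust
/// Which family of API the thinking setting is serialized for (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingWire {
    Anthropic,
    Google,
    OpenAiReasoning,
    OpenAiCompatible,
}

/// Returns the JSON fragment to merge into the request body,
/// or `None` when nothing should be sent.
pub fn thinking_fragment(wire: ThinkingWire, budget_tokens: u32) -> Option<String> {
    match wire {
        ThinkingWire::Anthropic => Some(format!(
            r#""thinking": {{"type": "enabled", "budget_tokens": {}}}"#,
            budget_tokens
        )),
        ThinkingWire::Google => Some(format!(
            r#""thinkingConfig": {{"thinkingBudget": {}}}"#,
            budget_tokens
        )),
        ThinkingWire::OpenAiReasoning => {
            // Assumed thresholds: a token budget has no exact effort equivalent.
            let effort = if budget_tokens < 4_096 {
                "low"
            } else if budget_tokens < 16_384 {
                "medium"
            } else {
                "high"
            };
            Some(format!(r#""reasoning_effort": "{}""#, effort))
        }
        // Mostly unsupported, so the setting is dropped.
        ThinkingWire::OpenAiCompatible => None,
    }
}
```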

At request construction time, ProviderCapabilities declares each provider's capabilities:

pub struct ProviderCapabilities {
    pub thinking: bool,              // Whether deep thinking is supported
    pub prompt_caching: bool,        // Whether prompt caching is supported
    pub image_input: bool,           // Whether image input is supported
    pub native_tool_use: bool,       // Whether native tool calling exists
    pub supports_streaming: bool,    // Whether streaming responses are supported
    // ...
}

OpenAI-compatible providers' APIs are roughly compatible, but all have subtle differences. ProviderQuirks handles these:

pub struct ProviderQuirks {
    /// Specific error message patterns for context overflow
    pub overflow_patterns: Vec<String>,
    /// Local services that don't require API Keys (e.g., Ollama, LM Studio)
    pub no_api_key_required: bool,
    /// Whether streaming responses include usage info
    pub include_usage_in_stream: bool,
    /// Providers like DeepSeek need the reasoning_content field
    pub reasoning_field: Option<String>,
}

For example, DeepSeek's streaming response returns reasoning content with a field name different from OpenAI's, which is adapted via reasoning_field. Ollama's context overflow error message is "exceeds the available context size", while LM Studio's is "greater than the context length"; both are adapted via overflow_patterns.
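
Matching an error body against overflow patterns can be sketched as a substring search. `Quirks` here is a cut-down stand-in for `ProviderQuirks`, `is_context_overflow` is a hypothetical helper, and case-insensitive matching is an assumption of this sketch.

```rust
/// Cut-down stand-in for `ProviderQuirks` (only the field used here).
pub struct Quirks {
    pub overflow_patterns: Vec<String>,
}

/// Returns true when the provider's error message contains any known
/// context-overflow pattern, ignoring case.
pub fn is_context_overflow(quirks: &Quirks, error_message: &str) -> bool {
    let haystack = error_message.to_lowercase();
    quirks
        .overflow_patterns
        .iter()
        .any(|pattern| haystack.contains(&pattern.to_lowercase()))
}
```

A hit would then be reported as the unified context-overflow error rather than a generic invalid request, which is what lets the caller react to it specifically.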

Streaming responses are also completely different across the three APIs:

| Feature | Anthropic (SSE) | OpenAI (SSE) | Google (SSE) |
|---|---|---|---|
| Event Granularity | High: 6 event types (start/delta/stop × 2) | Low: each chunk is a complete delta | Medium: pushed by chunk, but structure is flat |
| Tool Call Increment | Fragmented send of input_json_delta | Single send of complete arguments string | Single send of complete functionCall |
| Termination Signal | message_stop event | data: [DONE] marker | Stream ends naturally |
| Need to Reassemble by Index | Yes (reassemble by index for multiple tool_use) | Yes | Yes |

All three formats are normalized to the same StreamEvent enum:

pub enum StreamEvent {
    MessageStart { id, model, usage },
    ContentBlockStart { index, content_block },
    TextDelta { text },
    ThinkingDelta { thinking },
    InputJsonDelta { index, partial_json },
    ContentBlockStop { index },
    MessageDelta { stop_reason, usage },
    MessageStop,
}
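
On the consuming side, fragments of tool input have to be stitched back together per content block. The sketch below shows one way to do that with a map keyed by index; `Event` is a two-variant stand-in for `StreamEvent` with assumed field types, and `ToolInputAssembler` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

/// Two-variant stand-in for `StreamEvent` (field types are assumptions).
pub enum Event {
    InputJsonDelta { index: usize, partial_json: String },
    ContentBlockStop { index: usize },
}

/// Accumulates `partial_json` fragments per content-block index.
#[derive(Default)]
pub struct ToolInputAssembler {
    buffers: BTreeMap<usize, String>,
}

impl ToolInputAssembler {
    /// Feeds one event. Returns `(index, complete_json)` when a block that
    /// received fragments is closed, and `None` otherwise.
    pub fn push(&mut self, event: Event) -> Option<(usize, String)> {
        match event {
            Event::InputJsonDelta { index, partial_json } => {
                self.buffers.entry(index).or_default().push_str(&partial_json);
                None
            }
            Event::ContentBlockStop { index } => {
                self.buffers.remove(&index).map(|json| (index, json))
            }
        }
    }
}
```

Keying by index is what keeps two tool calls that stream in parallel from being concatenated into one invalid JSON string.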

Each provider's error format is also different:

// Unified error types
pub enum ProviderError {
    Auth { ... },             // Authentication failure
    RateLimited { ... },      // Rate limiting
    ContextOverflow { ... },  // Context exceeds window (matched via ProviderQuirks)
    InvalidRequest { ... },   // Invalid request parameters
    ServerError { ... },      // Server error
    StreamError { ... },      // Stream interruption
    Other { ... },            // Unknown error
}

In the query loop, specific errors trigger specific recovery strategies:

RateLimited / Overloaded → Switch to fallback_model
ContextOverflow → Trigger auto_compact
StreamError (stall) → Retry (max 2 times, 45s timeout)
Auth → Unrecoverable, return error
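
The dispatch above can be written as a small pure function. This is an illustrative sketch: `ErrorKind`, `Recovery`, and `plan_recovery` are hypothetical names, the 45-second stall timeout is left out, and giving up on error kinds the list does not mention is an assumption.

```rust
/// Payload-free mirror of the unified error kinds (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorKind {
    Auth,
    RateLimited,
    ContextOverflow,
    InvalidRequest,
    ServerError,
    StreamError,
    Other,
}

/// What the query loop should do next (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Recovery {
    SwitchToFallbackModel,
    AutoCompact,
    Retry,
    GiveUp,
}

/// Retry cap for stalled streams, as stated in the list above.
pub const MAX_STREAM_RETRIES: u32 = 2;

/// Chooses a recovery strategy from the error kind and the retries used so far.
pub fn plan_recovery(kind: ErrorKind, stream_retries_so_far: u32) -> Recovery {
    match kind {
        ErrorKind::RateLimited => Recovery::SwitchToFallbackModel,
        ErrorKind::ContextOverflow => Recovery::AutoCompact,
        ErrorKind::StreamError if stream_retries_so_far < MAX_STREAM_RETRIES => Recovery::Retry,
        ErrorKind::Auth => Recovery::GiveUp,
        // Not covered by the list above; giving up is an assumption of this sketch.
        _ => Recovery::GiveUp,
    }
}
```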

BoxAgnts defines environment variable name mappings for each provider:

// boxagnts-workspace/src/config.rs
pub fn api_key_env_vars_for_provider(provider_id: &str) -> &'static [&'static str] {
    match provider_id {
        "anthropic" => &["ANTHROPIC_API_KEY"],
        "openai" => &["OPENAI_API_KEY"],
        "google" => &["GOOGLE_API_KEY", "GOOGLE_GENERATIVE_AI_API_KEY"],
        "deepseek" => &["DEEPSEEK_API_KEY"],
        "mistral" => &["MISTRAL_API_KEY"],
        "xai" => &["XAI_API_KEY"],
        "zhipu" => &["ZHIPU_API_KEY"],
        // ... 40+ provider environment variables
        _ => &[], // unknown provider: no environment variables to check
    }
}

Three-tier priority: Environment Variables > User Config JSON > No Default. This design supports different scenarios such as multi-tenancy, CI/CD, and local development.
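
The priority order can be sketched as a lookup that tries each environment variable in turn and only then falls back to the user config. `resolve_api_key` is a hypothetical helper; the environment is passed in as a closure so the logic can be exercised without touching the real process environment, and treating empty strings as missing is an assumption.

```rust
/// Resolves an API key with the priority: environment variables, then the
/// user config value, then nothing.
pub fn resolve_api_key<F>(
    env_var_names: &[&str],
    env_lookup: F,
    config_key: Option<&str>,
) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    // 1. The first non-empty environment variable wins.
    for &name in env_var_names {
        match env_lookup(name) {
            Some(value) if !value.is_empty() => return Some(value),
            _ => {}
        }
    }
    // 2. Otherwise use the key from the user config, if present.
    // 3. Otherwise there is no default.
    config_key.filter(|key| !key.is_empty()).map(str::to_string)
}
```

In real use the closure would be `|name| std::env::var(name).ok()`, with the name list coming from `api_key_env_vars_for_provider`.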

BoxAgnts' model abstraction layer solves the essential problem of "one set of code adapting to all APIs":

┌───────────────────────────────────────────────────────────┐
│  boxagnts-query (Agent reasoning loop)                    │
│  Only uses ProviderRequest / ProviderResponse             │
└─────────────────────────────┬─────────────────────────────┘
                              │
┌─────────────────────────────▼─────────────────────────────┐
│  LlmProvider trait                                        │
│  + ProviderRegistry (40+ providers)                       │
├──────────────┬──────────────┬──────────────┬──────────────┤
│ Anthropic    │ OpenAI       │ Google       │ OpenAiCompat │
│ Provider     │ Provider     │ Provider     │ (30+ vendors)│
│ (Near-zero   │ (Full format │ (Independent │ (Shares      │
│ conversion)  │ conversion)  │ format       │ OpenAI       │
│              │              │ conversion)  │ conversion   │
│              │              │              │ + Quirks)    │
└──────────────┴──────────────┴──────────────┴──────────────┘

Key capabilities:

Switch models with the --model parameter
run_query_loop() has no idea what's underneath

This is not a simple "adapter pattern"; it's a production-grade abstraction validated against 40+ real-world APIs.
