BoxAgnts' middle layer — the Agent Toolbox — is the brain and hands of the system. It consists of six core modules responsible for three things: understanding your intent, dispatching the right tools, and feeding back execution results. This article takes a deep dive into the architectural design and key implementations of each module.
What happens when you type "Help me analyze the code structure of this Rust project" in the Dashboard and hit send?
```text
User Message
     │
     ▼
┌────────────────────────────────────────────────────────────────────────────┐
│ boxagnts-api        Unified API Abstraction Layer                          │
│ LlmProvider trait → 20+ Providers → Message Normalization                  │
├────────────────────────────────────────────────────────────────────────────┤
│ boxagnts-query      Agent Query Loop                                       │
│ run_query_loop() → Multi-turn Conversation → Tool Dispatch → Auto Recovery │
├────────────────────────────────────────────────────────────────────────────┤
│ boxagnts-tools + tools-manager + wasm-tools                                │
│ Tool trait → Built-in Tools + WASM Tools → Execution                       │
├────────────────────────────────────────────────────────────────────────────┤
│ boxagnts-gateway    Gateway & Scheduling                                   │
│ Cron Scheduler + Site Hosting                                              │
├────────────────────────────────────────────────────────────────────────────┤
│ boxagnts-workspace  Memory & Configuration                                 │
│ SQLite + JSON Config + Conversation History                                │
└────────────────────────────────────────────────────────────────────────────┘
```
Let's break down each one.
This is the interface layer between the middle layer and the external AI world. It solves the most painful problem in AI tool development: every model provider's API is different, but your code should not pay the price for that.
`LlmProvider` Trait: The Foundation of Polymorphism

The core interface that all provider adapters must implement:
```rust
#[async_trait]
pub trait LlmProvider: Send + Sync {
    fn id(&self) -> &ProviderId; // Unique identifier "anthropic", "openai"
    fn name(&self) -> &str;      // Human-readable name

    // Non-streaming request
    async fn create_message(&self, request: ProviderRequest)
        -> Result<ProviderResponse, ProviderError>;

    // Streaming request (returns Pin<Box<dyn Stream>>)
    async fn create_message_stream(&self, request: ProviderRequest)
        -> Result<Pin<Box<dyn Stream<Item = Result<StreamEvent, ProviderError>> + Send>>, ProviderError>;

    // List available models
    async fn list_models(&self) -> Result<Vec<ModelInfo>, ProviderError>;
}
```
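This trait design has several elegant aspects, including:
- It uses the `async_trait` macro, which keeps it compatible with the Tokio async runtime
- Every fallible method reports failures through the shared `ProviderError` type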
BoxAgnts supports an extremely wide range of model providers:
| Category | Providers | Implementation File |
|---|---|---|
| International Mainstream | OpenAI, Anthropic, Google, Azure, Bedrock | Individual files |
| Open-Source Compatible | Deepseek, Mistral, Groq, TogetherAI, Fireworks | openai_compat.rs |
| Enterprise Services | Copilot, CodeX, Cohere, Perplexity | Individual files |
| Domestic Platforms | MiniMax, Alibaba Cloud (Qwen), Zhipu, Moonshot, SiliconFlow | Individual files |
| Others | Venus, Nebius, Novita, OVHCloud | Individual files |
Key design pattern — Provider + Transformer dual-layer architecture:
```text
Raw User Message
        │
        ▼
┌────────────────┐
│  Transformer   │ ← Converts internal message format to provider-specific format
│ (per-provider) │
└───────┬────────┘
        ▼
┌────────────────┐
│    Provider    │ ← Handles authentication, HTTP requests, stream parsing
│ (per-provider) │
└───────┬────────┘
        ▼
   AI Response
        │
        ▼
┌────────────────┐
│  Transformer   │ ← Converts provider response back to internal unified format
└────────────────┘
```
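To make the two layers concrete, here is a minimal sketch of the split. The names `InternalMessage`, `Transformer`, `Provider`, `PipeTransformer`, and `EchoProvider` are hypothetical, the traits are synchronous for brevity, and the toy `role|text` wire format stands in for a real provider's JSON body. None of this is taken from the BoxAgnts source.

```rust
// Hypothetical, simplified types: the real BoxAgnts message and request
// types are not shown in the article.
#[derive(Debug, Clone, PartialEq)]
pub struct InternalMessage {
    pub role: String,
    pub text: String,
}

/// Converts between the internal format and one provider's wire format.
pub trait Transformer {
    fn to_wire(&self, msg: &InternalMessage) -> String;
    fn from_wire(&self, raw: &str) -> InternalMessage;
}

/// Owns transport concerns (auth, HTTP, stream parsing in a real system).
pub trait Provider {
    fn send(&self, wire_request: &str) -> String;
}

/// Toy wire format "role|text", standing in for a provider-specific body.
pub struct PipeTransformer;

impl Transformer for PipeTransformer {
    fn to_wire(&self, msg: &InternalMessage) -> String {
        format!("{}|{}", msg.role, msg.text)
    }

    fn from_wire(&self, raw: &str) -> InternalMessage {
        let (role, text) = raw.split_once('|').unwrap_or(("assistant", raw));
        InternalMessage { role: role.to_string(), text: text.to_string() }
    }
}

/// Stub provider that echoes the user's text back as an assistant reply.
pub struct EchoProvider;

impl Provider for EchoProvider {
    fn send(&self, wire_request: &str) -> String {
        let text = wire_request
            .split_once('|')
            .map(|(_, t)| t)
            .unwrap_or(wire_request);
        format!("assistant|echo: {}", text)
    }
}

/// Transformer -> Provider -> Transformer, mirroring the diagram.
pub fn round_trip(
    t: &dyn Transformer,
    p: &dyn Provider,
    msg: &InternalMessage,
) -> InternalMessage {
    t.from_wire(&p.send(&t.to_wire(msg)))
}
```

Because the query loop only ever sees `InternalMessage`, swapping providers in this sketch means swapping the two trait objects passed to `round_trip`, and nothing else.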
`QueryConfig` contains a `provider_registry` field that allows dynamic provider selection at runtime. This means you can:
- Use `fallback_model` to automatically switch to a backup model when the primary model is overloaded
- `ModelRegistry`
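Runtime provider selection can be pictured as a lookup table keyed by provider ID. The sketch below is an illustration only: `ProviderRegistry`, `StubProvider`, and the `provider/model` naming scheme used by `resolve` are assumptions, not the actual shape of BoxAgnts' `provider_registry`.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a provider; the real `LlmProvider` trait is async.
pub trait Provider {
    fn name(&self) -> &str;
}

pub struct StubProvider(pub &'static str);

impl Provider for StubProvider {
    fn name(&self) -> &str {
        self.0
    }
}

/// Hypothetical registry mapping provider IDs to implementations.
#[derive(Default)]
pub struct ProviderRegistry {
    providers: HashMap<String, Box<dyn Provider>>,
}

impl ProviderRegistry {
    pub fn register(&mut self, id: &str, provider: Box<dyn Provider>) {
        self.providers.insert(id.to_string(), provider);
    }

    /// Resolves a "provider/model" string (an assumed naming scheme)
    /// to the provider registered under the part before the slash.
    pub fn resolve(&self, model: &str) -> Option<&dyn Provider> {
        let id = model.split('/').next()?;
        self.providers.get(id).map(|p| &**p)
    }
}
```

With a table like this, switching to a backup model is just a second `resolve` call with a different model string.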
BoxAgnts predefines environment variable mappings for each provider:
```rust
pub fn api_key_env_vars_for_provider(provider_id: &str) -> &'static [&'static str] {
    match provider_id {
        "anthropic" => &["ANTHROPIC_API_KEY"],
        "openai" => &["OPENAI_API_KEY"],
        "deepseek" => &["DEEPSEEK_API_KEY"],
        "zhipu" => &["ZHIPU_API_KEY"],
        "minimax" => &["MINIMAX_API_KEY"],
        // ... 30+ providers
        _ => &[], // assumed fallback for unknown providers; the excerpt elides the remaining arms
    }
}
```
This means you can inject API keys through three methods — environment variables, configuration files, or the Dashboard UI — maximizing flexibility while maintaining security boundaries.
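The storage table later in this article lists a three-tier priority for keys: ENV > Config > Default. A minimal sketch of that resolution order could look like the following. The function `resolve_api_key` and its parameters are hypothetical. The environment lookup is passed in as a closure so the logic can be exercised without touching the real process environment.

```rust
use std::collections::HashMap;

/// Resolves an API key with the priority ENV > Config > Default.
///
/// * `env_vars`: candidate variable names, e.g. the slice returned by
///   `api_key_env_vars_for_provider`.
/// * `env`: environment lookup; in production this could be
///   `|k| std::env::var(k).ok()`.
/// * `config`: keys loaded from the JSON config, indexed by provider ID
///   (an assumed layout).
pub fn resolve_api_key(
    env_vars: &[&str],
    env: impl Fn(&str) -> Option<String>,
    config: &HashMap<String, String>,
    provider_id: &str,
    default: Option<&str>,
) -> Option<String> {
    env_vars
        .iter()
        // First non-empty environment variable wins.
        .find_map(|&name| env(name).filter(|v| !v.is_empty()))
        // Otherwise fall back to the config file entry.
        .or_else(|| config.get(provider_id).cloned())
        // Finally fall back to the default, if any.
        .or_else(|| default.map(str::to_string))
}
```

Keeping the lookup order in one function means the Dashboard, the CLI, and scheduled jobs would all agree on which key is in effect.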
This layer is the absolute soul of BoxAgnts. The `run_query_loop()` function implements the complete Agent reasoning loop in about 300 lines of code, yet handles a remarkable number of edge cases.
```rust
loop {
    turn += 1;

    // 0. Check cancellation signal
    if cancel_token.is_cancelled() { return Cancelled; }

    // 1. Check max turns limit
    if turn > effective_max_turns { return EndTurn; }

    // 2. Inject pending user messages (multimodal interaction)
    if let Some(queue) = pending_messages.as_deref_mut() {
        for text in queue.drain(..) { /* append as user message */ }
    }

    // 3. Auto context compaction
    compact_state.maybe_compact(messages, config);

    // 4. Build API request
    let request = build_request(messages, tools, config);

    // 5. Send to AI model (supports streaming)
    let response = client.create_message_stream(request).await;

    // 6. Parse ContentBlocks from response
    for block in response.content {
        match block {
            ContentBlock::Text { text } => { /* accumulate text response */ }
            ContentBlock::ToolUse { name, input, .. } => {
                // Match and execute tool
                let tool = find_tool(name);
                let result = tool.execute(input, tool_ctx).await;
                messages.push(tool_result); // Inject result into conversation
            }
            ContentBlock::Thinking { thinking, .. } => {
                // Handle deep thinking content (not shown to user)
            }
        }
    }

    // 7. If model ends → return final message
    if stop_reason == "end_turn" { return EndTurn; }
}
```
When a single response hits the model's output token limit, the query loop does not simply return a truncated result. Instead, it automatically sends a carefully designed recovery message:
"Output token limit hit. Resume directly — no apology, no recap of what
you were doing. Pick up mid-thought if that is where the cut happened.
Break remaining work into smaller pieces."
This message is remarkably restrained in design: "no apology, no recap, pick up from the cut, break down tasks" conveys maximum instruction with minimum tokens. The loop retries up to 3 times (`MAX_TOKENS_RECOVERY_LIMIT = 3`) to avoid infinite loops.
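The recovery behavior can be sketched in isolation. In the version below the model is a plain closure and the conversation is a `Vec<String>`. `Reply`, `StopReason`, and `run_with_recovery` are hypothetical names chosen for illustration. Only the constant and the prompt text come from the article.

```rust
pub const MAX_TOKENS_RECOVERY_LIMIT: u32 = 3;

pub const RECOVERY_PROMPT: &str = "Output token limit hit. Resume directly — no apology, no recap of what \
you were doing. Pick up mid-thought if that is where the cut happened. \
Break remaining work into smaller pieces.";

#[derive(Debug, Clone, PartialEq)]
pub enum StopReason {
    EndTurn,
    MaxTokens,
}

pub struct Reply {
    pub text: String,
    pub stop: StopReason,
}

/// Calls `model` until it finishes or the recovery budget is used up.
/// Returns the concatenated output and whether the model finished cleanly.
pub fn run_with_recovery(
    mut model: impl FnMut(&[String]) -> Reply,
    messages: &mut Vec<String>,
) -> (String, bool) {
    let mut output = String::new();
    let mut recoveries = 0;
    loop {
        let reply = model(messages.as_slice());
        output.push_str(&reply.text);
        messages.push(reply.text);

        if reply.stop == StopReason::EndTurn {
            return (output, true);
        }
        if recoveries >= MAX_TOKENS_RECOVERY_LIMIT {
            // Give up with a partial result instead of looping forever.
            return (output, false);
        }
        recoveries += 1;
        messages.push(RECOVERY_PROMPT.to_string());
    }
}
```

The partial text is kept in the history before the recovery prompt is appended, which is what lets the model "pick up mid-thought" on the next call.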
`compact.rs` implements an intelligent compression strategy. When conversation history approaches the model's context window limit, it summarizes early messages, preserving key information (file paths, error messages, important decisions) while discarding redundant intermediate steps. This strategy ensures that even extremely complex multi-turn tasks (such as refactoring an entire codebase) won't cause the Agent to "lose its memory" due to context overflow.
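As a rough illustration of the idea, the sketch below replaces all but the most recent messages with a single summary once an estimated token count crosses a threshold. The 80% trigger, the four-characters-per-token estimate, and the `summarize` callback are all assumptions. The article does not say how `compact.rs` measures size or produces summaries.

```rust
/// Rough token estimate: about 4 characters per token. This is a common
/// heuristic, not necessarily what BoxAgnts uses.
pub fn estimate_tokens(messages: &[String]) -> usize {
    messages.iter().map(|m| m.chars().count() / 4 + 1).sum()
}

/// If the history exceeds 80% of `context_window` (assumed trigger), replaces
/// everything except the last `keep_recent` messages with one summary message.
/// Returns `true` when compaction happened.
pub fn maybe_compact(
    messages: &mut Vec<String>,
    context_window: usize,
    keep_recent: usize,
    summarize: impl Fn(&[String]) -> String,
) -> bool {
    let trigger = context_window * 8 / 10;
    if messages.len() <= keep_recent || estimate_tokens(messages) <= trigger {
        return false;
    }
    let split = messages.len() - keep_recent;
    // In a real agent this would be an LLM call that keeps file paths,
    // error messages, and key decisions.
    let summary = summarize(&messages[..split]);
    messages.drain(..split);
    messages.insert(0, summary);
    true
}
```

Keeping the most recent messages verbatim matters because the model's next step usually depends on the latest tool results, not on a paraphrase of them.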
```rust
// query.rs: auto switch to backup model on overload errors
if is_overloaded_error(&err) && fallback_model.is_some() && !used_fallback {
    effective_model = fallback_model;
    used_fallback = true;
    continue; // Retry with backup model
}
```
When the primary model (e.g., Claude Sonnet) returns an overload error during high-load periods, the system automatically switches to a backup model (e.g., Deepseek), ensuring tasks are not interrupted. This mechanism is completely transparent to the user.
```rust
pub enum QueryOutcome {
    BudgetExceeded { cost_usd: f64, limit_usd: f64 },
    // ...
}
```
After each turn, the query loop checks whether the accumulated cost exceeds the budget cap. Every API call is tracked via `CostTracker`, which records the model and token consumption so that costs stay controllable. Budget overruns return clear error messages rather than silently overspending.
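A small sketch shows how such a per-turn check might work. The `CostTracker` below is a hypothetical stand-in, not the BoxAgnts type. Prices are passed in by the caller as USD per million tokens, because the article gives no real per-model prices.

```rust
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub enum BudgetCheck {
    WithinBudget,
    BudgetExceeded { cost_usd: f64, limit_usd: f64 },
}

/// Hypothetical tracker that accumulates cost across API calls.
#[derive(Default)]
pub struct CostTracker {
    total_usd: Mutex<f64>,
}

impl CostTracker {
    /// Records one API call. Prices are in USD per million tokens.
    pub fn record(
        &self,
        input_tokens: u64,
        output_tokens: u64,
        usd_per_mtok_in: f64,
        usd_per_mtok_out: f64,
    ) {
        let cost = input_tokens as f64 / 1e6 * usd_per_mtok_in
            + output_tokens as f64 / 1e6 * usd_per_mtok_out;
        *self.total_usd.lock().unwrap() += cost;
    }

    pub fn total_usd(&self) -> f64 {
        *self.total_usd.lock().unwrap()
    }

    /// Called after each turn; `None` means no budget cap is configured.
    pub fn check(&self, limit_usd: Option<f64>) -> BudgetCheck {
        let cost_usd = self.total_usd();
        match limit_usd {
            Some(limit_usd) if cost_usd > limit_usd => {
                BudgetCheck::BudgetExceeded { cost_usd, limit_usd }
            }
            _ => BudgetCheck::WithinBudget,
        }
    }
}
```

Returning both `cost_usd` and `limit_usd` in the error variant is what makes the resulting message actionable: the user sees how far over the cap the run went.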
The `ContentBlock` enum defines 14 content types, covering the full spectrum of interactions from plain text to deep thinking:
```rust
pub enum ContentBlock {
    Text { text: String },                          // Plain text
    Image { source: ImageSource },                  // Image
    ToolUse { id, name, input },                    // Tool call
    ToolResult { tool_use_id, content, is_error },  // Tool result
    Thinking { thinking, signature },               // Deep thinking
    Document { source, title, context },            // Document reference
    UserLocalCommandOutput { command, output },     // Shell command output
    UserCommand { name, args },                     // User command
    UserMemoryInput { key, value },                 // User memory
    SystemAPIError { message, retry_secs },         // API error
    CollapsedReadSearch { tool_name, paths },       // Collapsed search results
    TaskAssignment { id, subject, description },    // Sub-task assignment
    // ...
}
```
This fine-grained content typing allows the frontend to render each type with specialized treatment: error blocks show red borders, task assignment blocks show cyan borders, and collapsed search results are displayed as single-line summaries.
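The mapping from block type to presentation is essentially one exhaustive `match`. The sketch below uses a trimmed-down `ContentBlock` with assumed field types and a hypothetical `RenderHint` struct. The real frontend is not shown in the article, so treat this purely as an illustration of the dispatch logic.

```rust
// Trimmed-down enum with assumed field types; the article's excerpt
// omits them.
#[derive(Debug, Clone)]
pub enum ContentBlock {
    Text { text: String },
    SystemAPIError { message: String, retry_secs: Option<u64> },
    TaskAssignment { id: String, subject: String, description: String },
    CollapsedReadSearch { tool_name: String, paths: Vec<String> },
}

/// Hypothetical rendering hint handed to the UI layer.
#[derive(Debug, PartialEq)]
pub struct RenderHint {
    pub border: Option<&'static str>,
    pub summary: String,
}

pub fn render_hint(block: &ContentBlock) -> RenderHint {
    match block {
        ContentBlock::Text { text } => RenderHint {
            border: None,
            summary: text.clone(),
        },
        // Error blocks get a red border.
        ContentBlock::SystemAPIError { message, .. } => RenderHint {
            border: Some("red"),
            summary: message.clone(),
        },
        // Task assignment blocks get a cyan border.
        ContentBlock::TaskAssignment { subject, .. } => RenderHint {
            border: Some("cyan"),
            summary: subject.clone(),
        },
        // Collapsed search results become a single-line summary.
        ContentBlock::CollapsedReadSearch { tool_name, paths } => RenderHint {
            border: None,
            summary: format!("{}: {} paths", tool_name, paths.len()),
        },
    }
}
```

Because the `match` has no wildcard arm, adding a new variant forces a compile error here until its rendering is decided.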
This is one of the most stunning middle-layer designs in BoxAgnts. `managed_orchestrator.rs` implements a hierarchical Agent architecture:
```text
                 User
                   │
                   ▼
┌─────────────────────────────────────┐
│            Manager Agent            │ ← Uses strong model (e.g., Claude Opus)
│ Analyze tasks → Break down → Assign │
└──────────────────┬──────────────────┘
                   │
         ┌─────────┼─────────┐
         ▼         ▼         ▼
    ┌────────┐┌────────┐┌────────┐
    │Executor││Executor││Executor│ ← Uses economical model (e.g., Claude Sonnet/Deepseek)
    │Subtask1││Subtask2││Subtask3│
    └────┬───┘└────┬───┘└────┬───┘
         │         │         │
         └─────────┼─────────┘
                   ▼
      Manager aggregates results
                   │
                   ▼
             Final Output
```
```rust
pub struct ManagedAgentConfig {
    pub enabled: bool,
    pub manager_model: String,          // Manager model (e.g., "claude-opus-4-6")
    pub executor_model: String,         // Executor model (e.g., "claude-sonnet-4-6")
    pub executor_max_turns: u32,        // Max turns per executor
    pub max_concurrent_executors: u32,  // Max parallel executors
    pub total_budget_usd: Option<f64>,  // Total budget cap
    pub executor_isolation: bool,       // Whether to isolate Git worktrees
}
```
The Manager Agent's system prompt precisely defines its role:
You are the MANAGER, the planning and reasoning layer.
You coordinate work but do NOT execute tasks using file/bash tools directly.
All implementation work is delegated to executor agents (via the Agent tool).
Each executor uses {executor_model}, with a maximum of {max_turns} turns.
You may run up to {max_concurrent} executors in parallel.
The Executor's prompt requires "complete self-containment": executors cannot see the Manager's conversation history, so the Manager must include all necessary context in each executor's prompt. This avoids context leakage and reduces token consumption.
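The fan-out step can be sketched with standard-library threads. The real orchestrator is presumably async and runs full agent loops. Here an executor is just a closure, concurrency is bounded by running batches of at most `max_concurrent` threads, and `executor_prompt` and `run_executors` are hypothetical helpers.

```rust
use std::thread;

/// Builds a self-contained executor prompt: the executor cannot see the
/// Manager's history, so everything it needs is spelled out here.
pub fn executor_prompt(goal: &str, context: &str, subtask: &str) -> String {
    format!(
        "Overall goal: {}\nRelevant context: {}\nYour subtask: {}",
        goal, context, subtask
    )
}

/// Runs `subtasks` in batches of at most `max_concurrent` threads and
/// returns the results in the original order.
pub fn run_executors<F>(subtasks: &[String], max_concurrent: usize, executor: F) -> Vec<String>
where
    F: Fn(&str) -> String + Sync,
{
    let executor = &executor;
    let mut results = Vec::with_capacity(subtasks.len());
    for batch in subtasks.chunks(max_concurrent.max(1)) {
        let batch_results: Vec<String> = thread::scope(|scope| {
            // Spawn the whole batch first so its members run in parallel.
            let handles: Vec<_> = batch
                .iter()
                .map(|task| scope.spawn(move || executor(task.as_str())))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("executor thread panicked"))
                .collect()
        });
        results.extend(batch_results);
    }
    results
}
```

Batching is the simplest way to respect a concurrency cap. A production scheduler would more likely use a semaphore so a slow executor does not hold up the next batch.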
This is the most critical interface definition in all of BoxAgnts. Every new tool only needs to implement this trait:
```rust
#[async_trait]
pub trait Tool: Send + Sync {
    fn name(&self) -> &'static str;
    fn description(&self) -> &'static str;
    fn input_schema(&self) -> Value; // JSON Schema defining parameters
    async fn execute(&self, input: Value, ctx: &ToolContext) -> ToolResult;
}

pub struct ToolContext {
    pub cost_tracker: Arc<CostTracker>,       // Cost tracker
    pub session_id: Option<String>,           // Session ID
    pub current_turn: Arc<AtomicUsize>,       // Current turn
    pub non_interactive: bool,                // Non-interactive mode
    pub config: Config,                       // Global configuration
    pub managed_agent_config: Option<ManagedAgentConfig>,
    pub allowed_outbound_hosts: Vec<String>,  // Outbound network whitelist
    pub block_url: Option<String>,            // Blocked URLs
}
```
`ToolContext` is the tool's "passport": it carries contextual information such as permissions, sessions, costs, and networking. Every tool can access the system state it needs through it during execution.
```rust
// tools-manager/src/lib.rs
pub fn all_tools() -> Vec<Box<dyn Tool>> {
    vec![
        // Rust native tools
        Box::new(AskUserQuestionTool),
        Box::new(BriefTool),
        Box::new(EnterPlanModeTool),
        Box::new(ExitPlanModeTool),
        Box::new(SleepTool),
        Box::new(SkillTool),
        Box::new(ToolSearchTool),
        // WASM sandbox tools: same interface, different implementation
        Box::new(WasmTool::new("read", "file-read-component.wasm", ...)),
        Box::new(WasmTool::new("write", "file-write-component.wasm", ...)),
        Box::new(WasmTool::new("edit", "file-edit-component.wasm", ...)),
        Box::new(WasmTool::new("glob", "file-glob-component.wasm", ...)),
        Box::new(WasmTool::new("bash", "bash-component.wasm", ...)),
        Box::new(WasmTool::new("web_fetch", "web-fetch-component.wasm", ...)),
        Box::new(WasmTool::new("js_exec", "boxedjs-execute-component.wasm", ...)),
    ]
}
```
Notice that Rust native tools and WASM tools are placed in the same `Vec<Box<dyn Tool>>`: to the AI model, they are completely equivalent. This is the power of interface-oriented programming.
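Dispatch over such a list can be sketched in a few lines. The version below is deliberately simplified: the trait is synchronous and takes a `&str` instead of a JSON `Value` and a `ToolContext`. `EchoTool`, `ReverseTool`, and `dispatch` are hypothetical stand-ins, not BoxAgnts code.

```rust
/// Simplified, synchronous stand-in for the article's `Tool` trait.
pub trait Tool: Send + Sync {
    fn name(&self) -> &'static str;
    fn execute(&self, input: &str) -> Result<String, String>;
}

/// Stands in for a Rust-native tool.
pub struct EchoTool;

impl Tool for EchoTool {
    fn name(&self) -> &'static str {
        "echo"
    }
    fn execute(&self, input: &str) -> Result<String, String> {
        Ok(input.to_string())
    }
}

/// Stands in for a WASM-backed tool: same interface, different internals.
pub struct ReverseTool;

impl Tool for ReverseTool {
    fn name(&self) -> &'static str {
        "reverse"
    }
    fn execute(&self, input: &str) -> Result<String, String> {
        Ok(input.chars().rev().collect())
    }
}

pub fn all_tools() -> Vec<Box<dyn Tool>> {
    vec![Box::new(EchoTool), Box::new(ReverseTool)]
}

/// Finds a tool by the name the model asked for and runs it.
pub fn dispatch(tools: &[Box<dyn Tool>], name: &str, input: &str) -> Result<String, String> {
    tools
        .iter()
        .find(|t| t.name() == name)
        .ok_or_else(|| format!("unknown tool: {}", name))?
        .execute(input)
}
```

The caller never learns which kind of tool it invoked, which is the property the registry relies on.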
`cron/scheduler.rs` builds a complete scheduled task system based on `tokio_cron_scheduler`:
```rust
// Core scheduling logic
let cron_job = Job::new_async(cron_expr, move |_uuid, _lock| {
    Box::pin(async move {
        let handle = job::execute(prompt, model).await;
        // Execution with timeout + result logging
        let result = timeout(Duration::from_secs(timeout), fut).await;
        append_execution_log(job_id, job_name, success, message).await;
    })
});
```
Key features:
- Execution timeouts enforced with `tokio::time::timeout`
- Cancellation via `CancellationToken`
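The timeout idea can be illustrated without an async runtime. The sketch below is a standard-library analogue, not the tokio-based code the scheduler uses: the job runs on a worker thread and the caller waits on a channel with a deadline. `run_with_timeout` is a hypothetical helper. Note one difference from `tokio::time::timeout`: a timed-out thread keeps running in the background, whereas tokio drops the timed-out future.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `job` on a worker thread and waits at most `limit` for its result.
pub fn run_with_timeout<T, F>(job: F, limit: Duration) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already have given up, so ignore send errors.
        let _ = tx.send(job());
    });
    rx.recv_timeout(limit).map_err(|e| match e {
        mpsc::RecvTimeoutError::Timeout => format!("job timed out after {:?}", limit),
        // The sender was dropped without sending: the job panicked.
        mpsc::RecvTimeoutError::Disconnected => "job panicked".to_string(),
    })
}
```

Either outcome, success or timeout, yields a value that can be written to the execution log, so a hung job never blocks the scheduler's next tick.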
Site data managed by `site/store.rs` is persisted via SQLite, supporting CRUD operations. Combined with the frontend SitesPage, users can:
- Access each hosted site under its `/sites/{name}/` path

The workspace module handles all persistence and configuration management:
| Function | Storage | Key Implementation |
|---|---|---|
| Conversation History | SQLite (rusqlite) | Organized by session, supports CRUD |
| User Authentication | Password hash storage | Verified for remote access |
| Global Configuration | JSON file | `Settings::load()` to load |
| API Keys | Environment variables / JSON | Three-tier priority: ENV > Config > Default |
| AGENTS.md | Filesystem | Injected into system prompt each conversation |
| Cron Tasks | SQLite | Persisted storage + loaded at startup |
| Site Config | SQLite | Persisted storage + loaded at startup |
Design highlight: configuration and state are separated. Configuration is JSON files (human-readable and editable), state is SQLite (efficient queries and transactions). This distinction avoids the common pitfall of "configuration file bloat."
`QueryConfig` is a massive configuration struct with 20 fields, covering every dimension of an Agent query:
```rust
pub struct QueryConfig {
    pub model: String,                               // Model name
    pub max_tokens: u32,                             // Max output tokens
    pub max_turns: u32,                              // Max reasoning turns
    pub system_prompt: Option<String>,               // System prompt
    pub thinking_budget: Option<u32>,                // Thinking budget (deep reasoning)
    pub temperature: Option<f32>,                    // Temperature parameter
    pub tool_result_budget: usize,                   // Total char cap for tool results (50000)
    pub effort_level: Option<EffortLevel>,           // Effort level (affects thinking_budget)
    pub max_budget_usd: Option<f64>,                 // USD budget cap
    pub fallback_model: Option<String>,              // Backup model
    pub agent_definition: Option<AgentDefinition>,   // Agent definition
    pub managed_agents: Option<ManagedAgentConfig>,  // Managed mode
    pub output_style: OutputStyle,                   // Output style
    // ... and more
}
```
This struct demonstrates a core design philosophy of BoxAgnts: give control to the user, but provide reasonable defaults. Every field can be overridden, but none are required — defaults cover 90% of use cases.
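In Rust, "every field can be overridden, but none are required" usually translates to a `Default` impl plus struct update syntax. The sketch below uses a reduced version of the struct. Apart from `tool_result_budget` (50000 per the comment above), the default values are placeholders chosen for illustration, and `with_budget` is a hypothetical helper.

```rust
// Reduced version of the struct; the real one has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct QueryConfig {
    pub model: String,
    pub max_tokens: u32,
    pub max_turns: u32,
    pub temperature: Option<f32>,
    pub tool_result_budget: usize,
    pub max_budget_usd: Option<f64>,
    pub fallback_model: Option<String>,
}

impl Default for QueryConfig {
    fn default() -> Self {
        Self {
            model: "claude-sonnet-4-6".to_string(), // placeholder default
            max_tokens: 8192,                       // assumed value
            max_turns: 50,                          // assumed value
            temperature: None,
            tool_result_budget: 50_000, // matches the comment in the article
            max_budget_usd: None,
            fallback_model: None,
        }
    }
}

/// Overrides only the budget-related fields and keeps every other default.
pub fn with_budget(limit_usd: f64, fallback: &str) -> QueryConfig {
    QueryConfig {
        max_budget_usd: Some(limit_usd),
        fallback_model: Some(fallback.to_string()),
        ..Default::default()
    }
}
```

Callers state only what differs from the defaults, so adding a new field with a sensible default does not break existing call sites.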
The middle-layer Agent Toolbox is the capability core of BoxAgnts:
| Module | Responsibility | Key Highlight |
|---|---|---|
| boxagnts-api | Multi-model unified access | LlmProvider trait, 20+ Providers, Transformer conversion |
| boxagnts-query | Agent reasoning loop | Token recovery, context compaction, Fallback switching, budget control |
| managed_orchestrator | Managed Agent architecture | Manager-Executor layering, parallel execution, budget management |
| boxagnts-tools | Unified tool abstraction | Tool trait, ToolContext |
| tools-manager | Central tool registry | Rust native + WASM unified as `Vec<Box<dyn Tool>>` |
| boxagnts-gateway | Time and space extension | Cron scheduler, Site hosting |
| boxagnts-workspace | Memory system | SQLite + JSON dual-layer storage |