{"slug": "boxagnts-introduction-6-agent-multi-turn-conversation-and-tool-skill-invocation", "title": "BoxAgnts Introduction (6) — Agent Multi-Turn Conversation and Tool/Skill Invocation", "summary": "BoxAgnts implements a multi-turn agent conversation system that requires three API calls and two tool executions to complete a single user request, such as reading a config file and changing a port number. The system supports three pre-installed agent roles—build, plan, and explore—each with different permission levels and prompt characteristics that can override model selection and maximum turn limits. The core `run_query_loop()` function manages conversation history, tool execution within WASM sandboxes, streaming push, and context management across multiple API interactions.", "body_md": "If you've only chatted with ChatGPT, you might think an AI Agent is simply \"send a prompt to the API, display the response.\"\n\nThe reality is far more complex. Here is a complete Agent interaction flow in BoxAgnts:\n\n```\nUser input: \"Help me read config.toml and change port to 9090\"\n\n1. User message added to conversation history\n2. Build system prompt (tool list + skill list + AGENTS.md + Agent role definition)\n3. Call LLM API → stream receive response\n4. AI decides to call tool: tool_use(\"read\", {path: \"config.toml\"})\n5. Execute read tool (within WASM sandbox)\n6. Tool result injected into conversation history\n7. Call API again → AI analyzes config\n8. AI decides to call tool: tool_use(\"edit\", {path: \"config.toml\", old: \"port = 8080\", new: \"port = 9090\"})\n9. Execute edit tool\n10. Tool result injected into conversation\n11. Call API again → AI responds: \"Port has been changed from 8080 to 9090\"\n12. end_turn → Conversation ends\n```\n\nThis process involves 3 API calls, 2 tool executions, streaming push, and context management. 
This article dissects the design and implementation of each step.\n\nBefore starting the reasoning loop, the Agent's \"role\" needs to be defined. BoxAgnts comes with three pre-installed Agents:\n\n```\n// boxagnts-workspace/src/config.rs\npub struct AgentDefinition {\n    pub description: Option<String>,    // Description\n    pub model: Option<String>,          // Model override\n    pub temperature: Option<f64>,       // Temperature override\n    pub prompt: Option<String>,         // System prompt prefix\n    pub access: String,                 // Permission: full / read-only / search-only\n    pub visible: bool,                  // Whether visible in @agent autocomplete\n    pub max_turns: Option<u32>,         // Max turns override\n    pub color: Option<String>,          // Terminal display color\n}\n```\n\nThe three pre-installed Agent roles:\n\n| Agent | Permission | Prompt Characteristics | Use Cases |\n|---|---|---|---|\n| `build` | full | \"You are the build agent. Focus on implementing...\" | Coding, modifying files |\n| `plan` | read-only | \"You are the plan agent. 
You can read files and analyze...\" | Code analysis, architecture design |\n| `explore` | search-only | \"Fast search-only agent for code exploration\" | Quick search, file location |\n\nThe `prompt` field in the Agent definition is injected at the very front of the system prompt when the query loop starts:\n\n```rust\n// boxagnts-query/src/query.rs\nif let Some(ref agent) = config.agent_definition {\n    if let Some(ref agent_prompt) = agent.prompt {\n        patched.system_prompt = Some(match &config.system_prompt {\n            Some(existing) => format!(\"{}\\n\\n{}\", agent_prompt, existing),\n            None => agent_prompt.clone(),\n        });\n    }\n}\n```\n\nAdditionally, the Agent can override the model and max turns:\n\n```rust\nlet effective_model = if let Some(ref agent) = config.agent_definition {\n    agent.model.clone().unwrap_or_else(|| config.model.clone())\n} else {\n    config.model.clone()\n};\n\nlet effective_max_turns = config.agent_definition\n    .as_ref()\n    .and_then(|a| a.max_turns)\n    .unwrap_or(config.max_turns);\n```\n\nThis means users can use Agent definitions to implement \"different models and roles at different stages of the same session\"; for example, using a read-only slow-thinking model during the planning phase and a full-access fast model during the execution phase.\n\n`run_query_loop()` is the central function in BoxAgnts, located in the `boxagnts-query` crate:\n\n```\npub async fn run_query_loop(\n    client: &AnthropicClient,        // API client\n    messages: &mut Vec<Message>,     // Conversation history (mutable reference)\n    tools: &[Box<dyn Tool>],         // Tool collection\n    tool_ctx: &ToolContext,          // Tool execution context\n    config: &QueryConfig,            // Loop configuration\n    cost_tracker: Arc<CostTracker>,  // Cost tracking\n    event_tx: Option<mpsc::UnboundedSender<QueryEvent>>, // Event push\n    cancel_token: CancellationToken, // Cancellation signal\n    
pending_messages: Option<&mut Vec<String>>, // Pending message queue\n) -> QueryOutcome\n```\n\nThis function signature is itself an architectural document. Each parameter is a design decision:\n\n| Parameter | Design Intent |\n|---|---|\n| `client` | Single entry point, but internally switches among 20+ models via ProviderRegistry |\n| `messages: &mut Vec<Message>` | Directly modifies conversation history, appends content each iteration |\n| `tools: &[Box<dyn Tool>]` | Type-erased tool collection, AI calls by name |\n| `tool_ctx` | Carries work_dir, allowed_hosts and other sandbox config |\n| `event_tx` | Real-time push of per-turn status to Dashboard / TUI |\n| `cancel_token` | User can interrupt loop at any time |\n| `pending_messages` | Insert commands mid-execution (e.g., user sends new message during tool execution) |\n\n```\n┌─────────────────────────────────────────────┐\n│                  loop {                       │\n│                                               │\n│  ① Check termination conditions               │\n│     · turn > max_turns ? → EndTurn           │\n│     · cancel_token ?    → Cancelled          │\n│     · budget exceeded?  
→ BudgetExceeded     │\n│                                               │\n│  ② Preprocess messages                       │\n│     · drain pending_messages queue           │\n│     · apply_tool_result_budget (truncate old results) │\n│     · auto_compact (context compression)      │\n│                                               │\n│  ③ Build system prompt + Call LLM API        │\n│     · Inject Agent definition / AGENTS.md    │\n│     · Build CreateMessageRequest             │\n│     · Stream receive StreamEvent              │\n│     · Accumulate text / thinking / tool_use blocks │\n│                                               │\n│  ④ Process response                          │\n│     · end_turn → return                       │\n│     · tool_use → parallel execute tools → inject results → continue │\n│     · max_tokens → resume conversation → continue │\n│                                               │\n│  ⑤ Error recovery                            │\n│     · overloaded → switch fallback model     │\n│     · stream stall → retry (max 2 times)      │\n│                                               │\n│  }                                            │\n└─────────────────────────────────────────────┘\n```\n\nBefore each API call, BoxAgnts builds a complete system prompt:\n\n```rust\nfn build_system_prompt(config: &QueryConfig) -> SystemPrompt {\n    let opts = SystemPromptOptions {\n        custom_system_prompt: config.system_prompt.clone(),     // User custom\n        append_system_prompt: config.append_system_prompt.clone(), // Appended content\n        output_style: config.output_style,                      // Output style\n        custom_output_style_prompt: config.output_style_prompt.clone(),\n        working_directory: config.working_directory.clone(),    // Current working directory\n        ..Default::default()\n    };\n\n    let text = boxagnts_core::system_prompt::build_system_prompt(&opts);\n    SystemPrompt::Text(text)\n}\n```\n\nThe system prompt 
structure is hierarchical:\n\n```\n┌──────────────────────────────────────┐\n│ Agent Role Definition (build/plan/explore) │  ← AgentDefinition.prompt\n├──────────────────────────────────────┤\n│ Core Capability Declaration           │\n│ · Available tool list (16+)           │  ← Dynamically generated from tools parameter\n│ · Skill list                          │  ← Discovered by SkillTool\n│ · Output format requirements          │\n│ · Security boundaries                 │\n├──────────────────────────────────────┤\n│ AGENTS.md content                     │  ← User project-level behavior spec\n├──────────────────────────────────────┤\n│ Dynamic Boundary Marker               │\n│ --- Above cached, below not cached ---│\n├──────────────────────────────────────┤\n│ Session-specific information          │  ← Current working directory, time, etc.\n└──────────────────────────────────────┘\n```\n\nThe `--- Above cached, below not cached ---` divider is a clever design: the Anthropic API supports prompt caching, so caching the portion above the divider can significantly reduce token costs per API call.\n\nWhen the AI's response hits the `max_tokens` limit, the model cuts off output midway. A normal API call ends here, but the Agent cannot stop.\n\nBoxAgnts' solution is clever:\n\n```rust\n// boxagnts-query/src/query.rs\nconst MAX_TOKENS_RECOVERY_LIMIT: u32 = 3;\n\nconst MAX_TOKENS_RECOVERY_MSG: &str =\n    \"Output token limit hit. Resume directly — no apology, no recap of what \\\n     you were doing. Pick up mid-thought if that is where the cut happened. \\\n     Break remaining work into smaller pieces.\";\n```\n\nWhen `stop_reason == \"max_tokens\"` is detected, the loop resumes the conversation with the recovery message (`MAX_TOKENS_RECOVERY_MSG`) and continues.\n\nThe wording of the prompt is worth noting, especially \"no apology, no recap\", because an LLM's instinctive reaction after being cut off is \"Sorry, I was interrupted, let me start over...\" This leads to useless output. 
This prompt directly forbids that pattern.\n\nAn LLM's context window is finite. As conversations grow longer and tool results pile up, there comes a moment when things no longer fit.\n\nBoxAgnts' response is automatic compaction. It triggers when the estimated token count reaches 90% of the context window:\n\n```rust\n// boxagnts-query/src/compact.rs\nconst AUTOCOMPACT_TRIGGER_FRACTION: f64 = 0.90;\nconst WARNING_PCT: f64 = 0.80;   // Warning at 80%\nconst CRITICAL_PCT: f64 = 0.95;  // Critical warning at 95%\n```\n\nThe core compaction strategy is to call another LLM to \"summarize\" the conversation history:\n\n```\nOriginal conversation (potentially thousands of messages)\n      │\n      ▼\nCompaction Prompt (NO_TOOLS_PREAMBLE → force summary mode)\n      │\n      ▼\nLLM generates structured summary:\n  · Primary Request and Intent\n  · Key Technical Concepts\n  · Files and Code Sections\n  · Errors and fixes\n  · Pending Tasks\n  · Current Work\n      │\n      ▼\nSummary replaces early conversation history, last 10 messages kept in original form\n```\n\nThe compaction prompt has a key design, `NO_TOOLS_PREAMBLE`:\n\n```\nCRITICAL: Respond with TEXT ONLY. Do NOT call any tools.\n- Do NOT use Read, Bash, Grep, Glob, Edit, Write, or ANY other tool.\n- You already have all the context you need in the conversation above.\n- Tool calls will be REJECTED and will waste your only turn.\n```\n\nIf the compacting LLM tries to call tools, the entire compaction is wasted. 
This preamble prevents such meta-recursion.\n\nWhen the LLM returns `stop_reason == \"tool_use\"`, the conversation enters the tool execution phase:\n\n```\n┌──────────────────────────────────────────────┐\n│  Phase 1: Sequential PreToolUse preprocessing │\n│  (Each tool block processed sequentially,     │\n│   can interrupt execution)                     │\n├──────────────────────────────────────────────┤\n│  Phase 2: Parallel execution of non-blocking   │\n│  tools                                         │\n│  join_all(futures) → all tools run concurrently │\n│  (Blocking tools return pre-computed error      │\n│   results)                                      │\n└──────────────────────────────────────────────┘\n```\n\nKey design point: **tool results are injected in user message format**. This leverages LLM message role semantics: the Assistant initiated the tool call, and the User (i.e., the system acting on behalf of the user) returned the tool result. The model understands this as \"the user answered your request\" and naturally proceeds to the next round of reasoning.\n\n```\n// boxagnts-query/src/lib.rs\nasync fn execute_tool(\n    name: &str,\n    input: &Value,\n    tools: &[Box<dyn Tool>],\n    ctx: &ToolContext,\n) -> ToolResult {\n    let tool = tools.iter().find(|t| t.name() == name);\n\n    match tool {\n        Some(tool) => {\n            debug!(tool = name, \"Executing tool\");\n            tool.execute(input.clone(), ctx).await\n        }\n        None => {\n            warn!(tool = name, \"Unknown tool requested\");\n            ToolResult::error(format!(\"Unknown tool: {}\", name))\n        }\n    }\n}\n```\n\nThe implementation is extremely simple: a linear search by tool name. The `tools` vector typically has only a dozen elements, so the linear search overhead is negligible. 
Simplicity is more reliable than complexity.\n\nWhen task complexity exceeds a single Agent's capacity, BoxAgnts provides Managed Agent mode:\n\n```\n                    ┌──────────────────┐\n                    │  Manager Agent   │\n                    │  (Strong model    │\n                    │   like Opus)      │\n                    │  Plans and        │\n                    │  assigns only     │\n                    └────────┬─────────┘\n                             │\n              ┌──────────────┼──────────────┐\n              ▼              ▼              ▼\n        ┌──────────┐  ┌──────────┐  ┌──────────┐\n        │ Executor │  │ Executor │  │ Executor │\n        │ (Sonnet)  │  │ (Sonnet)  │  │ (Sonnet)  │\n        │ Subtask A│  │ Subtask B│  │ Subtask C│\n        └──────────┘  └──────────┘  └──────────┘\n            Parallel execution, each with independent context\n```\n\nThe Manager's system prompt is injected with managed mode instructions:\n\n```rust\npub fn managed_agent_system_prompt(config: &ManagedAgentConfig) -> String {\n    format!(r#\"\n## Managed Agent Mode\n\nYou are the MANAGER in a manager-executor architecture.\n\n### Your Role\n- You coordinate work but do NOT execute tasks directly.\n- Delegate all implementation work to executor agents.\n- Each executor uses model `{executor_model}` with up to {max_turns} turns.\n- You may run up to {max_concurrent} executors in parallel.\n\n### Workflow\n1. Analyze the user's request and break into sub-tasks.\n2. Spawn executors using the Agent tool.\n3. Review results. If insufficient, spawn follow-up executors.\n4. Synthesize all results into a coherent response.\n\"#, ...)\n}\n```\n\nThe Manager does not execute tools itself; it only plans, assigns, and synthesizes results. Executors are ordinary Agent instances with the full tool set. 
This pattern separates \"thinking\" from \"execution,\" both avoiding single-Agent context bloat and enabling true parallel processing.\n\nTools are the Agent's \"hands\": reading files, writing files, executing commands. Skills are the Agent's \"professional knowledge\": code review methodology, CSS refactoring guidelines, frontend component templates.\n\nA Skill is simply a `SKILL.md` file:\n\n```\napp/extensions/skills/\n├── code-review/SKILL.md\n├── css-refactor-advisor/SKILL.md\n├── current-weather/SKILL.md\n├── weather-forecast/SKILL.md\n└── front-component-generator/SKILL.md\n```\n\n`SkillTool` reads these files when the AI invokes a skill:\n\n```\npub struct SkillTool;\n\n#[async_trait]\nimpl Tool for SkillTool {\n    fn name(&self) -> &str { \"skill-tool\" }\n\n    async fn execute(&self, input: Value, ctx: &ToolContext) -> ToolResult {\n        // Excerpt: `dirs` and `skill_name` are defined in code not shown here\n        let params: SkillInput = serde_json::from_value(input)?;\n\n        // \"skill\": \"list\" → List all available skills\n        if params.skill == \"list\" {\n            return list_skills(&dirs).await;\n        }\n\n        // Find and read SKILL.md\n        let (skill_path, raw) = find_and_read_skill(&skill_name, &dirs).await?;\n\n        // Strip YAML frontmatter\n        let content = strip_frontmatter(&raw);\n\n        // Replace $ARGUMENTS placeholder\n        let prompt = if let Some(args) = &params.args {\n            content.replace(\"$ARGUMENTS\", args)\n        } else {\n            content.replace(\"$ARGUMENTS\", \"\")\n        };\n\n        ToolResult::success(prompt)\n    }\n}\n```\n\nSkill search prioritizes the workspace directory, then the app extensions directory:\n\n```rust\nasync fn skill_search_dirs(ctx: &ToolContext) -> Vec<PathBuf> {\n    let mut dirs = vec![\n        ctx.get_workspace_extensions_dir().await.join(\"skills\")  // Project-level\n    ];\n    dirs.push(ctx.get_app_extensions_dir().await.join(\"skills\")); // Global-level\n    dirs\n}\n```\n\nThis means you can define project-specific Skills under your project directory (e.g., \"Understand this 
project's build system\") while also using global Skills (e.g., \"Universal code review standards\"). Project-level Skills take priority over global Skills.\n\nThe most critical mechanism in Skill templates is `$ARGUMENTS`:\n\n```\n# Code Review Skill Template\n\nPlease review: $ARGUMENTS\n\nChecklist:\n1. Are functions too long (>50 lines)?\n2. Are there unhandled Result/Option cases?\n3. Are there unnecessary .clone() calls?\n4. Does naming follow Rust conventions?\n```\n\nWhen the AI calls with `args: \"src/main.rs\"`, `$ARGUMENTS` is replaced with `src/main.rs`. This turns Skills from \"static knowledge\" into \"parameterized tools.\"\n\nThe entire query loop pushes status in real-time through the `event_tx` channel:\n\n```\npub enum QueryEvent {\n    Token { text: String },                    // Per-token push\n    ToolStart { tool_name, tool_id, input },   // Tool start\n    ToolEnd { tool_name, tool_id, result },    // Tool end\n    Status(String),                            // Status message\n}\n```\n\nThese events are pushed to the Dashboard frontend in real-time via WebSocket, allowing users to see every decision the Agent makes rather than facing a black box.\n\nAn AI Agent's multi-turn conversation is a complex control system:\n\n```\nSystem Prompt → API Call → Stream Parse → Tool Detection → Tool Execution → Result Injection → Call Again\n     ↑                                                                         │\n     └───────────────── Loop until end_turn ───────────────────────────────────┘\n```\n\nThe robustness of this loop depends on:\n\n| Mechanism | Problem Solved |\n|---|---|\n| Agent definition system | Multi-role, multi-model switching |\n| System prompt construction | Agent worldview + prompt caching |\n| max_tokens recovery | Long output truncation |\n| auto_compact (structured summaries) | Context overflow beyond window |\n| tool_result_budget | Tool result accumulation |", "url": 
"https://wpnews.pro/news/boxagnts-introduction-6-agent-multi-turn-conversation-and-tool-skill-invocation", "canonical_source": "https://dev.to/guyoung/boxagnts-introduction-6-agent-multi-turn-conversation-and-toolskill-invocation-4pan", "published_at": "2026-05-30 13:26:18+00:00", "updated_at": "2026-05-30 13:52:26.780682+00:00", "lang": "en", "topics": ["ai-agents", "artificial-intelligence", "large-language-models", "ai-tools", "ai-infrastructure"], "entities": ["BoxAgnts", "ChatGPT", "WASM"], "alternates": {"html": "https://wpnews.pro/news/boxagnts-introduction-6-agent-multi-turn-conversation-and-tool-skill-invocation", "markdown": "https://wpnews.pro/news/boxagnts-introduction-6-agent-multi-turn-conversation-and-tool-skill-invocation.md", "text": "https://wpnews.pro/news/boxagnts-introduction-6-agent-multi-turn-conversation-and-tool-skill-invocation.txt", "jsonld": "https://wpnews.pro/news/boxagnts-introduction-6-agent-multi-turn-conversation-and-tool-skill-invocation.jsonld"}}