Beyond Prompt-Based Planning: MCP-Native Graph Planning-based Biomedical Agent System

BioManus is a biomedical agent system that plans over a graph of standardized tool interfaces instead of flat, prompt-retrieved tool descriptions. It converts heterogeneous bioinformatics software into a unified ecosystem of MCP (Model Context Protocol) servers organized as a typed graph, then retrieves task-specific subgraphs so that planning complexity is decoupled from tool inventory size. In evaluations on BioAgentBench and LAB-Bench, the authors report improved execution accuracy, workflow validity, and context efficiency over existing biomedical agents, and argue that scalable biomedical reasoning calls for structured capability graphs.

arXiv:2606.04494v1. Abstract: Biomedical agents promise to automate complex biological workflows, yet current systems face two fundamental bottlenecks: bioinformatics tools are highly heterogeneous in their interfaces and execution environments, and agent planning still relies on flat, prompt-retrieved tool descriptions. As biomedical software ecosystems grow, this coupling between tool coverage and context size leads to tool confusion, unstable planning, and inefficient execution. We introduce BioManus, an MCP-native biomedical agent built on graph-scaffolded planning over structured biological capabilities. BioManus first introduces the BioinfoMCP Compiler, which converts heterogeneous bioinformatics software into standardized MCP servers, yielding a large executable MCP ecosystem. It then organizes this ecosystem as a typed heterogeneous MCP graph over tools, operations, datatypes, and workflow stages. At inference time, BioManus retrieves compact task-specific subgraphs and synthesizes operation-level workflow scaffolds. This design decouples planning complexity from raw tool inventory size, achieving a context compression ratio of $\Theta(N / (h \bar{m}))$ under high-recall retrieval, where $N$ is the total tool count, $h$ is the workflow horizon, and $\bar{m} \ll N$ is the average number of candidate tools per operation. Experiments on BioAgentBench and LAB-Bench show that BioManus improves execution accuracy, workflow validity, and context efficiency over advanced biomedical agent baselines. This work suggests a paradigm shift: scalable biomedical reasoning requires structured executable capability graphs rather than ever-larger prompt-level tool retrieval.
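
To make the compression claim concrete, here is a minimal Python sketch of a typed capability graph, subgraph retrieval, and the ratio $N / (h \bar{m})$. It is an illustration under stated assumptions, not BioManus's implementation: the class `CapabilityGraph`, its methods, and `compression_ratio` are hypothetical names, the tool inventory is a toy list with simplified datatypes, and the workflow's operations are given directly, whereas the abstract describes retrieving them per task.

```python
from dataclasses import dataclass, field


@dataclass
class CapabilityGraph:
    """Toy typed graph (hypothetical): tools implement operations,
    and each operation maps an input datatype to an output datatype."""
    tools: dict[str, set[str]] = field(default_factory=dict)         # operation -> tools
    types: dict[str, tuple[str, str]] = field(default_factory=dict)  # operation -> (in, out)

    def add_operation(self, op, in_type, out_type, tools):
        self.types[op] = (in_type, out_type)
        self.tools.setdefault(op, set()).update(tools)

    def tool_count(self):
        # N: number of distinct tools in the whole inventory.
        return len(set().union(*self.tools.values()))

    def retrieve_subgraph(self, ops):
        # Keep only the candidate tools for the operations a task needs.
        return {op: sorted(self.tools[op]) for op in ops}

    def is_valid_scaffold(self, ops):
        # A scaffold is type-consistent if each step's output feeds the next step's input.
        return all(self.types[a][1] == self.types[b][0] for a, b in zip(ops, ops[1:]))


def compression_ratio(graph, ops):
    """N / (h * m_bar): full inventory size over tools actually placed in context."""
    subgraph = graph.retrieve_subgraph(ops)
    candidates = sum(len(t) for t in subgraph.values())  # equals h * m_bar
    return graph.tool_count() / candidates


# Toy inventory; datatypes are deliberately simplified for illustration.
graph = CapabilityGraph()
graph.add_operation("trim", "reads", "reads", ["fastp", "trimmomatic", "cutadapt"])
graph.add_operation("align", "reads", "alignments", ["bwa", "bowtie2"])
graph.add_operation("call_variants", "alignments", "variants", ["bcftools", "gatk", "freebayes"])
graph.add_operation("quantify", "reads", "abundances", ["salmon", "kallisto"])
graph.add_operation("assemble", "reads", "contigs", ["spades", "megahit"])
graph.add_operation("annotate", "variants", "variants", ["snpeff", "vep"])

workflow = ["trim", "align", "call_variants"]  # horizon h = 3
print(graph.retrieve_subgraph(workflow))
print(graph.is_valid_scaffold(workflow))
print(compression_ratio(graph, workflow))
```

In this toy case $N = 14$ and the three-step workflow exposes 8 candidate tools ($h = 3$, $\bar{m} = 8/3$), so the ratio is $14 / 8 = 1.75$. The point of the sketch is the scaling: the numerator grows with the inventory while the denominator depends only on the workflow, so the ratio grows as more tools are added for operations the task does not use.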