{"slug": "open-source-project-of-the-day-104-agentscope-2-0-alibaba-s-production-ready", "title": "Open Source Project of the Day (#104): AgentScope 2.0 — Alibaba's Production-Ready Agent Framework Built Around Model Reasoning", "summary": "Alibaba DAMO Academy released AgentScope 2.0, an open-source production-ready agent framework designed to leverage LLM reasoning without rigid pipeline constraints. The framework adds event systems, permission controls, multi-tenant isolation, and sandbox execution for shipping reliable agent systems.", "body_md": "\"Build and run agents you can see, understand, and trust.\"\n\nThis is article **#104** in the *Open Source Project of the Day* series. Today's project is **AgentScope 2.0** — Alibaba DAMO Academy's open-source production-ready agent framework.\n\nThe agent framework space is crowded. LangChain centers on chain-based orchestration. AutoGen centers on multi-agent conversation. CrewAI centers on role-based collaboration. AgentScope's differentiation is in its design philosophy: when LLM reasoning is strong enough, the framework should step back rather than constraining the model's decision space with rigid pipelines.\n\nAgentScope 2.0 adds the production infrastructure that philosophy requires: event system, permission controls, multi-tenant isolation, sandbox execution, middleware hooks. The goal is not a demo that runs — it's a system that ships.\n\nAgentScope 2.0 is a production-ready agent framework — \"an agent development platform with essential abstractions, designed to work with rising model capability, with built-in production support.\"\n\nThe core problem it addresses: traditional agent frameworks constrain LLMs with rigid pipelines and opinionated prompt templates. As LLM reasoning capability has improved rapidly, that constraint has become a bottleneck. 
AgentScope shifts to \"letting the model's native reasoning and tool-use capabilities drive agent behavior\": the framework provides production infrastructure, not execution path constraints.\n\nThe minimum working unit in AgentScope 2.0 is an `Agent`, extended by composing systems:\n\n``` python\nimport asyncio\nfrom agentscope import Agent, Toolkit, DashScopeChatModel, DashScopeCredential\nfrom agentscope.tools import Bash, Grep, Glob, Read, Write\nfrom agentscope.message import UserMsg\n# Assumption: the import path for EventType is a guess; adjust it to the installed version\nfrom agentscope.event import EventType\n\n# Define a toolkit\ntoolkit = Toolkit(tools=[Bash(), Grep(), Glob(), Read(), Write()])\n\n# Create an agent\nagent = Agent(\n    name=\"code-assistant\",\n    system_prompt=\"You are a code assistant that helps users analyze and modify codebases.\",\n    model=DashScopeChatModel(\n        credential=DashScopeCredential(api_key=\"your_key\"),\n        model=\"qwen3.6-plus\"\n    ),\n    toolkit=toolkit\n)\n\n# Streaming reasoning loop (match/case needs Python 3.10+)\nasync def run():\n    async for evt in agent.reply_stream(UserMsg(\"user\", \"Analyze the structure of this codebase\")):\n        match evt.type:\n            case EventType.TEXT_BLOCK_DELTA:\n                print(evt.delta, end=\"\", flush=True)\n            case EventType.TOOL_CALL_START:\n                print(f\"\\n[Tool call] {evt.tool_name}\")\n\nasyncio.run(run())\n```\n\n**1. Event System**\n\nA unified event bus connects all phases of the agent's reasoning process:\n\n```\nEventType.REPLY_START          # Agent begins responding\nEventType.MODEL_CALL_START     # Model call initiated\nEventType.TEXT_BLOCK_START     # Text block starts\nEventType.TEXT_BLOCK_DELTA     # Streaming text delta\nEventType.TEXT_BLOCK_END       # Text block complete\nEventType.TOOL_CALL_START      # Tool call initiated\nEventType.TOOL_CALL_END        # Tool call complete\n```\n\nHuman-in-the-loop workflows attach through the event system: pause the agent on a specific event, wait for human confirmation, then resume execution.\n\n**2. 
Permission System**\n\nFine-grained control over which tool calls require approval and which execute automatically:\n\n``` python\nfrom agentscope.permission import PermissionConfig, ApprovalMode\nfrom agentscope.tools import Bash, Read, Write\n\n# Assumption: the `rules` keyword is a hypothetical parameter name used so the\n# per-tool mapping is valid Python; check the PermissionConfig signature\nconfig = PermissionConfig(\n    rules={\n        # File writes require confirmation\n        Write: ApprovalMode.ALWAYS,\n        # Shell execution requires confirmation\n        Bash: ApprovalMode.ALWAYS,\n        # Reads are automatic\n        Read: ApprovalMode.NEVER,\n    },\n    # Operations over $0.10 require confirmation\n    default_cost_threshold=0.10,\n)\n```\n\n**Permission Bypass Mode**: For testing or trusted scenarios, disable all approvals and let the agent run fully autonomously.\n\n**3. Multi-Tenancy / Session Isolation**\n\nThe FastAPI service layer provides production-grade tenant and session isolation.\n\n**4. Workspace / Sandbox Execution**\n\nThree backend options for isolated tool execution:\n\n| Backend | Best for |\n|---|---|\n| Local | Development and testing, fastest |\n| Docker | Production, dependency isolation |\n| E2B | Cloud sandbox, highest security |\n\n**5. Middleware System**\n\nInsert composable hooks into the agent's reasoning-acting loop without modifying core agent code:\n\n``` python\nfrom agentscope import Agent\nfrom agentscope.middleware import LoggingMiddleware, GuardrailMiddleware\n\nagent = Agent(\n    ...,  # remaining Agent arguments are elided in the original\n    middlewares=[\n        LoggingMiddleware(log_tool_calls=True),\n        GuardrailMiddleware(blocked_patterns=[\"rm -rf\", \"DROP TABLE\"]),\n    ]\n)\n```\n\nLeader-Worker pattern: a Leader Agent decomposes tasks and creates Worker agents via built-in team tools, then aggregates results.\n\n``` python\nfrom agentscope.tools import TeamTools\n\n# Leader has team_tools: it can create and coordinate workers\nleader = Agent(\n    name=\"research-leader\",\n    system_prompt=\"You lead a research team. 
Decompose tasks and synthesize results.\",\n    model=model,\n    toolkit=Toolkit(tools=[*TeamTools()])\n)\n\n# At runtime, the leader automatically decomposes:\n# \"Analyze the core arguments of these 5 papers\"\n# → Creates 5 workers, one per paper\n# → Aggregates results\n```\n\nWorker agents' capabilities are determined dynamically by the leader at runtime — no need to predefine all possible worker types.\n\nAgents decompose complex tasks into tracked plan steps, updating state in real time as execution proceeds:\n\n```\nTask: \"Write a complete test suite for this Python project\"\nAgent generates plan:\n  Step 1: [In progress] Scan project structure, identify all modules\n  Step 2: [Waiting]     Analyze public API of each module\n  Step 3: [Waiting]     Generate unit tests\n  Step 4: [Waiting]     Generate integration tests\n  Step 5: [Waiting]     Run test suite, fix failures\n\nStep 1 completes → Step 2 starts automatically, plan state updates\n```\n\nLong-running tool calls (file processing, network requests, code compilation) shift to background without blocking the agent conversation stream:\n\n```\nUser: \"Compile this large C++ project and run the tests\"\nAgent: [Launches background task, continues conversation immediately]\nAgent: \"Compilation started in background, estimated 5 minutes.\n        I can help with other things while you wait.\"\n...(5 minutes later)\nSystem notification: background task complete\nAgent: \"Compilation complete. 
Test results: ...\"\n```\n\nThis is the most fundamental difference between AgentScope 2.0 and many comparable frameworks:\n\n**Traditional approach** (LangChain-style):\n\n```\nDeveloper defines a fixed chain:\nStep 1 → Step 2 → Step 3 (developer decides what happens at each step)\nThe model fills in blanks within each step\n```\n\n**AgentScope approach:**\n\n```\nDeveloper provides: toolkit + permissions + constraints\nModel decides:      what to do, in what order, with which tools\nFramework handles:  production safety, observability, human-in-the-loop\n```\n\nWhen model reasoning was weak, fixed pipelines were the right call, because models needed guidance. When model reasoning is strong enough, fixed pipelines become constraints: the model has better plans it can't execute. AgentScope 2.0's bet on timing: mainstream models from 2025 onward are capable enough to deserve more autonomy.\n\nThe standard `async for evt in agent.reply_stream()` pattern enables:\n\nA separate AgentScope Runtime (runtime.agentscope.io) provides a complete production service layer.\n\nAgentScope is not just a framework; there's a complete toolchain behind it:\n\n| Component | Function |\n|---|---|\n| AgentScope Studio | Visual debugging tool for agent runs |\n| ReMe | Cross-session persistent memory (file-based + vector-based) |\n| OpenJudge | 50+ judges (code, math, tool use, multimodal output) |\n| Trinity-RFT | Agent fine-tuning framework (decoupled Explorer/Trainer/Buffer) |\n| Mem0 integration | Long-term memory (added June 2026) |\n\n| Dimension | LangChain | AutoGen | AgentScope 2.0 |\n|---|---|---|---|\n| Core pattern | Chain-based | Multi-agent conversation | Model-reasoning-led |\n| Production infra | Third-party | Third-party | Built-in |\n| Sandbox execution | None | Limited | Local / Docker / E2B |\n| Human-in-the-loop | Plugin | Native | Event system native |\n| Evaluation system | None | None | OpenJudge (50+ judges) |\n| Fine-tuning support | None | None | Trinity-RFT |\n| Academic 
backing | Yes | Yes | Yes (2 arXiv papers) |\n\nThe most significant gap: AgentScope covers the full agent lifecycle, framework → memory → evaluation → fine-tuning → apps. LangChain and AutoGen stop at the framework and memory layers.\n\n**Install:**\n\n```\npip install agentscope\n```\n\n**Or from source:**\n\n```\ngit clone https://github.com/agentscope-ai/agentscope.git\ncd agentscope\npip install -e .\n```\n\n**Run the web UI:**\n\n```\ncd agentscope\npnpm install && pnpm run dev   # frontend\npython -m agentscope.service   # backend\n```\n\nAgentScope 2.0's timing is deliberate: at a moment when LLM reasoning capability is advancing fast, it chooses \"reduce framework constraints, let the model lead\" as its direction.\n\nThe five core systems (Event / Permission / Workspace / Multi-tenancy / Middleware) address the production pain points of traditional frameworks: poor observability, no fine-grained tool permission control, difficulty serving multiple users, and security constraints mixed into business logic.\n\nThe ecosystem coverage is what separates it most clearly. Framework → memory → evaluation → fine-tuning is a complete chain that LangChain and AutoGen haven't built. OpenJudge alone, with 50+ judges covering code, math, tool use, and multimodal output, fills a gap that most teams solve by writing evaluation scripts from scratch.\n\nThe project has 27.1k stars, 40 releases, two arXiv papers, and an Alibaba engineering team behind it. Among production-grade agent frameworks, AgentScope 2.0 is one of the most thorough options currently available.", "url": "https://wpnews.pro/news/open-source-project-of-the-day-104-agentscope-2-0-alibaba-s-production-ready", "canonical_source": "https://dev.to/wonderlab/open-source-project-of-the-day-104-agentscope-20-alibabas-production-ready-agent-framework-4o4d", "published_at": "2026-06-24 03:35:23+00:00", "updated_at": "2026-06-24 03:43:34.758803+00:00", "lang": "en", "topics": ["ai-agents", "large-language-models", "developer-tools"], "entities": ["Alibaba DAMO Academy", "AgentScope", "LangChain", "AutoGen", "CrewAI"], "alternates": {"html": "https://wpnews.pro/news/open-source-project-of-the-day-104-agentscope-2-0-alibaba-s-production-ready", "markdown": "https://wpnews.pro/news/open-source-project-of-the-day-104-agentscope-2-0-alibaba-s-production-ready.md", "text": "https://wpnews.pro/news/open-source-project-of-the-day-104-agentscope-2-0-alibaba-s-production-ready.txt", "jsonld": "https://wpnews.pro/news/open-source-project-of-the-day-104-agentscope-2-0-alibaba-s-production-ready.jsonld"}}