# Open Source Project of the Day (#104): AgentScope 2.0 — Alibaba's Production-Ready Agent Framework Built Around Model Reasoning

> Source: <https://dev.to/wonderlab/open-source-project-of-the-day-104-agentscope-20-alibabas-production-ready-agent-framework-4o4d>
> Published: 2026-06-24 03:35:23+00:00

"Build and run agents you can see, understand, and trust."

This is article **#104** in the *Open Source Project of the Day* series. Today's project is **AgentScope 2.0** — Alibaba DAMO Academy's open-source production-ready agent framework.

The agent framework space is crowded. LangChain centers on chain-based orchestration. AutoGen centers on multi-agent conversation. CrewAI centers on role-based collaboration. AgentScope's differentiation is in its design philosophy: when LLM reasoning is strong enough, the framework should step back rather than constraining the model's decision space with rigid pipelines.

AgentScope 2.0 adds the production infrastructure that philosophy requires: event system, permission controls, multi-tenant isolation, sandbox execution, middleware hooks. The goal is not a demo that runs — it's a system that ships.

AgentScope 2.0 is a production-ready agent framework — "an agent development platform with essential abstractions, designed to work with rising model capability, with built-in production support."

The core problem it addresses: traditional agent frameworks constrain LLMs with rigid pipelines and opinionated prompt templates. As LLM reasoning capability has improved rapidly, that constraint has become a bottleneck. AgentScope shifts to "letting the model's native reasoning and tool-use capabilities drive agent behavior" — the framework provides production infrastructure, not execution path constraints.

The minimum working unit in AgentScope 2.0 is an `Agent`, extended by composing systems:

``` python
import asyncio
from agentscope import Agent, Toolkit, DashScopeChatModel, DashScopeCredential
from agentscope.tools import Bash, Grep, Glob, Read, Write
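from agentscope.message import UserMsg
# EventType is used below; this import path is an assumption, check the AgentScope docs
from agentscope.event import EventType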

# Define a toolkit
toolkit = Toolkit(tools=[Bash(), Grep(), Glob(), Read(), Write()])

# Create an agent
agent = Agent(
    name="code-assistant",
    system_prompt="You are a code assistant that helps users analyze and modify codebases.",
    model=DashScopeChatModel(
        credential=DashScopeCredential(api_key="your_key"),
        model="qwen3.6-plus"
    ),
    toolkit=toolkit
)

# Streaming reasoning loop
async def run():
    async for evt in agent.reply_stream(UserMsg("user", "Analyze the structure of this codebase")):
        match evt.type:
            case EventType.TEXT_BLOCK_DELTA:
                print(evt.delta, end="", flush=True)
            case EventType.TOOL_CALL_START:
                print(f"\n[Tool call] {evt.tool_name}")

asyncio.run(run())
```

**1. Event System**

A unified event bus connecting all phases of the agent's reasoning process:

```
EventType.REPLY_START          # Agent begins responding
EventType.MODEL_CALL_START     # Model call initiated
EventType.TEXT_BLOCK_START     # Text block starts
EventType.TEXT_BLOCK_DELTA     # Streaming text delta
EventType.TEXT_BLOCK_END       # Text block complete
EventType.TOOL_CALL_START      # Tool call initiated
EventType.TOOL_CALL_END        # Tool call complete
```

Human-in-the-loop workflows attach through the event system: pause the agent on a specific event, wait for human confirmation, resume execution.
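The pause, confirm, resume flow can be sketched without the framework. The example below is a minimal illustration in plain `asyncio`. `Event`, `fake_agent_stream`, `run_with_approval`, and `auto_approve` are hypothetical names, not AgentScope APIs, and the event type strings only mirror the list above.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Awaitable, Callable


@dataclass
class Event:
    type: str            # e.g. "TOOL_CALL_START", mirroring the event names above
    tool_name: str = ""
    delta: str = ""


async def fake_agent_stream() -> AsyncIterator[Event]:
    """Stand-in for an agent's event stream (hypothetical, for illustration only)."""
    yield Event("TEXT_BLOCK_DELTA", delta="Checking files. ")
    yield Event("TOOL_CALL_START", tool_name="Bash")
    yield Event("TOOL_CALL_END", tool_name="Bash")


async def run_with_approval(
    stream: AsyncIterator[Event],
    approve: Callable[[Event], Awaitable[bool]],
) -> list[str]:
    """Consume events; pause on each tool call until `approve` returns."""
    log: list[str] = []
    async for evt in stream:
        if evt.type == "TOOL_CALL_START":
            # Awaiting here suspends consumption of the stream: this is the pause.
            if not await approve(evt):
                log.append(f"denied:{evt.tool_name}")
                break
            log.append(f"approved:{evt.tool_name}")
        elif evt.type == "TEXT_BLOCK_DELTA":
            log.append(f"text:{evt.delta}")
    return log


async def auto_approve(evt: Event) -> bool:
    # A real system would wait on a UI prompt or a queue; this approves at once.
    await asyncio.sleep(0)
    return True
```

Calling `asyncio.run(run_with_approval(fake_agent_stream(), auto_approve))` drives the loop. Swapping `auto_approve` for a coroutine that waits on user input turns the same loop into a manual gate.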

**2. Permission System**

Fine-grained control over which tool calls require approval vs. automatic execution:

``` python
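from agentscope.permission import PermissionConfig, ApprovalMode
from agentscope.tools import Bash, Read, Write

# Per-tool approval modes. The `rules` keyword name is an assumption;
# check the PermissionConfig signature in the docs.
config = PermissionConfig(
    rules={
        Write: ApprovalMode.ALWAYS,   # file writes require confirmation
        Bash: ApprovalMode.ALWAYS,    # shell execution requires confirmation
        Read: ApprovalMode.NEVER,     # reads run automatically
    },
    # Operations over $0.10 require confirmation
    default_cost_threshold=0.10,
)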
```

**Permission Bypass Mode**: For testing or trusted scenarios, disable all approvals and let the agent run fully autonomously.
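To make the decision logic concrete, here is a minimal sketch of how such a gate might be evaluated. It does not use AgentScope: `Approval`, `needs_approval`, and `RULES` are hypothetical names. The order of the checks (bypass first, then cost, then per-tool rule) is an assumption, as is the choice to gate unknown tools by default.

```python
from enum import Enum


class Approval(Enum):
    ALWAYS = "always"   # always ask a human
    NEVER = "never"     # run without asking


def needs_approval(
    tool_name: str,
    rules: dict[str, Approval],
    estimated_cost: float = 0.0,
    cost_threshold: float = 0.10,
    bypass: bool = False,
) -> bool:
    """Return True when a human must confirm the call."""
    if bypass:
        return False                      # bypass mode: nothing is gated
    if estimated_cost > cost_threshold:
        return True                       # calls above the threshold are gated
    # Tools without a rule default to asking, the safer choice.
    return rules.get(tool_name, Approval.ALWAYS) is Approval.ALWAYS


RULES = {"Write": Approval.ALWAYS, "Bash": Approval.ALWAYS, "Read": Approval.NEVER}
```

With these rules, `needs_approval("Read", RULES)` is `False`, while the same read with `estimated_cost=0.25` is gated because it crosses the cost threshold.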

**3. Multi-Tenancy / Session Isolation**

The FastAPI service layer provides production-grade tenant and session isolation.
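The core idea behind session isolation is that state is keyed by both tenant and session, so one caller can never read another's history. The sketch below shows only that keying. `SessionStore` is a hypothetical in-memory class, not part of AgentScope or its service layer.

```python
from collections import defaultdict


class SessionStore:
    """Keeps per-session message history, keyed by (tenant_id, session_id)."""

    def __init__(self) -> None:
        self._history: dict[tuple[str, str], list[str]] = defaultdict(list)

    def append(self, tenant_id: str, session_id: str, message: str) -> None:
        self._history[(tenant_id, session_id)].append(message)

    def history(self, tenant_id: str, session_id: str) -> list[str]:
        # Return a copy so callers cannot mutate another request's state.
        return list(self._history.get((tenant_id, session_id), []))
```

Two tenants that happen to reuse the same session ID still get separate histories, because the tenant ID is part of the key.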

**4. Workspace / Sandbox Execution**

Three backend options for isolated tool execution:

| Backend | Best for |
|---|---|
| Local | Development and testing, fastest |
| Docker | Production, dependency isolation |
| E2B | Cloud sandbox, highest security |
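The practical difference between the Local and Docker rows is how a tool command gets launched. The sketch below builds the argument list for each case without executing anything. `build_command` is a hypothetical helper, the image name is an arbitrary example, and E2B is left out because it needs a hosted SDK whose calls would have to be guessed here.

```python
def build_command(backend: str, command: str, workdir: str,
                  image: str = "python:3.12-slim") -> list[str]:
    """Return the argv that would run `command` under the given backend."""
    if backend == "local":
        # Runs directly on the host: fast, but no isolation.
        # The caller would pass cwd=workdir when spawning the process.
        return ["sh", "-c", command]
    if backend == "docker":
        return [
            "docker", "run", "--rm",          # remove the container afterwards
            "--network", "none",              # no network access from inside
            "-v", f"{workdir}:/work",         # mount the workspace directory
            "-w", "/work",                    # run from the mounted directory
            image, "sh", "-c", command,
        ]
    raise ValueError(f"unsupported backend: {backend}")
```

The resulting list can be handed to `subprocess.run`. Building the argv as a list rather than a single string keeps the workspace path from being interpreted by a host shell.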

**5. Middleware System**

Insert composable hooks into the agent's reasoning-acting loop without modifying core agent code:

``` python
from agentscope.middleware import LoggingMiddleware, GuardrailMiddleware

agent = Agent(
    # ... (name, system_prompt, model, toolkit as in the first example)
    middlewares=[
        LoggingMiddleware(log_tool_calls=True),
        GuardrailMiddleware(blocked_patterns=["rm -rf", "DROP TABLE"]),
    ]
)
```
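For readers unfamiliar with the pattern, a middleware is a function that wraps a tool call and returns a new callable. The sketch below is a framework-free illustration of that composition. `guardrail`, `logging_middleware`, and `compose` are hypothetical names and do not reflect how AgentScope implements its middleware classes.

```python
from typing import Callable

ToolCall = Callable[[str], str]                 # takes a command, returns output
Middleware = Callable[[ToolCall], ToolCall]     # wraps one ToolCall in another


def guardrail(blocked_patterns: list[str]) -> Middleware:
    def wrap(next_call: ToolCall) -> ToolCall:
        def call(command: str) -> str:
            for pattern in blocked_patterns:
                if pattern in command:
                    raise PermissionError(f"blocked pattern: {pattern!r}")
            return next_call(command)
        return call
    return wrap


def logging_middleware(log: list[str]) -> Middleware:
    def wrap(next_call: ToolCall) -> ToolCall:
        def call(command: str) -> str:
            log.append(f"call:{command}")
            result = next_call(command)
            log.append(f"done:{command}")
            return result
        return call
    return wrap


def compose(tool: ToolCall, middlewares: list[Middleware]) -> ToolCall:
    # Wrap in reverse so the first middleware in the list runs outermost.
    for mw in reversed(middlewares):
        tool = mw(tool)
    return tool
```

Because logging is listed first, it sees every attempted call, including ones the guardrail later rejects. Reversing the list would hide blocked calls from the log.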

Leader-Worker pattern: a Leader Agent decomposes tasks and creates Worker agents via built-in team tools, then aggregates results.

``` python
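from agentscope import Agent, Toolkit
from agentscope.tools import TeamTools

# `model` is a chat model instance, e.g. the DashScopeChatModel from the first example.
# Leader has team_tools, so it can create and coordinate workers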
leader = Agent(
    name="research-leader",
    system_prompt="You lead a research team. Decompose tasks and synthesize results.",
    model=model,
    toolkit=Toolkit(tools=[*TeamTools()])
)

# At runtime, the leader automatically decomposes:
# "Analyze the core arguments of these 5 papers"
# → Creates 5 workers, one per paper
# → Aggregates results
```

Worker agents' capabilities are determined dynamically by the leader at runtime — no need to predefine all possible worker types.
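The fan-out and aggregate shape of the Leader-Worker pattern can be illustrated with plain `asyncio`. This is a sketch under simplifying assumptions: `worker` and `leader` are hypothetical coroutines, and the worker returns a canned string where a real worker agent would call a model.

```python
import asyncio


async def worker(paper: str) -> str:
    """Stand-in for a worker agent; a real one would call a model."""
    await asyncio.sleep(0)            # yield control, as a network call would
    return f"summary of {paper}"


async def leader(papers: list[str]) -> str:
    # Fan out: one worker per paper, all running concurrently.
    summaries = await asyncio.gather(*(worker(p) for p in papers))
    # Aggregate: gather preserves input order, so results line up with papers.
    return "\n".join(summaries)
```

In the framework the leader decides at runtime how many workers to create. Here that decision is reduced to one worker per list item.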

Agents decompose complex tasks into tracked plan steps, updating state in real time as execution proceeds:

```
Task: "Write a complete test suite for this Python project"
Agent generates plan:
  Step 1: [In progress] Scan project structure, identify all modules
  Step 2: [Waiting]     Analyze public API of each module
  Step 3: [Waiting]     Generate unit tests
  Step 4: [Waiting]     Generate integration tests
  Step 5: [Waiting]     Run test suite, fix failures

Step 1 completes → Step 2 starts automatically, plan state updates
```
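The plan state transitions shown above amount to a small state machine. A minimal sketch follows. `Plan` and `complete_current` are hypothetical names, and the status strings are illustrative rather than AgentScope's own.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    steps: list[str]
    status: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every step waits except the first, which starts immediately.
        self.status = ["waiting"] * len(self.steps)
        if self.steps:
            self.status[0] = "in_progress"

    def complete_current(self) -> None:
        """Mark the running step done and start the next waiting one.

        Raises ValueError if no step is in progress.
        """
        i = self.status.index("in_progress")
        self.status[i] = "done"
        if i + 1 < len(self.steps):
            self.status[i + 1] = "in_progress"
```

Keeping exactly one step in progress at a time makes the plan easy to render in a UI and easy to resume after an interruption.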

Long-running tool calls (file processing, network requests, code compilation) shift to background without blocking the agent conversation stream:

```
User: "Compile this large C++ project and run the tests"
Agent: [Launches background task, continues conversation immediately]
Agent: "Compilation started in background, estimated 5 minutes.
        I can help with other things while you wait."
...(5 minutes later)
System notification: background task complete
Agent: "Compilation complete. Test results: ..."
```
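A rough sketch of the same idea in plain `asyncio`: the slow call is scheduled as a task, the conversation continues, and the result is collected later. `compile_project` and `conversation` are hypothetical stand-ins, and the short sleep replaces the real compile.

```python
import asyncio


async def compile_project() -> str:
    """Stand-in for a slow tool call such as a compiler run."""
    await asyncio.sleep(0.05)
    return "build ok"


async def conversation() -> list[str]:
    transcript: list[str] = []
    # Schedule the slow call without awaiting it, so the loop stays responsive.
    task = asyncio.create_task(compile_project())
    transcript.append("agent: compilation started in background")
    transcript.append("agent: answering another question meanwhile")
    # Later, collect the result once the task has finished.
    result = await task
    transcript.append(f"agent: {result}")
    return transcript
```

The two middle messages are appended before the task completes, which is the non-blocking behavior the transcript above describes.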

How much control the framework hands to the model is the most fundamental difference between AgentScope 2.0 and many comparable frameworks:

**Traditional approach** (LangChain-style):

```
Developer defines a fixed chain:
Step 1 → Step 2 → Step 3 (developer decides what happens at each step)
The model fills in blanks within each step
```

**AgentScope approach:**

```
Developer provides: toolkit + permissions + constraints
Model decides:      what to do, in what order, with which tools
Framework handles:  production safety, observability, human-in-the-loop
```
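The contrast can be made concrete with a toy example. In the sketch below, `fixed_chain` hard-codes the tool order, while `model_led` asks a `choose` callback (a stand-in for the model) which tool to run next. All names are hypothetical. Neither function reflects the internals of LangChain or AgentScope.

```python
from typing import Callable, Optional

Tools = dict[str, Callable[[str], str]]


def fixed_chain(text: str, tools: Tools) -> str:
    # Developer-defined order: always search, then summarize.
    return tools["summarize"](tools["search"](text))


def model_led(text: str, tools: Tools,
              choose: Callable[[str, list[str]], Optional[str]],
              max_steps: int = 5) -> str:
    """Let `choose` (a stand-in for the model) pick each next tool."""
    used: list[str] = []
    for _ in range(max_steps):            # the step cap is a framework constraint
        name = choose(text, used)
        if name is None:                  # the "model" decides it is finished
            break
        text = tools[name](text)
        used.append(name)
    return text
```

The developer still sets boundaries in the second version (the available tools and the step cap), but the sequence of calls is no longer fixed in code.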

When model reasoning was weak, fixed pipelines were the right call because models needed guidance. Once model reasoning is strong enough, fixed pipelines become constraints: the model may have a better plan that the pipeline will not let it execute. AgentScope 2.0's bet is that mainstream models from 2025 onward are capable enough to deserve more autonomy.

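The standard `async for evt in agent.reply_stream()` pattern lets a caller consume the event stream as it is produced: printing text deltas as they arrive and reacting to tool calls, as in the first example.

A separate AgentScope Runtime (runtime.agentscope.io) provides a complete production service layer.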

AgentScope is not just a framework — there's a complete toolchain behind it:

| Component | Function |
|---|---|
| AgentScope Studio | Visual debugging tool for agent runs |
| ReMe | Cross-session persistent memory (file-based + vector-based) |
| OpenJudge | 50+ judges (code, math, tool use, multimodal output) |
| Trinity-RFT | Agent fine-tuning framework (decoupled Explorer/Trainer/Buffer) |
| Mem0 integration | Long-term memory (added June 2026) |

| Dimension | LangChain | AutoGen | AgentScope 2.0 |
|---|---|---|---|
| Core pattern | Chain-based | Multi-agent conversation | Model-reasoning-led |
| Production infra | Third-party | Third-party | Built-in |
| Sandbox execution | None | Limited | Local / Docker / E2B |
| Human-in-the-loop | Plugin | Native | Event system native |
| Evaluation system | None | None | OpenJudge (50+ judges) |
| Fine-tuning support | None | None | Trinity-RFT |
| Academic backing | Yes | Yes | Yes (2 arXiv papers) |

The most significant gap: AgentScope covers the full agent lifecycle — framework → memory → evaluation → fine-tuning → apps. LangChain and AutoGen stop at the framework and memory layers.

**Install:**

```
pip install agentscope
```

**Or from source:**

```
git clone https://github.com/agentscope-ai/agentscope.git
cd agentscope
pip install -e .
```

**Run the web UI:**

```
cd agentscope
pnpm install && pnpm run dev   # frontend
python -m agentscope.service   # backend
```

AgentScope 2.0's timing is deliberate: at a moment when LLM reasoning capability is advancing fast, it chooses "reduce framework constraints, let the model lead" as its direction.

The five core systems (Event / Permission / Workspace / Multi-tenancy / Middleware) address the production pain points of traditional frameworks: poor observability, no fine-grained tool permission control, difficulty serving multiple users, and security constraints mixed into business logic.

The ecosystem coverage is what separates it most clearly. Framework → memory → evaluation → fine-tuning is a complete chain that LangChain and AutoGen haven't built. OpenJudge alone — 50+ judges covering code, math, tool use, and multimodal output — fills a gap that most teams solve by writing evaluation scripts from scratch.

27.1k Stars, 40 releases, two arXiv papers, and an Alibaba engineering team behind it. Among production-grade agent frameworks, AgentScope 2.0 is one of the most thorough options currently available.
