DeerFlow 2.0 Review: ByteDance's Open SuperAgent Harness

wpnews.pro

Originally published on— visit the original for any updates, code snippets that aged out, or follow-up posts.[andrew.ooo]

DeerFlow 2.0 is ByteDance's open-source "SuperAgent harness" — a long-horizon agent runtime that orchestrates sub-agents, sandboxes, persistent memory, and an extensible skill system to run tasks that take minutes to hours. It hit 74,960 stars with 3,329 new this week, and held the #1 GitHub Trending spot on February 28th, 2026 after the v2 launch.

/api/langgraph/*

paths so existing LangGraph clients work unchangedIf you've been building agent workflows on top of raw LangGraph or AutoGen and hit the wall where "who owns the sandbox? Who owns memory? Who owns the message bus?" becomes a six-month engineering project — DeerFlow has already made those decisions. Whether you agree with them is the question.

Field	Value
Repo

git clone … && make setup

(2-minute wizard)http://localhost:2026

If you've spent any time trying to build a "real" agent system in 2026, you know the pattern: start with a single LLM call. Add tool calling. Add memory — which kind? Conversation buffer? Vector store? Knowledge graph? Add a sandbox — host or Docker or Firecracker? Add sub-agents — now you need a message bus, run state, and somebody to own cancellation. Six months later you have a worse version of LangGraph + Open Interpreter + AutoGen glued together with duct tape.

DeerFlow 2.0's pitch is: stop building the harness. Use this one. We made the decisions.

skills/

(configurable via DEER_FLOW_SKILLS_PATH

), and the agent picks them up automatically.The maintainers' "one-line agent setup" is genuinely cute — they wrote a prompt designed to be pasted into your coding agent of choice:

Help me clone DeerFlow if needed, then bootstrap it for local development by
following https://raw.githubusercontent.com/bytedance/deer-flow/main/Install.md

For humans:

git clone https://github.com/bytedance/deer-flow.git
cd deer-flow

make setup

make doctor

make docker-init   # Pulls sandbox image — once
make docker-start  # Starts services

open http://localhost:2026

The make setup

wizard walks you through:

.env

with keys and a minimal config.yaml

If you'd rather edit YAML directly, make config

copies the full template.

This is where DeerFlow earns the "harness" label. The config.yaml

model block handles four provider patterns cleanly: standard OpenAI/Anthropic, OpenAI-compatible gateways (OpenRouter), the OpenAI Responses API for GPT-5 reasoning, and local vLLM with reasoning support.

models:
  - name: gpt-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY

  - name: openrouter-gemini-2.5-flash
    use: langchain_openai:ChatOpenAI
    model: google/gemini-2.5-flash-preview
    api_key: $OPENROUTER_API_KEY
    base_url: https://openrouter.ai/api/v1

  - name: qwen3-32b-vllm
    use: deerflow.models.vllm_provider:VllmChatModel
    model: Qwen/Qwen3-32B
    base_url: http://localhost:8000/v1
    supports_thinking: true

The most novel piece is the CLI-backed providers. You can plug DeerFlow into your existing Codex CLI or Claude Code OAuth so it reuses your subscription quota instead of charging another API key:

  - name: claude-sonnet-4.6
    use: deerflow.models.claude_provider:ClaudeChatModel
    model: claude-sonnet-4-6
    supports_thinking: true

Claude Code reads ~/.claude/.credentials.json

; Codex CLI reads ~/.codex/auth.json

. On macOS, Claude Code auth sometimes needs an explicit eval "$(python3 scripts/export_claude_code_oauth.py --print-export)"

— a thoughtful detail that tells you the team actually ran this on a Mac.

When you make docker-start

, you get:

GATEWAY_WORKERS=1

).:2026

, hot-reload in dev modeconfig.yaml

uses provisioner mode (sandbox.use: deerflow.community.aio_sandbox:AioSandboxProvider

)The single-worker constraint deserves attention. From the README:

The Gateway holds run state (RunManager and the stream bridge) in process, so production defaults to a single Gateway worker (

GATEWAY_WORKERS=1

). Raising the worker count without a shared cross-worker stream bridge — which is not yet available — breaks run cancellation, SSE reconnects, request de-duplication, and IM channels.

Translation: scale vertically, not horizontally. Throw more CPU and RAM at one Gateway worker; don't try to run N replicas behind a load balancer. This is fine for a team of 5–50 developers. It's a hard ceiling for "give DeerFlow to 10,000 users."

The skill system is the part that feels most differentiated from competitors like AutoGen or CrewAI.

skills/

by default (override via DEER_FLOW_SKILLS_PATH

)Sub-agents follow a similar logic:

kill

a top-level run and all sub-agents die cleanlyThe "hours-long task" promise is real if you give it the right model (Doubao-Seed-2.0-Code, DeepSeek v3.2, or Kimi 2.5 per the README) and a generous sandbox. With smaller models, it still works, but the failure modes get more interesting.

Three sandbox modes ship in-tree:

ByteDance is unusually honest about hardware needs:

Deployment	Starting point	Recommended
Local eval (`make dev` )
4 vCPU / 8 GB / 20 GB SSD	8 vCPU / 16 GB
Docker dev (`make docker-start` )
4 vCPU / 8 GB / 25 GB SSD	8 vCPU / 16 GB
Long-running server (`make up` )
8 vCPU / 16 GB / 40 GB SSD	16 vCPU / 32 GB

And from the README: "These numbers cover DeerFlow itself. If you also host a local LLM, size that service separately." If you're running a 32B model locally and the harness on the same box, expect a beefy workstation. This isn't a Raspberry Pi project.

The README also has an explicit security section: Improper Deployment May Introduce Security Risks. Which is a polite way of saying: this is a Docker-out-of-Docker shaped object with file-write tools and bash access. Don't expose it to the public internet without auth.

That last verdict is the most accurate. DeerFlow gives you the harness. It does not give you a finished product. You still bring the model, the skills, the integration code, and the patience to wait for hour-long runs.

After working through the README and tracking community feedback, here are the rough edges to budget for:

make docker-init

on the first tryUV_INDEX_URL

and NPM_REGISTRY

manuallycmd.exe

/PowerShell shells aren't supported, and WSL is "not guaranteed" because some scripts rely on Git for Windows utilities like cygpath

GATEWAY_CORS_ORIGINS

set explicitlybackend/docs/MEMORY_SETTINGS_REVIEW.md

is great; the equivalent for skills is thinnerNone of these are deal-breakers. They're a tax you pay to skip six months of harness engineering.

DeerFlow 2.0	LangGraph (vanilla)	AutoGen	CrewAI	Claude Code
Long-horizon harness
✅ Built-in	❌ DIY	Partial	Partial	✅ Different model
Docker sandbox
✅ Default	❌ DIY	❌ DIY	❌ DIY	✅ Different model
Sub-agents
✅ First-class	✅ Graph nodes	✅ AssistantAgents	✅ Crews	✅ Sub-agents
Persistent memory
✅ File + Settings UI	Partial	Partial	Partial	✅ Different model
Skill plug-ins
✅ Drop-in	❌	❌	❌	✅ AgentSkills
MCP server
✅	Via integration	Via integration	Via integration	✅ Native
IM channels
✅ Lark, Slack	❌	❌	❌	❌
Best for
Self-hosted long-horizon agents	Custom workflows you fully own	Multi-agent conversations	Role-based teams	Coding tasks on your machine

Claude Code is in a different category — it's a coding agent for individual developers, not a self-hostable harness. DeerFlow is the self-hostable harness that talks to Claude Code (via OAuth) so you can hand it long-horizon goals from inside your own infra.

Good fit:

Bad fit:

Q: Is DeerFlow 2.0 compatible with DeerFlow 1.x configs?

No. v2 is a ground-up rewrite that shares zero code with v1. The original Deep Research framework is maintained on the main-1.x

branch and still accepts contributions, but active development has moved to 2.0.

Q: Do I have to use Doubao / DeepSeek / Kimi?

No, but the README explicitly recommends them: "We strongly recommend using Doubao-Seed-2.0-Code, DeepSeek v3.2 and Kimi 2.5 to run DeerFlow." GPT-5, Claude Sonnet 4.6, Gemini 2.5, and local vLLM models all work via the YAML config. Expect to tune prompts a bit if you swap models — the harness was clearly developed against the recommended set.

Q: Can DeerFlow drive my existing Claude Code or Codex CLI?

Yes. The CLI-backed providers (deerflow.models.claude_provider:ClaudeChatModel

and deerflow.models.openai_codex_provider:CodexChatModel

) read your local OAuth credentials (~/.claude/.credentials.json

, ~/.codex/auth.json

) so DeerFlow can call them as ordinary chat models without a separate API key.

Q: Is the sandbox actually secure?

The Docker sandbox is reasonable for development. For production, use the Kubernetes provisioner mode and read the README's "Security Recommendations" section before exposing anything to the network. The local-execution mode is not sandboxed — treat it as "I trust every prompt this model will generate."

Q: Can I scale DeerFlow horizontally behind a load balancer?

Not today. The Gateway holds run state in process and there's no cross-worker stream bridge yet. Scale vertically — more CPU and RAM on one Gateway worker — or shard by team/project across multiple deployments.

Q: How does DeerFlow compare to LangGraph?

DeerFlow uses LangGraph's wire format and exposes a /api/langgraph/*

-compatible HTTP surface, so existing LangGraph clients work unchanged. The difference is everything around the graph: DeerFlow ships the sandbox, the run manager, the message gateway, the skill , the memory store, and the UI. With vanilla LangGraph, you build all of that yourself.

DeerFlow 2.0 is the most complete open-source SuperAgent harness available right now. It's not the easiest to deploy, and it's tied to BytePlus infrastructure in places where ByteDance had reasons to defaults you might not share. But it solves the actual hard problem — who owns the sandbox, the memory, the run state, the message bus — in a way that you'd otherwise spend six months reinventing.

If you've been writing custom LangGraph harnesses for the last year and the maintenance burden is biting, give DeerFlow a weekend. The make setup

wizard is fast. Docker dev mode works on a 16 GB laptop. The architecture decisions, while opinionated, are mostly the right ones.

For larger deployments, watch the cross-worker stream bridge issue. Once that lands, DeerFlow becomes genuinely production-grade for multi-tenant agent platforms.

74K stars in four months is not a coincidence. ByteDance shipped something the open-source agent community actually needed, and they shipped it polished. That's rare.

Try it: github.com/bytedance/deer-flow

Docs: deerflow.tech

License: Apache 2.0

source & further reading

dev.to — original article Cheap AI tokens need request-level receipts Structured Outputs: How We Stopped Parsing LLM Responses by Hand Before you sell an AI connector, map the trust boundary

DeerFlow 2.0 Review: ByteDance's Open SuperAgent Harness

Run your AI side-project on zahid.host