DeerFlow 2.0 Review: ByteDance's Open SuperAgent Harness

ByteDance open-sourced DeerFlow 2.0, a long-horizon agent runtime that orchestrates sub-agents, sandboxes, persistent memory, and an extensible skill system. The project reached 74,960 GitHub stars and held the #1 trending spot on February 28, 2026, after its v2 launch. DeerFlow provides a pre-built harness with configurable models, CLI-backed providers, and a one-line setup wizard.

Originally published on andrew.ooo; visit the original for any updates, code snippets that aged out, or follow-up posts.

DeerFlow 2.0 is ByteDance's open-source "SuperAgent harness": a long-horizon agent runtime that orchestrates sub-agents, sandboxes, persistent memory, and an extensible skill system to run tasks that take minutes to hours. It hit 74,960 stars (3,329 of them new this week) and held the #1 GitHub Trending spot on February 28th, 2026, after the v2 launch.

- ... `/api/langgraph/` paths so existing LangGraph clients work unchanged

If you've been building agent workflows on top of raw LangGraph or AutoGen and hit the wall where "Who owns the sandbox? Who owns memory? Who owns the message bus?" becomes a six-month engineering project, DeerFlow has already made those decisions. Whether you agree with them is the question.

| Field | Value |
|---|---|
| Repo | ... |
| ... | `git clone … && make setup` (2-minute wizard) |
| ... | `http://localhost:2026` |

If you've spent any time trying to build a "real" agent system in 2026, you know the pattern: start with a single LLM call. Add tool calling. Add memory (which kind? Conversation buffer? Vector store? Knowledge graph?). Add a sandbox (host, Docker, or Firecracker?). Add sub-agents, and now you need a message bus, run state, and somebody to own cancellation. Six months later you have a worse version of [LangGraph](https://github.com/langchain-ai/langgraph) + [Open Interpreter](https://github.com/OpenInterpreter/open-interpreter) + [AutoGen](https://github.com/microsoft/autogen) glued together with duct tape.

DeerFlow 2.0's pitch is: stop building the harness. Use this one. We made the decisions.
- ... `skills/` (configurable via `DEER_FLOW_SKILLS_PATH`), and the agent picks them up automatically.

The maintainers' "one-line agent setup" is genuinely cute: they wrote a prompt designed to be pasted into your coding agent of choice:

    Help me clone DeerFlow if needed, then bootstrap it for local development by following https://raw.githubusercontent.com/bytedance/deer-flow/main/Install.md

For humans:

    # 1. Clone
    git clone https://github.com/bytedance/deer-flow.git
    cd deer-flow

    # 2. Run the setup wizard (≈2 min, interactive)
    make setup

    # 3. Verify
    make doctor

    # 4. Start (Docker recommended)
    make docker-init     # Pulls sandbox image, once
    make docker-start    # Starts services

    # 5. Open
    open http://localhost:2026

The `make setup` wizard walks you through:

- ... `.env` with keys and a minimal `config.yaml`

If you'd rather edit YAML directly, `make config` copies the full template.

This is where DeerFlow earns the "harness" label. The `config.yaml` model block handles four provider patterns cleanly: standard OpenAI/Anthropic, OpenAI-compatible gateways (OpenRouter), the OpenAI Responses API for GPT-5 reasoning, and local vLLM with reasoning support.

    models:
      - name: gpt-4o
        use: langchain_openai:ChatOpenAI
        model: gpt-4o
        api_key: $OPENAI_API_KEY

      - name: openrouter-gemini-2.5-flash
        use: langchain_openai:ChatOpenAI
        model: google/gemini-2.5-flash-preview
        api_key: $OPENROUTER_API_KEY
        base_url: https://openrouter.ai/api/v1

      - name: qwen3-32b-vllm
        use: deerflow.models.vllm_provider:VllmChatModel
        model: Qwen/Qwen3-32B
        base_url: http://localhost:8000/v1
        supports_thinking: true

The most novel piece is the CLI-backed providers. You can plug DeerFlow into your existing Codex CLI or Claude Code OAuth so it reuses your subscription quota instead of billing a separate API key:

      - name: claude-sonnet-4.6
        use: deerflow.models.claude_provider:ClaudeChatModel
        model: claude-sonnet-4-6
        supports_thinking: true

Claude Code reads `~/.claude/.credentials.json`; Codex CLI reads `~/.codex/auth.json`.
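Before enabling a CLI-backed provider, it can help to confirm that the credential files mentioned above actually exist. The following is a minimal sketch, not DeerFlow code: `CREDENTIAL_FILES` and `find_cli_credentials` are hypothetical names, and the only facts taken from the article are the two file paths. It checks for presence only and does not validate the token contents.

```python
from pathlib import Path
from typing import Dict, Optional

# Paths relative to the user's home directory, as quoted in the article.
# The dictionary keys are illustrative labels, not DeerFlow identifiers.
CREDENTIAL_FILES = {
    "claude-code": Path(".claude") / ".credentials.json",
    "codex-cli": Path(".codex") / "auth.json",
}


def find_cli_credentials(home: Optional[Path] = None) -> Dict[str, bool]:
    """Report which CLI credential files are present under `home`."""
    base = Path.home() if home is None else home
    return {
        name: (base / relative).is_file()
        for name, relative in CREDENTIAL_FILES.items()
    }


if __name__ == "__main__":
    for name, present in find_cli_credentials().items():
        status = "found" if present else "missing"
        print(f"{name}: {status}")
```

A missing file here usually means the corresponding CLI has not been logged in on this machine, so the matching model entry would have nothing to reuse.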
On macOS, Claude Code auth sometimes needs an explicit `eval "$(python3 scripts/export_claude_code_oauth.py --print-export)"`, a thoughtful detail that tells you the team actually ran this on a Mac.

When you `make docker-start`, you get:

- ... `GATEWAY_WORKERS=1`.
- ... `:2026`, hot-reload in dev mode
- ... `config.yaml` uses provisioner mode (`sandbox.use: deerflow.community.aio_sandbox:AioSandboxProvider`)

The single-worker constraint deserves attention. From the README:

> The Gateway holds run state (`RunManager`) and the stream bridge in process, so production defaults to a single Gateway worker (`GATEWAY_WORKERS=1`). Raising the worker count without a shared cross-worker stream bridge, which is not yet available, breaks run cancellation, SSE reconnects, request de-duplication, and IM channels.

Translation: scale vertically, not horizontally. Throw more CPU and RAM at one Gateway worker; don't try to run N replicas behind a load balancer. This is fine for a team of 5–50 developers. It's a hard ceiling for "give DeerFlow to 10,000 users."

The skill system is the part that feels most differentiated from competitors like AutoGen or CrewAI.

- ... `skills/` by default (override via `DEER_FLOW_SKILLS_PATH`)

Sub-agents follow a similar logic:

- ... kill a top-level run and all sub-agents die cleanly

The "hours-long task" promise is real if you give it the right model (Doubao-Seed-2.0-Code, DeepSeek v3.2, or Kimi 2.5 per the README) and a generous sandbox. With smaller models, it still works, but the failure modes get more interesting.

Three sandbox modes ship in-tree:

ByteDance is unusually honest about hardware needs:

| Deployment | Starting point | Recommended |
|---|---|---|
| Local eval (`make dev`) | 4 vCPU / 8 GB / 20 GB SSD | 8 vCPU / 16 GB |
| Docker dev (`make docker-start`) | 4 vCPU / 8 GB / 25 GB SSD | 8 vCPU / 16 GB |
| Long-running server (`make up`) | 8 vCPU / 16 GB / 40 GB SSD | 16 vCPU / 32 GB |

And from the README: "These numbers cover DeerFlow itself. If you also host a local LLM, size that service separately."
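The sizing table above can be turned into a quick pre-flight check. The sketch below encodes only the numbers quoted from the README; `SIZING` and `classify_host` are hypothetical names for illustration and are not part of DeerFlow. How you measure the host's vCPU count, RAM, and free disk is left to the caller, and a local LLM still has to be sized separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sizing:
    """Starting-point and recommended resources for one deployment target."""
    min_vcpu: int
    min_ram_gb: int
    min_disk_gb: int
    rec_vcpu: int
    rec_ram_gb: int


# Values copied from the hardware table; keys are the make targets it names.
SIZING = {
    "make dev": Sizing(4, 8, 20, 8, 16),
    "make docker-start": Sizing(4, 8, 25, 8, 16),
    "make up": Sizing(8, 16, 40, 16, 32),
}


def classify_host(target: str, vcpu: int, ram_gb: int, disk_gb: int) -> str:
    """Classify a host against the table for the given deployment target."""
    s = SIZING[target]
    if vcpu < s.min_vcpu or ram_gb < s.min_ram_gb or disk_gb < s.min_disk_gb:
        return "below starting point"
    if vcpu >= s.rec_vcpu and ram_gb >= s.rec_ram_gb:
        return "recommended"
    return "starting point"


if __name__ == "__main__":
    # Example: a 16 GB laptop with 8 cores and 100 GB of free disk.
    print(classify_host("make docker-start", vcpu=8, ram_gb=16, disk_gb=100))
```

The table gives no recommended disk size, so the sketch only checks disk against the starting-point figure.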
If you're running a 32B model locally and the harness on the same box, expect to need a beefy workstation. This isn't a Raspberry Pi project.

The README also has an explicit security section: "Improper Deployment May Introduce Security Risks." Which is a polite way of saying: this is a Docker-out-of-Docker shaped object with file-write tools and bash access. Don't expose it to the public internet without auth.

That last verdict is the most accurate. DeerFlow gives you the harness. It does not give you a finished product. You still bring the model, the skills, the integration code, and the patience to wait for hour-long runs.

After working through the README and tracking community feedback, here are the rough edges to budget for:

- ... `make docker-init` on the first try
- ... `UV_INDEX_URL` and `NPM_REGISTRY` manually
- ... `cmd.exe`/PowerShell shells aren't supported, and WSL is "not guaranteed" because some scripts rely on Git for Windows utilities like `cygpath`
- ... `GATEWAY_CORS_ORIGINS` set explicitly
- ... `backend/docs/MEMORY_SETTINGS_REVIEW.md` is great; the equivalent for skills is thinner

None of these are deal-breakers. They're a tax you pay to skip six months of harness engineering.
| | DeerFlow 2.0 | LangGraph (vanilla) | AutoGen | CrewAI | Claude Code |
|---|---|---|---|---|---|
| Long-horizon harness | ✅ Built-in | ❌ DIY | Partial | Partial | ✅ Different model |
| Docker sandbox | ✅ Default | ❌ DIY | ❌ DIY | ❌ DIY | ✅ Different model |
| Sub-agents | ✅ First-class | ✅ Graph nodes | ✅ AssistantAgents | ✅ Crews | ✅ Sub-agents |
| Persistent memory | ✅ File + Settings UI | Partial | Partial | Partial | ✅ Different model |
| Skill plug-ins | ✅ Drop-in | ❌ | ❌ | ❌ | ✅ AgentSkills |
| MCP server | ✅ | Via integration | Via integration | Via integration | ✅ Native |
| IM channels | ✅ Lark, Slack | ❌ | ❌ | ❌ | ❌ |
| Best for | Self-hosted long-horizon agents | Custom workflows you fully own | Multi-agent conversations | Role-based teams | Coding tasks on your machine |

Claude Code is in a different category: it's a coding agent for individual developers, not a self-hostable harness. DeerFlow is the self-hostable harness that talks to Claude Code via OAuth so you can hand it long-horizon goals from inside your own infra.

Good fit:

Bad fit:

Q: Is DeerFlow 2.0 compatible with DeerFlow 1.x configs?

No. v2 is a ground-up rewrite that shares zero code with v1. The original Deep Research framework is maintained on the `main-1.x` branch and still accepts contributions, but active development has moved to 2.0.

Q: Do I have to use Doubao / DeepSeek / Kimi?

No, but the README explicitly recommends them: "We strongly recommend using Doubao-Seed-2.0-Code, DeepSeek v3.2 and Kimi 2.5 to run DeerFlow." GPT-5, Claude Sonnet 4.6, Gemini 2.5, and local vLLM models all work via the YAML config. Expect to tune prompts a bit if you swap models; the harness was clearly developed against the recommended set.

Q: Can DeerFlow drive my existing Claude Code or Codex CLI?

Yes.
The CLI-backed providers (`deerflow.models.claude_provider:ClaudeChatModel` and `deerflow.models.openai_codex_provider:CodexChatModel`) read your local OAuth credentials (`~/.claude/.credentials.json`, `~/.codex/auth.json`) so DeerFlow can call them as ordinary chat models without a separate API key.

Q: Is the sandbox actually secure?

The Docker sandbox is reasonable for development. For production, use the Kubernetes provisioner mode and read the README's "Security Recommendations" section before exposing anything to the network. The local-execution mode is not sandboxed; treat it as "I trust every prompt this model will generate."

Q: Can I scale DeerFlow horizontally behind a load balancer?

Not today. The Gateway holds run state in process and there's no cross-worker stream bridge yet. Scale vertically (more CPU and RAM on one Gateway worker) or shard by team/project across multiple deployments.

Q: How does DeerFlow compare to LangGraph?

DeerFlow uses LangGraph's wire format and exposes a `/api/langgraph/`-compatible HTTP surface, so existing LangGraph clients work unchanged. The difference is everything around the graph: DeerFlow ships the sandbox, the run manager, the message gateway, the skill loader, the memory store, and the UI. With vanilla LangGraph, you build all of that yourself.

DeerFlow 2.0 is the most complete open-source SuperAgent harness available right now. It's not the easiest to deploy, and it's tied to BytePlus infrastructure in places where ByteDance had reasons to choose defaults you might not share. But it solves the actual hard problem (who owns the sandbox, the memory, the run state, the message bus) in a way that you'd otherwise spend six months reinventing.

If you've been writing custom LangGraph harnesses for the last year and the maintenance burden is biting, give DeerFlow a weekend. The `make setup` wizard is fast. Docker dev mode works on a 16 GB laptop. The architecture decisions, while opinionated, are mostly the right ones.
For larger deployments, watch the cross-worker stream bridge issue. Once that lands, DeerFlow becomes genuinely production-grade for multi-tenant agent platforms.

74K stars in four months is not a coincidence. ByteDance shipped something the open-source agent community actually needed, and they shipped it polished. That's rare.

- Try it: [github.com/bytedance/deer-flow](https://github.com/bytedance/deer-flow)
- Docs: [deerflow.tech](https://deerflow.tech)
- License: Apache 2.0