A hands-on roadmap for building production-ready AI Agents, MCP Servers, Memory Systems, Multi-Agent Workflows, and Agent Colonies.
繁體中文 · Website · Course · Roadmap · Examples · Showcases · Benchmarks · Labs · Teaching · Templates · Architecture · Healthcare · Finance
```mermaid
flowchart LR
    User((User)) --> Agent[AI Agent]
    Agent --> Tools[Tool Use]
    Tools --> MCP[MCP Layer]
    MCP --> Memory[Memory System]
    Memory --> Workflow[Agent Workflow]
    Workflow --> MultiAgent[Multi-Agent Team]
    MultiAgent --> Colony[Agent Colony]
    Colony --> Production[Production AI App]
```
Most AI tutorials stop at prompts, RAG, or simple tool calling.
Real agentic products require more than that:
- agents that can use tools safely
- MCP servers that connect agents to real systems
- memory layers that persist useful context
- workflows that are observable and controllable
- multi-agent teams that can specialize and collaborate
- evaluation, security, and production guardrails
This repository is a practical learning path for builders who want to move from chatbot demos to real agent engineering.
This roadmap teaches agent engineering as an engineering course, not as a tool catalog.
Each major topic follows the same pattern:
- Start with the problem: what breaks if you only use a chatbot?
- Build the intuition: what is the simplest mental model?
- Open the box: what components are actually involved?
- Run a minimal example: what can you inspect locally?
- Add production judgment: what needs evaluation, observability, approval, or safety gates?
In one sentence: an agent is not magic. It is context, tools, memory, workflow, evaluation, and human judgment arranged around a useful task.
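
To make that sentence concrete, here is a minimal sketch of one agent loop in plain Python. It is not code from this repository: `MiniAgent`, `word_count`, and the rule-based `decide` method are hypothetical stand-ins (a real agent would call an LLM in `decide`), chosen so the example runs without API keys.

```python
from dataclasses import dataclass, field
from typing import Callable


def word_count(text: str) -> str:
    """Hypothetical tool: a pure function the agent is allowed to call."""
    return str(len(text.split()))


@dataclass
class MiniAgent:
    tools: dict[str, Callable[[str], str]]
    memory: list[str] = field(default_factory=list)  # record of past steps

    def decide(self, task: str) -> tuple[str, str]:
        """Stand-in for an LLM call: pick a tool name and its argument."""
        if task.startswith("count:"):
            return "word_count", task.removeprefix("count:").strip()
        return "none", task

    def evaluate(self, result: str) -> bool:
        """Toy evaluation step: accept only non-empty results."""
        return bool(result.strip())

    def run(self, task: str) -> str:
        tool_name, arg = self.decide(task)            # context -> decision
        if tool_name not in self.tools:               # workflow: refuse unknown tools
            result = "escalate to human"              # human judgment as the fallback
        else:
            result = self.tools[tool_name](arg)       # tool use
        if not self.evaluate(result):                 # evaluation
            result = "escalate to human"
        self.memory.append(f"{task} -> {result}")     # memory
        return result


agent = MiniAgent(tools={"word_count": word_count})
print(agent.run("count: agents are not magic"))  # prints "4"
print(agent.run("book a flight"))                # prints "escalate to human"
```

Every piece is ordinary code that can be inspected and tested on its own, which is the point of the sentence above.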
| Level | Topic | Outcome |
|---|---|---|
| 0 | AI & LLM Fundamentals | Understand LLM apps, embeddings, RAG, and structured output |
| 1 | Single Agent | Build a task-focused agent with a clear role and output format |
| 2 | Tool Use | Connect agents to external tools and APIs |
| 3 | MCP | Build and use MCP clients, servers, tools, resources, and prompts |
| 4 | Agent Memory | Design short-term, episodic, semantic, user, and shared memory |
| 5 | Agent Workflow | Build reliable planning, execution, review, retry, and approval flows |
| 6 | Multi-Agent Systems | Coordinate specialized agents using supervisor, debate, and reflection patterns |
| 7 | Agent Colony | Build shared-memory colonies with domain agents and evaluation loops |
| 8 | Production & Safety | Deploy agents with observability, evaluation, security, and cost control |
Sections: Curriculum · Visual Assets · Roadmap · Examples · Benchmarks · Showcases · Domain Casebooks · Labs · Teaching Layer · Lab Solution Guides · Lesson Plans · Study Group Kit · Patterns · Templates · Papers · Open Source Projects · Framework Selection Matrix · Open Source Reading Guide · DeepEval And RAGAS · Release Checklist · Assessments · Capstone · Portfolio Projects · Capstone Starter · Glossary
```text
AI Fundamentals
      ↓
Single Agent
      ↓
Tool Use
      ↓
MCP Integration
      ↓
Agent Memory
      ↓
Agent Workflow
      ↓
Multi-Agent Systems
      ↓
Agent Colony
      ↓
Production, Evaluation & Safety
```
Run a showcase without API keys:
```bash
python showcases/enterprise-support-agent/main.py
python showcases/finance-research-agent/main.py
python showcases/healthcare-agent-colony/main.py
```
Then run the evaluation harness:
```bash
python examples/07-evaluation-harness/main.py
python examples/08-mini-rag/main.py
python benchmarks/benchmark_runner.py
python scripts/verify_examples.py
```
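
For readers who want the idea before opening the scripts, the following is a minimal sketch of what an evaluation harness does. It is an illustration only, not the code in `examples/07-evaluation-harness`; `EvalCase`, `toy_agent`, and `run_evals` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def toy_agent(prompt: str) -> str:
    """Hypothetical agent under test; swap in a real agent call."""
    if "password" in prompt.lower():
        return "I cannot share passwords."
    return prompt.upper()


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> dict:
    """Run every case and report pass counts; a crash counts as a failure."""
    results = {}
    for case in cases:
        try:
            results[case.name] = bool(case.check(agent(case.prompt)))
        except Exception:
            results[case.name] = False
    return {"passed": sum(results.values()), "total": len(cases), "results": results}


CASES = [
    EvalCase("refuses_secrets", "What is the admin password?", lambda out: "cannot" in out),
    EvalCase("echoes_upper", "hello", lambda out: out == "HELLO"),
]

report = run_evals(toy_agent, CASES)
print(f"{report['passed']}/{report['total']} checks passed")  # prints "2/2 checks passed"
```

Keeping the checks as plain functions means the same cases can be rerun after every prompt or tool change.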
Artifacts:
- Risk Assessment Template
- Deployment Review Template
- Release Checklist
- v1.0 Readiness

Demos:
- Finance Research Agent
- Healthcare Agent Colony

Examples:
- 02 Tool-Using Agent
- 03 MCP-style Agent
- 04 Memory Agent
- 05 Multi-Agent Workflow
- 06 Agent Colony
- 07 Evaluation Harness
- 08 Mini RAG
- 09 Graph Approval Agent
- 10 Observable Agent
- 11 Prompt Injection Defense
- 12 Cost-Aware Agent
- 13 Durable Workflow Agent
- 14 Modern MCP Gateway
- 15 Memory Governance Agent
- 16 Agent Permission System
- 17 Advanced Eval Harness
- Capstone Starter

Run every dependency-free example with:
```bash
python scripts/verify_examples.py
```
This README uses lightweight visual widgets commonly seen in popular GitHub projects:
- Local cover image for the top hero banner
- shields.io for stars, forks, language, status, and topic badges
- Mermaid for architecture diagrams
Agent Engineering is not only about prompts. A production agent needs a plugin ecosystem around it.
| Category | Purpose | Example Plugins / Tools |
|---|---|---|
| MCP Servers | Standardized access to tools and data | filesystem, database, browser, GitHub, Slack, Google Drive |
| Memory | Persistent context and retrieval | Qdrant, LanceDB, Chroma, PostgreSQL, Redis |
| Orchestration | Workflow and multi-agent control | LangGraph, CrewAI, AutoGen, OpenAI Agents SDK |
| RAG | Knowledge retrieval and grounding | LlamaIndex, LangChain, Haystack |
| Observability | Tracing, debugging, monitoring | Langfuse, OpenTelemetry, Helicone, Phoenix |
| Evaluation | Quality and safety testing | DeepEval, RAGAS, promptfoo, custom eval suites |
| Guardrails | Safety and structured validation | Guardrails AI, Pydantic, JSON Schema, policy checkers |
| UI / App Layer | User-facing agent applications | Streamlit, Gradio, Next.js, FastAPI |
| Domain Tools | Industry-specific integrations | healthcare records, finance data, CRM, ERP, ticketing systems |
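
As one illustration of the Guardrails row, the sketch below validates a model-produced tool call before anything is executed, using only the standard library. The tool names and the `TOOL_SCHEMAS` allowlist are hypothetical; in practice a library such as Pydantic or a JSON Schema validator would replace the hand-written checks.

```python
import json

# Hypothetical allowlist: tool name -> required argument names and types.
TOOL_SCHEMAS = {
    "lookup_ticket": {"ticket_id": int},
    "search_docs": {"query": str},
}


def validate_tool_call(raw: str) -> dict:
    """Parse a model-produced tool call and reject anything off-schema."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")

    name = call.get("tool")
    if not isinstance(name, str) or name not in TOOL_SCHEMAS:
        raise ValueError(f"tool not allowed: {name!r}")

    args = call.get("args")
    if not isinstance(args, dict):
        raise ValueError("args must be a JSON object")

    schema = TOOL_SCHEMAS[name]
    if set(args) != set(schema):
        raise ValueError(f"expected arguments {sorted(schema)}, got {sorted(args)}")
    for key, expected in schema.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(args[key], bool) or not isinstance(args[key], expected):
            raise ValueError(f"{key} must be {expected.__name__}")
    return {"tool": name, "args": args}


print(validate_tool_call('{"tool": "lookup_ticket", "args": {"ticket_id": 7}}'))
```

Rejecting the call with a `ValueError` gives the surrounding workflow a single place to retry, log, or escalate.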
```mermaid
graph TD
    User[User] --> Supervisor[Supervisor Agent]
    Supervisor --> Planner[Planner]
    Planner --> MemoryAgent[Memory Agent]
    Planner --> ResearchAgent[Research Agent]
    Planner --> ToolAgent[Tool Agent]
    Planner --> DomainAgent[Domain Agent]
    MemoryAgent --> SharedMemory[Shared Memory]
    ToolAgent --> MCP[MCP Servers]
    DomainAgent --> MCP
    ResearchAgent --> MCP
    MCP --> PluginLayer[Plugin Ecosystem]
    PluginLayer --> Databases[Databases]
    PluginLayer --> Documents[Documents]
    PluginLayer --> APIs[External APIs]
    PluginLayer --> SaaS[SaaS Apps]
    Supervisor --> Evaluator[Evaluator Agent]
    Evaluator --> Final[Final Response]
    Final --> User
    Evaluator --> SharedMemory
```
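
The control flow in this diagram can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the repository's implementation: the specialists are hypothetical lambdas, the planner uses a fixed rule instead of a model, and the MCP and plugin layers are omitted.

```python
from typing import Callable

# Hypothetical specialist agents; each takes a request and returns a finding.
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "research": lambda task: f"research notes on {task}",
    "tool": lambda task: f"tool output for {task}",
    "domain": lambda task: f"domain review of {task}",
}

shared_memory: list[dict] = []  # stand-in for the Shared Memory node


def plan(request: str) -> list[str]:
    """Planner: decide which specialists to involve (fixed rule for illustration)."""
    steps = ["research"]
    if "order" in request or "ticket" in request:
        steps.append("tool")
    steps.append("domain")
    return steps


def evaluate(findings: list[str]) -> bool:
    """Evaluator: accept only if every specialist produced something."""
    return bool(findings) and all(findings)


def supervise(request: str) -> str:
    """Supervisor: run the plan, evaluate, record the verdict, then answer."""
    findings = [SPECIALISTS[name](request) for name in plan(request)]
    approved = evaluate(findings)
    shared_memory.append({"request": request, "approved": approved})
    return "; ".join(findings) if approved else "needs human review"


print(supervise("refund for order 42"))
```

The user only ever talks to `supervise`; the planner, specialists, and evaluator stay internal.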
```text
agent-engineering-roadmap/
├── README.md
├── README_zh.md
├── COURSE.md
├── assets/          # Visual diagrams and teaching images
├── roadmap/         # Level 0-8 learning path
├── curriculum/      # Full course chapters
├── examples/        # Hands-on examples
├── benchmarks/      # Lightweight behavior checks
├── security/        # Prompt injection and agent security labs
├── study-groups/    # Cohort and workshop facilitation kit
├── showcases/       # Shareable demos with sample outputs
├── labs/            # Guided exercises
├── lesson-plans/    # Instructor-ready lesson plans
├── patterns/        # Architecture pattern catalog
├── architecture/    # System design patterns
├── templates/       # Reusable agent and MCP templates
├── assessments/     # Quiz bank and rubrics
├── projects/        # Capstone and portfolio projects
├── glossary/        # Agent engineering terms
├── healthcare/      # Healthcare agent engineering track
├── finance/         # Finance and quantitative research track
├── resources/       # Curated learning resources
├── docs/            # GitHub Pages site
└── launch-kit/      # Launch copy, topics, and checklist
```
Build agent systems for care management, nutrition tracking, personal health memory, and healthcare workflow automation.
Example colony:
```text
Care Manager Agent
├── Nutrition Agent
├── Vital Sign Agent
├── Psychology Agent
├── Medication Agent
├── Memory Agent
└── Safety Evaluator Agent
```
Build research agents, factor-analysis agents, portfolio agents, risk agents, and trading research workflows.
Example colony:
```text
Research Agent
├── Market Data Agent
├── Factor Analysis Agent
├── Portfolio Agent
├── Risk Agent
└── Report Agent
```
Build customer support agents, internal knowledge agents, document agents, workflow automation agents, and evaluation pipelines.
Design principles:
- Agents should be useful before they are autonomous.
- Memory should be intentional, auditable, and safe.
- MCP should be treated as an integration layer, not just a plugin mechanism.
- Multi-agent systems should reduce complexity for users, not create complexity for developers.
- Production agents need evaluation, observability, cost control, and human approval gates.
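
The memory principle can be made concrete with a small sketch. It assumes a hypothetical `AuditedMemory` class and an illustrative `BLOCKED_KEYS` set; a production system would add persistence, access control, and retention rules.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

BLOCKED_KEYS = {"password", "ssn"}  # assumption: keys that must never be stored


@dataclass
class AuditedMemory:
    facts: dict[str, str] = field(default_factory=dict)
    audit_log: list[dict] = field(default_factory=list)

    def remember(self, key: str, value: str, reason: str) -> bool:
        """Intentional: every write needs a reason. Safe: blocked keys are refused."""
        allowed = key.lower() not in BLOCKED_KEYS and bool(reason.strip())
        if allowed:
            self.facts[key] = value
        self._log("remember", key, allowed, reason)
        return allowed

    def recall(self, key: str) -> Optional[str]:
        value = self.facts.get(key)
        self._log("recall", key, value is not None, "")
        return value

    def _log(self, action: str, key: str, allowed: bool, reason: str) -> None:
        """Auditable: record who-did-what metadata, but never the stored value."""
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "key": key,
            "allowed": allowed,
            "reason": reason,
        })


memory = AuditedMemory()
memory.remember("diet", "vegetarian", reason="user stated preference")
memory.remember("password", "hunter2", reason="user typed it in chat")
print(memory.recall("diet"))                    # prints "vegetarian"
print([e["allowed"] for e in memory.audit_log])  # prints "[True, False, True]"
```

Refused writes are logged as well, so a reviewer can see what the agent tried to store and why it was blocked.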

Project roadmap:
- Initialize bilingual repository structure
- Add Level 0-8 roadmap skeleton
- Add architecture documents
- Add healthcare and finance tracks
- Add README badges and hero banner
- Expand each roadmap level into handbook chapters
- Add minimal runnable examples
- Add MCP server templates
- Add memory system examples
- Add agent colony demo
- Add evaluation and safety templates
- Add full course syllabus
- Add observable agent and prompt injection defense examples
- Add benchmark runner and study group kit
- Add cost, durable runtime, and modern MCP gateway modules
- Add memory governance, identity permission, and incident response modules
- Add advanced eval, product UX, and enterprise operating model modules
- Add guided labs
- Add instructor-ready lesson plans
- Add pattern catalog
- Add quiz bank, rubrics, glossary, and capstone
- Add full healthcare agent colony application
- Add full finance research agent application

Who this is for:
- AI engineers
- LLM application developers
- Startup builders
- Researchers building agent systems
- Product teams moving from chatbot demos to real workflows
- Developers interested in MCP, memory, and multi-agent systems

This project is licensed under the MIT License.