Agent Engineering Roadmap – a beginner-friendly guide to building AI agents

A new open-source repository, the Agent Engineering Roadmap, provides a structured, beginner-friendly guide for building production-ready AI agents, covering topics from single agents to multi-agent colonies and production safety. The roadmap treats agent development as an engineering discipline, emphasizing tools, memory, workflows, and evaluation over simple chatbot demos.

A hands-on roadmap for building production-ready AI Agents, MCP Servers, Memory Systems, Multi-Agent Workflows, and Agent Colonies.

[繁體中文 (Traditional Chinese)](/audi0417/agent-engineering-roadmap/blob/main/README_zh.md) · [Website](https://audi0417.github.io/agent-engineering-roadmap/) · [Course](/audi0417/agent-engineering-roadmap/blob/main/COURSE.md) · [Roadmap](/audi0417/agent-engineering-roadmap/blob/main/roadmap/level-0-ai-llm-fundamentals.md) · [Examples](/audi0417/agent-engineering-roadmap/blob/main/examples/01-single-agent/README.md) · [Showcases](/audi0417/agent-engineering-roadmap/blob/main/showcases/README.md) · [Benchmarks](/audi0417/agent-engineering-roadmap/blob/main/benchmarks/README.md) · [Labs](/audi0417/agent-engineering-roadmap/blob/main/labs/README.md) · [Teaching](/audi0417/agent-engineering-roadmap/blob/main/teaching/README_zh.md) · [Templates](/audi0417/agent-engineering-roadmap/blob/main/templates/README.md) · [Architecture](/audi0417/agent-engineering-roadmap/blob/main/architecture/colony-architecture.md) · [Healthcare](/audi0417/agent-engineering-roadmap/blob/main/healthcare/healthcare-agent-colony.md) · [Finance](/audi0417/agent-engineering-roadmap/blob/main/finance/finance-agent-colony.md)

```mermaid
flowchart LR
    User[User] --> Agent[AI Agent]
    Agent --> Tools[Tool Use]
    Tools --> MCP[MCP Layer]
    MCP --> Memory[Memory System]
    Memory --> Workflow[Agent Workflow]
    Workflow --> MultiAgent[Multi-Agent Team]
    MultiAgent --> Colony[Agent Colony]
    Colony --> Production[Production AI App]
```

Most AI tutorials stop at prompts, RAG, or simple tool calling. Real agentic products require more than that:

- agents that can use tools safely
- MCP servers that connect agents to real systems
- memory layers that persist useful context
- workflows that are observable and controllable
- multi-agent teams that can specialize and collaborate
- evaluation, security, and production guardrails

This repository is a practical learning path for builders who want to move from chatbot demos to real agent engineering.

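The first requirement in the list above, agents that can use tools safely, can be illustrated with a small dependency-free sketch. This is not code from the repository: `ToolRegistry`, `register`, `call`, and `get_order_status` are hypothetical names, and treating "safely" as an allowlist plus argument checking is an assumption made for illustration only.

```python
from typing import Any, Callable


class ToolRegistry:
    """Hypothetical allowlist: the agent may only call registered tools."""

    def __init__(self) -> None:
        # Maps tool name -> (function, set of permitted argument names).
        self._tools: dict = {}

    def register(self, name: str, func: Callable[..., Any], allowed_args: list) -> None:
        self._tools[name] = (func, set(allowed_args))

    def call(self, name: str, **kwargs: Any) -> Any:
        # Reject any tool the model asks for that was never registered.
        if name not in self._tools:
            raise PermissionError(f"tool not allowed: {name}")
        func, allowed = self._tools[name]
        # Reject arguments the tool did not declare.
        unexpected = set(kwargs) - allowed
        if unexpected:
            raise ValueError(f"unexpected arguments: {sorted(unexpected)}")
        return func(**kwargs)


def get_order_status(order_id: str) -> str:
    # Stand-in for a real system lookup; returns canned data.
    return f"order {order_id}: shipped"


registry = ToolRegistry()
registry.register("get_order_status", get_order_status, ["order_id"])
```

Calling `registry.call("get_order_status", order_id="A1")` runs the tool, while a request for an unregistered tool raises `PermissionError` before any side effect can happen. A real system would likely add schema validation and logging on top of this.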
This roadmap teaches agents like an engineering course, not a tool catalog. Each major topic follows the same pattern:

- Start with the problem: what breaks if you only use a chatbot?
- Build the intuition: what is the simplest mental model?
- Open the box: what components are actually involved?
- Run a minimal example: what can you inspect locally?
- Add production judgment: what needs evaluation, observability, approval, or safety gates?

In one sentence: an agent is not magic. It is context, tools, memory, workflow, evaluation, and human judgment arranged around a useful task.

| Level | Topic | Outcome |
|---|---|---|
| 0 | AI & LLM Fundamentals | Understand LLM apps, embeddings, RAG, and structured output |
| 1 | Single Agent | Build a task-focused agent with a clear role and output format |
| 2 | Tool Use | Connect agents to external tools and APIs |
| 3 | MCP | Build and use MCP clients, servers, tools, resources, and prompts |
| 4 | Agent Memory | Design short-term, episodic, semantic, user, and shared memory |
| 5 | Agent Workflow | Build reliable planning, execution, review, retry, and approval flows |
| 6 | Multi-Agent Systems | Coordinate specialized agents using supervisor, debate, and reflection patterns |
| 7 | Agent Colony | Build shared-memory colonies with domain agents and evaluation loops |
| 8 | Production & Safety | Deploy agents with observability, evaluation, security, and cost control |

| Section | Link |
|---|---|
| Curriculum | /audi0417/agent-engineering-roadmap/blob/main/curriculum/README.md |
| Visual Assets | /audi0417/agent-engineering-roadmap/blob/main/assets/README.md |
| Roadmap | /audi0417/agent-engineering-roadmap/blob/main/roadmap/level-0-ai-llm-fundamentals.md |
| Examples | /audi0417/agent-engineering-roadmap/blob/main/examples/01-single-agent/README.md |
| Benchmarks | /audi0417/agent-engineering-roadmap/blob/main/benchmarks/README.md |
| Showcases | /audi0417/agent-engineering-roadmap/blob/main/showcases/README.md |
| Domain Casebooks | /audi0417/agent-engineering-roadmap/blob/main/domain-casebooks/README.md |
| Labs | /audi0417/agent-engineering-roadmap/blob/main/labs/README.md |
| Teaching Layer | /audi0417/agent-engineering-roadmap/blob/main/teaching/README_zh.md |
| Lab Solution Guides | /audi0417/agent-engineering-roadmap/blob/main/lab-solutions/README_zh.md |
| Lesson Plans | /audi0417/agent-engineering-roadmap/blob/main/lesson-plans/README.md |
| Study Group Kit | /audi0417/agent-engineering-roadmap/blob/main/study-groups/README.md |
| Patterns | /audi0417/agent-engineering-roadmap/blob/main/patterns/README.md |
| Templates | /audi0417/agent-engineering-roadmap/blob/main/templates/README.md |
| Papers | /audi0417/agent-engineering-roadmap/blob/main/papers/README.md |
| Open Source Projects | /audi0417/agent-engineering-roadmap/blob/main/resources/open-source-agent-projects.md |
| Framework Selection Matrix | /audi0417/agent-engineering-roadmap/blob/main/resources/agent-framework-selection-matrix.md |
| Open Source Reading Guide | /audi0417/agent-engineering-roadmap/blob/main/resources/how-to-read-open-source-agent-repos.md |
| DeepEval And RAGAS | /audi0417/agent-engineering-roadmap/blob/main/resources/eval-frameworks-deepeval-ragas.md |
| Release Checklist | /audi0417/agent-engineering-roadmap/blob/main/release/RELEASE_CHECKLIST.md |
| Assessments | /audi0417/agent-engineering-roadmap/blob/main/assessments/quiz-bank.md |
| Capstone | /audi0417/agent-engineering-roadmap/blob/main/projects/capstone-agent-colony.md |
| Portfolio Projects | /audi0417/agent-engineering-roadmap/blob/main/projects/portfolio-projects.md |
| Capstone Starter | /audi0417/agent-engineering-roadmap/blob/main/capstone-starter/README.md |
| Glossary | /audi0417/agent-engineering-roadmap/blob/main/glossary/agent-engineering-glossary.md |

```text
AI Fundamentals
      ↓
Single Agent
      ↓
Tool Use
      ↓
MCP Integration
      ↓
Agent Memory
      ↓
Agent Workflow
      ↓
Multi-Agent Systems
      ↓
Agent Colony
      ↓
Production, Evaluation & Safety
```

Run a showcase without API keys:

```bash
python showcases/enterprise-support-agent/main.py
python showcases/finance-research-agent/main.py
python showcases/healthcare-agent-colony/main.py
```

Then run the evaluation harness:

```bash
python examples/07-evaluation-harness/main.py
python examples/08-mini-rag/main.py
python benchmarks/benchmark_runner.py
python scripts/verify_examples.py
```

| Artifact | Link |
|---|---|
| Risk Assessment Template | /audi0417/agent-engineering-roadmap/blob/main/templates/risk-assessment-template.md |
| Deployment Review Template | /audi0417/agent-engineering-roadmap/blob/main/templates/deployment-review-template.md |
| Release Checklist | /audi0417/agent-engineering-roadmap/blob/main/release/RELEASE_CHECKLIST.md |
| v1.0 Readiness | /audi0417/agent-engineering-roadmap/blob/main/release/V1_READINESS.md |

| Demo | Link |
|---|---|
| Finance Research Agent | /audi0417/agent-engineering-roadmap/blob/main/showcases/finance-research-agent/README.md |
| Healthcare Agent Colony | /audi0417/agent-engineering-roadmap/blob/main/showcases/healthcare-agent-colony/README.md |

| Example | Link |
|---|---|
| 02 Tool-Using Agent | /audi0417/agent-engineering-roadmap/blob/main/examples/02-tool-using-agent/README.md |
| 03 MCP-style Agent | /audi0417/agent-engineering-roadmap/blob/main/examples/03-mcp-agent/README.md |
| 04 Memory Agent | /audi0417/agent-engineering-roadmap/blob/main/examples/04-memory-agent/README.md |
| 05 Multi-Agent Workflow | /audi0417/agent-engineering-roadmap/blob/main/examples/05-multi-agent-workflow/README.md |
| 06 Agent Colony | /audi0417/agent-engineering-roadmap/blob/main/examples/06-agent-colony/README.md |
| 07 Evaluation Harness | /audi0417/agent-engineering-roadmap/blob/main/examples/07-evaluation-harness/README.md |
| 08 Mini RAG | /audi0417/agent-engineering-roadmap/blob/main/examples/08-mini-rag/README.md |
| 09 Graph Approval Agent | /audi0417/agent-engineering-roadmap/blob/main/examples/09-graph-approval-agent/README.md |
| 10 Observable Agent | /audi0417/agent-engineering-roadmap/blob/main/examples/10-observable-agent/README.md |
| 11 Prompt Injection Defense | /audi0417/agent-engineering-roadmap/blob/main/examples/11-prompt-injection-defense/README.md |
| 12 Cost-Aware Agent | /audi0417/agent-engineering-roadmap/blob/main/examples/12-cost-aware-agent/README.md |
| 13 Durable Workflow Agent | /audi0417/agent-engineering-roadmap/blob/main/examples/13-durable-workflow-agent/README.md |
| 14 Modern MCP Gateway | /audi0417/agent-engineering-roadmap/blob/main/examples/14-modern-mcp-gateway/README.md |
| 15 Memory Governance Agent | /audi0417/agent-engineering-roadmap/blob/main/examples/15-memory-governance-agent/README.md |
| 16 Agent Permission System | /audi0417/agent-engineering-roadmap/blob/main/examples/16-agent-permission-system/README.md |
| 17 Advanced Eval Harness | /audi0417/agent-engineering-roadmap/blob/main/examples/17-advanced-eval-harness/README.md |
| Capstone Starter | /audi0417/agent-engineering-roadmap/blob/main/capstone-starter/README.md |

Run every dependency-free example with:

```bash
python scripts/verify_examples.py
```

This README uses lightweight visual widgets commonly seen in popular GitHub projects:

- Local cover image for the top hero banner
- shields.io for stars, forks, language, status, and topic badges
- Mermaid for architecture diagrams

Agent Engineering is not only about prompts. A production agent needs a plugin ecosystem around it.

| Category | Purpose | Example Plugins / Tools |
|---|---|---|
| MCP Servers | Standardized access to tools and data | filesystem, database, browser, GitHub, Slack, Google Drive |
| Memory | Persistent context and retrieval | Qdrant, LanceDB, Chroma, PostgreSQL, Redis |
| Orchestration | Workflow and multi-agent control | LangGraph, CrewAI, AutoGen, OpenAI Agents SDK |
| RAG | Knowledge retrieval and grounding | LlamaIndex, LangChain, Haystack |
| Observability | Tracing, debugging, monitoring | Langfuse, OpenTelemetry, Helicone, Phoenix |
| Evaluation | Quality and safety testing | DeepEval, RAGAS, promptfoo, custom eval suites |
| Guardrails | Safety and structured validation | Guardrails AI, Pydantic, JSON Schema, policy checkers |
| UI / App Layer | User-facing agent applications | Streamlit, Gradio, Next.js, FastAPI |
| Domain Tools | Industry-specific integrations | healthcare records, finance data, CRM, ERP, ticketing systems |

```mermaid
graph TD
    User[User] --> Supervisor[Supervisor Agent]
    Supervisor --> Planner[Planner]
    Planner --> MemoryAgent[Memory Agent]
    Planner --> ResearchAgent[Research Agent]
    Planner --> ToolAgent[Tool Agent]
    Planner --> DomainAgent[Domain Agent]
    MemoryAgent --> SharedMemory[Shared Memory]
    ToolAgent --> MCP[MCP Servers]
    DomainAgent --> MCP
    ResearchAgent --> MCP
    MCP --> PluginLayer[Plugin Ecosystem]
    PluginLayer --> Databases[Databases]
    PluginLayer --> Documents[Documents]
    PluginLayer --> APIs[External APIs]
    PluginLayer --> SaaS[SaaS Apps]
    Supervisor --> Evaluator[Evaluator Agent]
    Evaluator --> Final[Final Response]
    Final --> User
    Evaluator --> SharedMemory
```

```text
agent-engineering-roadmap/
├── README.md
├── README_zh.md
├── COURSE.md
├── assets/          # Visual diagrams and teaching images
├── roadmap/         # Level 0-8 learning path
├── curriculum/      # Full course chapters
├── examples/        # Hands-on examples
├── benchmarks/      # Lightweight behavior checks
├── security/        # Prompt injection and agent security labs
├── study-groups/    # Cohort and workshop facilitation kit
├── showcases/       # Shareable demos with sample outputs
├── labs/            # Guided exercises
├── lesson-plans/    # Instructor-ready lesson plans
├── patterns/        # Architecture pattern catalog
├── architecture/    # System design patterns
├── templates/       # Reusable agent and MCP templates
├── assessments/     # Quiz bank and rubrics
├── projects/        # Capstone and portfolio projects
├── glossary/        # Agent engineering terms
├── healthcare/      # Healthcare agent engineering track
├── finance/         # Finance and quantitative research track
├── resources/       # Curated learning resources
├── docs/            # GitHub Pages site
└── launch-kit/      # Launch copy, topics, and checklist
```

Build agent systems for care management, nutrition tracking, personal health memory, and healthcare workflow automation.

Example colony:

```text
Care Manager Agent
├── Nutrition Agent
├── Vital Sign Agent
├── Psychology Agent
├── Medication Agent
├── Memory Agent
└── Safety Evaluator Agent
```

Build research agents, factor-analysis agents, portfolio agents, risk agents, and trading research workflows.

Example colony:

```text
Research Agent
├── Market Data Agent
├── Factor Analysis Agent
├── Portfolio Agent
├── Risk Agent
└── Report Agent
```

Build customer support agents, internal knowledge agents, document agents, workflow automation agents, and evaluation pipelines.

Design principles:

- Agents should be useful before they are autonomous.
- Memory should be intentional, auditable, and safe.
- MCP should be treated as an integration layer, not just a plugin mechanism.
- Multi-agent systems should reduce complexity for users, not create complexity for developers.
- Production agents need evaluation, observability, cost control, and human approval gates.

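The last principle above mentions human approval gates. A minimal sketch of one follows, assuming a hypothetical `Action` record with a `risk` label and an `approve` callback that stands in for a real reviewer interface. None of these names come from the repository, and the two-level risk scheme is an assumption made only to keep the example small.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    name: str
    risk: str  # "low" or "high"; these labels are an assumption


def execute(action: Action) -> str:
    # Stand-in for the real side effect (API call, database write, ...).
    return f"done: {action.name}"


def run_with_approval(
    action: Action,
    execute_fn: Callable[[Action], str],
    approve: Callable[[Action], bool],
) -> str:
    # High-risk actions run only if the human reviewer approves them.
    if action.risk == "high" and not approve(action):
        return f"blocked: {action.name}"
    return execute_fn(action)


def deny_all(action: Action) -> bool:
    # Simulated reviewer who rejects every request.
    return False


low_result = run_with_approval(Action("summarize ticket", "low"), execute, deny_all)
high_result = run_with_approval(Action("issue refund", "high"), execute, deny_all)
```

With the denying reviewer, the low-risk action still runs because it never reaches the gate, while the high-risk action is blocked. In practice the `approve` callback might post to a review queue and wait for a decision, but that part is outside this sketch.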
Project checklist:

- Initialize bilingual repository structure
- Add Level 0-8 roadmap skeleton
- Add architecture documents
- Add healthcare and finance tracks
- Add README badges and hero banner
- Expand each roadmap level into handbook chapters
- Add minimal runnable examples
- Add MCP server templates
- Add memory system examples
- Add agent colony demo
- Add evaluation and safety templates
- Add full course syllabus
- Add observable agent and prompt injection defense examples
- Add benchmark runner and study group kit
- Add cost, durable runtime, and modern MCP gateway modules
- Add memory governance, identity permission, and incident response modules
- Add advanced eval, product UX, and enterprise operating model modules
- Add guided labs
- Add instructor-ready lesson plans
- Add pattern catalog
- Add quiz bank, rubrics, glossary, and capstone
- Add full healthcare agent colony application
- Add full finance research agent application

Who this is for:

- AI engineers
- LLM application developers
- Startup builders
- Researchers building agent systems
- Product teams moving from chatbot demos to real workflows
- Developers interested in MCP, memory, and multi-agent systems

This project is licensed under the [MIT License](/audi0417/agent-engineering-roadmap/blob/main/LICENSE).