{"slug": "agentic-engineering-handbook", "title": "Agentic-Engineering-Handbook", "summary": "OpenAI and Anthropic released a series of new tools and standards for building AI agents between late 2024 and 2025, including OpenAI's Responses API, Agents SDK, AgentKit, and Codex, as well as Anthropic's Model Context Protocol (MCP) for connecting AI assistants to external data and tools. These releases mark a significant push toward production-ready agent frameworks, with OpenAI integrating MCP into its runtime and Anthropic establishing an open standard for AI-tool interoperability.", "body_md": "| Priority | Resource | Source | Topics | Notes | Date |\n| --- | --- | --- | --- | --- | --- |\n| P0 | [OpenAI for Developers in 2025](https://developers.openai.com/blog/openai-for-developers-2025) | OpenAI | Agents; MCP; Platform | Annual overview: systematic walkthrough of Responses API, Agents SDK, AgentKit, Codex, MCP, Apps SDK, and AGENTS.md. | 2025-12-30 |\n| P0 | [New tools for building agents](https://openai.com/index/new-tools-for-building-agents/) | OpenAI | Agents; Responses API; Tools | Key starting point for OpenAI's agent platform: Responses API, built-in web/file/computer tools, Agents SDK, tracing/observability. | 2025-03-11 |\n| P0 | [Introducing AgentKit](https://openai.com/index/introducing-agentkit/) | OpenAI | Agents; Evals; AgentKit | AgentKit, expanded evals, agent RFT: the official agent toolchain from prototype to production. | 2025-10-06 |\n| P0 | [Agents SDK overview](https://developers.openai.com/api/docs/guides/agents) | OpenAI | Agents; SDK | Official SDK entry point: concepts and boundaries of agent, tool, handoff, guardrail, and tracing. | Current docs |\n| P0 | [Orchestrating Agents: Routines and Handoffs](https://developers.openai.com/cookbook/examples/orchestrating_agents) | OpenAI | Agents; Handoffs; Orchestration | Classic introduction: how routines, handoffs, and tool calling combine into controllable multi-flow agents. 
| 2024-10-10 |\n| P0 | [Introducing the Model Context Protocol](https://www.anthropic.com/news/model-context-protocol) | Anthropic | MCP; Standards | The origin article for MCP: an open standard connecting AI assistants to data, tools, and systems. | 2024-11-25 |\n| P0 | [Building effective agents](https://www.anthropic.com/engineering/building-effective-agents) | Anthropic | Agents; Patterns; Frameworks | Essential agent primer: workflow vs agent, prompt/tool/retrieval, orchestrator-worker, evaluator-optimizer patterns. | 2024-12-19 |\n| P0 | [New tools and features in the Responses API](https://openai.com/index/new-tools-and-features-in-the-responses-api/) | OpenAI | MCP; Responses API; Tools | Responses API extended to remote MCP servers, image/code/file tools; see how OpenAI integrates MCP into its runtime. | 2025-05-21 |\n| P0 | [MCP and Connectors](https://developers.openai.com/api/docs/guides/tools-connectors-mcp) | OpenAI | MCP; Connectors; Responses API | Official guide to connecting remote MCP servers and connectors; includes approvals and security considerations. | Current docs |\n| P0 | [Building MCP servers for ChatGPT Apps and API integrations](https://developers.openai.com/api/docs/mcp) | OpenAI | MCP; ChatGPT Apps; API | Official guide to writing MCP servers: supply tools/knowledge to ChatGPT Apps, deep research, and API integrations. | Current docs |\n| P0 | [Building a Deep Research MCP Server](https://developers.openai.com/cookbook/examples/deep_research_api/how_to_build_a_deep_research_mcp_server/readme) | OpenAI | MCP; Deep research | Minimal implementation of a search/fetch MCP server for Deep Research. | 2025-06-25 |\n| P0 | [Model Context Protocol - Codex](https://developers.openai.com/codex/mcp) | OpenAI | MCP; Codex | How Codex CLI/IDE connects to MCP servers, adding Figma, browser, docs, and internal tool context to agents. 
| Current docs |\n| P0 | [Introducing Codex](https://openai.com/index/introducing-codex/) | OpenAI | Agents; Coding; Sandbox | Cloud-based software engineering agent: parallel tasks, repo sandbox, running tests/linters/type checkers, producing auditable evidence. | 2025-05-16 |\n| P0 | [Unrolling the Codex agent loop](https://openai.com/index/unrolling-the-codex-agent-loop/) | OpenAI | Harness; Agent loop; Codex | How Codex CLI chains prompt, tool schema, MCP tools, Responses API, and context management into an agent loop. | 2026-01-23 |\n| P0 | [Unlocking the Codex harness: how we built the App Server](https://openai.com/index/unlocking-the-codex-harness/) | OpenAI | Harness; Codex App Server; JSON-RPC | Core harness article: Codex core, App Server, JSON-RPC, streaming progress, approval, diff, and thread management. | 2026-02-04 |\n| P0 | [From model to agent: Equipping the Responses API with a computer environment](https://openai.com/index/equip-responses-api-computer-environment/) | OpenAI | Harness; Responses API; Sandbox | Responses API + shell tool + hosted containers form the agent runtime; essential for understanding the model-to-agent execution environment. | 2026-03-10 |\n| P0 | [Harness engineering: leveraging Codex in an agent-first world](https://openai.com/index/harness-engineering/) | OpenAI | Harness; Agent-first engineering | Design product code, tests, CI, docs, and observability to be agent-readable/executable; learn agent-first repo organization. | 2026-02-11 |\n| P0 | [The next evolution of the Agents SDK](https://openai.com/index/the-next-evolution-of-the-agents-sdk/) | OpenAI | Harness; Agents SDK; MCP; Skills | Agents SDK harness becomes more complete: memory, sandbox orchestration, Codex-like filesystem tools, MCP, skills, AGENTS.md. 
| 2026-04-15 |\n| P0 | [Building Consistent Workflows with Codex CLI & Agents SDK](https://developers.openai.com/cookbook/examples/codex/codex_mcp_agents_sdk/building_consistent_workflows_codex_cli_agents_sdk) | OpenAI | MCP; Codex; Agents SDK | Codex CLI as an MCP server integrated with Agents SDK; real multi-agent dev workflow. | 2025-10-01 |\n| P0 | [Building Reliable Agents with Memory and Compaction](https://developers.openai.com/cookbook/examples/agents_sdk/building_reliable_agents_memory_compaction) | OpenAI | Memory; Compaction; Reliability | Memory and compaction design for long-context/multi-turn agents. | 2026-05-01 |\n| P0 | [Build an Agent Improvement Loop with Traces, Evals, and Codex](https://developers.openai.com/cookbook/examples/agents_sdk/agent_improvement_loop) | OpenAI | Evals; Traces; Self-improvement | Connect traces, evals, and Codex fixes into an agent improvement loop. | 2026-05-12 |\n| P0 | [Eval Driven System Design - From Prototype to Production](https://developers.openai.com/cookbook/topic/evals) | OpenAI | Evals; Production | Use evals as the driving force for system design; ideal for moving agents from demo to production. | 2025-06-02 |\n| P0 | [Testing Agent Skills Systematically with Evals](https://developers.openai.com/blog/eval-skills) | OpenAI | Evals; Skills; Agents | Systematically test agent skills with evals; establish quality gates before skill release. | 2026-01-22 |\n| P0 | [Evals API Use-case - MCP Evaluation](https://developers.openai.com/cookbook/examples/evaluation/use-cases/mcp_eval_notebook) | OpenAI | MCP; Evals | Evaluate QA/retrieval capabilities with MCP tools; ideal for building an MCP regression suite. | 2025-06-09 |\n| P0 | [Running Codex safely at OpenAI](https://openai.com/index/running-codex-safely/) | OpenAI | Safety; Sandbox; Codex | How OpenAI runs Codex internally: sandbox, approvals, network policy, agent-native telemetry. 
| 2026-05-20 |\n| P0 | [Building Governed AI Agents - A Practical Guide to Agentic Scaffolding](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Governance; Guardrails; Agents | Governed agent scaffolding: permissions, guardrails, auditing, and organizational policies. | 2026-02-23 |\n| P0 | [Macro Evals for Agentic Systems](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Evals; Agentic systems | Evaluate agents at the end-to-end/macro level, not just individual step outputs. | 2026-05-19 |\n| P0 | [Best practices for Claude Code](https://www.anthropic.com/engineering/claude-code-best-practices) | Anthropic | Coding agents; Claude Code | Claude Code methodology: verification loop, explore-plan-code, CLAUDE.md, permissions, MCP, subagents, context management. | 2025-04-18 |\n| P0 | [How we built our multi-agent research system](https://www.anthropic.com/engineering/multi-agent-research-system) | Anthropic | Agents; Multi-agent; Research | Claude Research multi-agent architecture: planner + parallel research agents + synthesis; production multi-agent experience. | 2025-06-13 |\n| P0 | [Writing effective tools for AI agents - with AI agents](https://www.anthropic.com/engineering/writing-tools-for-agents) | Anthropic | Tools; MCP; Evals | Tool quality determines agent quality: tool descriptions, context budget, eval, and letting Claude optimize its own tools. | 2025-09-11 |\n| P0 | [Effective context engineering for AI agents](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents) | Anthropic | Context; Agents | Context is the agent's core resource: selection, compression, isolation, persistence, and context pollution control. 
| 2025-09-29 |\n| P0 | [Enabling Claude Code to work more autonomously](https://www.anthropic.com/news/enabling-claude-code-to-work-more-autonomously) | Anthropic | Claude Code; Agent SDK; Subagents | Claude Agent SDK, subagents, hooks, background tasks, checkpoints, and other autonomous coding agent capabilities. | 2025-09-29 |\n| P0 | [Equipping agents for the real world with Agent Skills](https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills) | Anthropic | Skills; Agents | Agent Skills as modular capability packages (instructions, resources, scripts) that reduce context burden and improve reliability. | 2025-10-16 |\n| P0 | [Code execution with MCP: Building more efficient agents](https://www.anthropic.com/engineering/code-execution-with-mcp) | Anthropic | MCP; Code execution; Context | Key article on MCP scale challenges: reduce token overhead with code execution/on-demand tools; learn progressive disclosure. | 2025-11-04 |\n| P0 | [Introducing advanced tool use on Claude Developer Platform](https://www.anthropic.com/engineering/advanced-tool-use) | Anthropic | Tools; MCP; Advanced tool use | Tool search, deferred loading, programmatic tool calling; solving context pollution from large numbers of MCP tools. | 2025-11-24 |\n| P0 | [Effective harnesses for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents) | Anthropic | Harness; Long-running agents | Essential harness reading: working across multiple context windows, task logging, external state, agent self-recovery. | 2025-11-26 |\n| P0 | [Demystifying evals for AI agents](https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents) | Anthropic | Evals; Agents | Agent evals are more complex than static evals: multi-turn, tools, state changes, creative solutions, failure taxonomy. 
| 2026-01-09 |\n| P0 | [Measuring AI agent autonomy in practice](https://www.anthropic.com/news/measuring-agent-autonomy) | Anthropic | Agents; Autonomy; Measurement | Quantify agent autonomy using metrics like task duration and supervision needs; ideal for building autonomy benchmarks. | 2026-02-18 |\n| P0 | [Harness design for long-running application development](https://www.anthropic.com/engineering/harness-design-long-running-apps) | Anthropic | Harness; Application development | Harness design patterns for delegating long-running app development tasks to agents; compare with OpenAI Codex harness. | 2026-03-24 |\n| P0 | [Scaling Managed Agents: Decoupling the brain from the hands](https://www.anthropic.com/engineering/managed-agents) | Anthropic | Managed agents; Harness | Decouple the model brain from execution hands/harness, keeping interfaces stable as the harness evolves. | 2026-04-08 |\n| P0 | [How we contain Claude across products](https://www.anthropic.com/engineering/how-we-contain-claude) | Anthropic | Safety; Containment; Agents | Blast radius of powerful agent releases, human-in-the-loop, and containment strategies. | 2026-05-25 |\n| P1 | [Structured Outputs for Multi-Agent Systems](https://developers.openai.com/cookbook/examples/structured_outputs_multi_agent) | OpenAI | Agents; Multi-agent; Structured outputs | Use strict schemas to constrain structured messages and handoffs between multiple agents. | 2024-08-06 |\n| P1 | [Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku](https://www.anthropic.com/news/3-5-models-and-computer-use) | Anthropic | Agents; Computer use | Claude computer use beta starting point: the model uses a computer via screenshots and actions. 
| 2024-10-22 |\n| P1 | [Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet](https://www.anthropic.com/engineering/swe-bench-sonnet) | Anthropic | Agents; Coding; Evals | SWE-bench agent scaffolding article: the same model's performance depends strongly on harness/scaffolding. | 2025-01-06 |\n| P1 | [Introducing Operator](https://openai.com/index/introducing-operator/) | OpenAI | Agents; Computer use; Safety | Early product form of browser-based agents: model clicks, types, and executes tasks on web pages, emphasizing user confirmation and safety boundaries. | 2025-01-23 |\n| P1 | [Computer-Using Agent](https://openai.com/index/computer-using-agent/) | OpenAI | Agents; Computer use | Understand how CUA combines vision, mouse/keyboard actions, and environment feedback into an agent loop; compare with Claude computer use. | 2025-01-23 |\n| P1 | [Claude 3.7 Sonnet and Claude Code](https://www.anthropic.com/news/claude-3-7-sonnet) | Anthropic | Agents; Coding; Claude Code | Early release of Claude Code, marking Claude's entry into the agentic coding tool space. | 2025-02-24 |\n| P1 | [The think tool: Enabling Claude to stop and think in complex tool use situations](https://www.anthropic.com/engineering/claude-think-tool) | Anthropic | Tools; Reasoning; Agents | Give the model an explicit think tool in complex tool-use chains; learn tool design for policy-heavy/multi-step decisions. | 2025-03-20 |\n| P1 | [Evaluating Agents with Langfuse](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Evals; Agents | Observe and evaluate Agents SDK runs with Langfuse; learn tracing/eval workflows. | 2025-03-31 |\n| P1 | [Parallel Agents with the OpenAI Agents SDK](https://developers.openai.com/cookbook/examples/agents_sdk/parallel_agents) | OpenAI | Agents; Parallelism; Agents SDK | Parallel agent patterns: decompose tasks, execute in parallel, aggregate results. 
| 2025-05-01 |\n| P1 | [Multi-Agent Portfolio Collaboration with OpenAI Agents SDK](https://developers.openai.com/cookbook/examples/agents_sdk/multi-agent-portfolio-collaboration/multi_agent_portfolio_collaboration) | OpenAI | Agents; Multi-agent; Portfolio | Multi-agent collaboration business example: research, analysis, combined output. | 2025-05-28 |\n| P1 | [MCP-Powered Agentic Voice Framework](https://developers.openai.com/cookbook/topic/agents) | OpenAI | MCP; Voice; Agents | Voice agent + MCP paradigm: real-time interaction, tool extension, task execution. | 2025-06-17 |\n| P1 | [Deep Research API with the Agents SDK](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Agents; Deep research; Agents SDK | Integrate Deep Research API into Agents SDK workflows. | 2025-06-25 |\n| P1 | [Desktop Extensions: One-click MCP server installation for Claude Desktop](https://www.anthropic.com/engineering/desktop-extensions) | Anthropic | MCP; Claude Desktop; Packaging | Package local MCP servers as one-click install extensions; learn MCP distribution/installation/local permission issues. | 2025-06-26 |\n| P1 | [Building a Supply-Chain Copilot with OpenAI Agent SDK and Databricks MCP Servers](https://developers.openai.com/cookbook/topic/agents) | OpenAI | MCP; Agents; Databricks | Enterprise data platform MCP + Agent SDK business agent example. | 2025-07-08 |\n| P1 | [Introducing ChatGPT agent: bridging research and action](https://openai.com/index/introducing-chatgpt-agent/) | OpenAI | Agents; ChatGPT; Computer use | End-user-facing ChatGPT agent: combining research, browser, computer use, file/slide capabilities. | 2025-07-17 |\n| P1 | [ChatGPT agent System Card](https://openai.com/index/chatgpt-agent-system-card/) | OpenAI | Agents; Safety; Evals | Learn pre-launch risk classification, evaluation, permissions, human confirmation, and abuse prevention for agent products. 
| 2025-07-17 |\n| P1 | [Context Engineering - Short-Term Memory Management with Sessions](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Context; Sessions; Agents | How short-term memory/session state affects agent reliability. | 2025-09-09 |\n| P1 | [Introducing upgrades to Codex](https://openai.com/index/introducing-upgrades-to-codex/) | OpenAI | Agents; Coding; IDE | Codex evolves from research preview to daily dev tool: CLI, IDE, web/mobile collaboration, and more independent task execution. | 2025-09-15 |\n| P1 | [Introducing Claude Sonnet 4.5](https://www.anthropic.com/news/claude-sonnet-4-5) | Anthropic | Agents; Claude Agent SDK; Computer use | Sonnet 4.5 emphasizes coding, complex agents, computer use, with simultaneous Agent SDK launch. | 2025-09-29 |\n| P1 | [Introducing apps in ChatGPT and the new Apps SDK](https://openai.com/index/introducing-apps-in-chatgpt/) | OpenAI | MCP; Apps; ChatGPT | Apps SDK extends UI and tool server via MCP; entry point for understanding the ChatGPT app / MCP app ecosystem. | 2025-10-06 |\n| P1 | [Codex is now generally available](https://openai.com/index/codex-now-generally-available/) | OpenAI | Agents; Coding; Codex SDK | Codex GA, Slack integration, Codex SDK, admin tools; see how coding agents enter enterprise management. | 2025-10-06 |\n| P1 | [Using PLANS.md for multi-hour problem solving](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Codex; Long-running; Planning | Plan files and cross-context task management for long-running coding agents. | 2025-10-07 |\n| P1 | [Beyond permission prompts: making Claude Code more secure and autonomous](https://www.anthropic.com/engineering/beyond-permission-prompts) | Anthropic | Safety; Permissions; Claude Code | From simple permission prompts to fine-grained security policies, reducing autonomous mode risk and interruptions. 
| 2025-10-20 |\n| P1 | [Introducing Aardvark: OpenAI's agentic security researcher](https://openai.com/index/introducing-aardvark/) | OpenAI | Agents; Security | Security-domain agent form: continuous scanning, issue verification, fix suggestions; later integrated as Codex Security. | 2025-10-30 |\n| P1 | [Build a coding agent with GPT 5.1](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Agents; Coding | Build a coding agent from scratch: understand file editing, command execution, loops, and verification. | 2025-11-13 |\n| P1 | [OpenAI co-founds Agentic AI Foundation](https://openai.com/index/agentic-ai-foundation/) | OpenAI | MCP; Standards; AGENTS.md | MCP, AGENTS.md, and agent standards enter the Linux Foundation/AAIF context; understand ecosystem standardization. | 2025-12-09 |\n| P1 | [Donating MCP and establishing the Agentic AI Foundation](https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation) | Anthropic | MCP; Standards; AAIF | Anthropic donates MCP to Linux Foundation/AAIF; read alongside OpenAI's AAIF article. | 2025-12-09 |\n| P1 | [Context Engineering for Personalization - Long-Term Memory Notes](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Context; Long-term memory; Agents | How long-term memory serves as agent personalization/state management. | 2026-01-05 |\n| P1 | [Supercharging Codex with JetBrains MCP at Skyscanner](https://developers.openai.com/blog/skyscanner-codex-jetbrains-mcp) | OpenAI | MCP; Codex; IDE | Real IDE/MCP case study: how Codex CLI accesses IDE context and dev tools via JetBrains MCP. | 2026-01-11 |\n| P1 | [Designing AI-resistant technical evaluations](https://www.anthropic.com/engineering/AI-resistant-technical-evaluations) | Anthropic | Evals; Technical hiring | How strong agents continuously break technical evaluations; relevant to benchmark contamination prevention and eval design. 
| 2026-01-21 |\n| P1 | [Inside OpenAI's in-house data agent](https://openai.com/index/inside-our-in-house-data-agent/) | OpenAI | Agents; Data; Memory | Internal data agent case study: memory, Codex, data context, reliability; learn enterprise knowledge/data agents. | 2026-01-29 |\n| P1 | [Introducing the Codex app](https://openai.com/index/introducing-the-codex-app/) | OpenAI | Agents; Coding; Multi-agent | Desktop command center for agents: multi-threaded/parallel long tasks, project-level agent workflows. | 2026-02-02 |\n| P1 | [Apple's Xcode now supports Claude Agent SDK](https://www.anthropic.com/news/apple-xcode-claude-agent-sdk) | Anthropic | Claude Agent SDK; Xcode; MCP | Embed Claude Agent SDK in Xcode: harness, subagents, background tasks, plugins, MCP. | 2026-02-03 |\n| P1 | [Quantifying infrastructure noise in agentic coding evals](https://www.anthropic.com/engineering/infrastructure-noise) | Anthropic | Evals; Coding agents; Infrastructure | Environment configuration significantly impacts scores in agentic coding evals; control infrastructure noise in both production and benchmarks. | 2026-02-05 |\n| P1 | [Building a C compiler with a team of parallel Claudes](https://www.anthropic.com/engineering/building-c-compiler) | Anthropic | Multi-agent; Coding; Long-running | Parallel Claude teams completing large engineering tasks; learn multi-agent division of labor, coordination, and long-running execution. | 2026-02-05 |\n| P1 | [Codex Security: now in research preview](https://openai.com/index/codex-security-now-in-research-preview/) | OpenAI | Agents; Security; Codex | Productization of an agentic security researcher: vulnerability discovery, verification, fix suggestions, reducing triage noise. 
| 2026-03-06 |\n| P1 | [Eval awareness in Claude Opus 4.6's BrowseComp performance](https://www.anthropic.com/engineering) | Anthropic | Evals; Agent awareness | Risk of models recognizing/adapting to evaluations; relevant to agent benchmark credibility discussions. | 2026-03-06 |\n| P1 | [How we built Claude Code auto mode: a safer way to skip permissions](https://www.anthropic.com/engineering/claude-code-auto-mode) | Anthropic | Safety; Permissions; Autonomy | Claude Code auto mode risk classification, allow/block rules, exception handling, and security testing. | 2026-03-25 |\n| P1 | [Migrate a Legacy Codebase with Sandbox Agents](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Agents; Sandbox; Evals | Sandbox agent evaluation and execution patterns in large legacy code migrations. | 2026-04-07 |\n| P1 | [Codex for (almost) everything](https://openai.com/index/codex-for-almost-everything/) | OpenAI | Agents; Codex; MCP; Plugins | Codex app expanded to Windows/macOS, computer use, in-app browser, memory, plugins, MCP servers. | 2026-04-16 |\n| P1 | [Computer Use Agents in Daytona Sandboxes](https://developers.openai.com/cookbook/examples/agents_sdk/computer_use_with_daytona/computer_use_with_daytona) | OpenAI | Computer use; Sandbox; Agents | Computer-use agents and sandbox runtimes; compare with Operator/CUA/Claude computer use. | 2026-04-19 |\n| P1 | [Introducing workspace agents in ChatGPT](https://openai.com/index/introducing-workspace-agents-in-chatgpt/) | OpenAI | Agents; Workspace; Governance | Workspace agents: shared agents, permissions, tools, memory, safeguards; ideal for team collaboration agent design. | 2026-04-22 |\n| P1 | [Building workspace agents in ChatGPT to complete repeatable, end-to-end work](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Workspace agents; ChatGPT | Practical workspace agents for repeatable end-to-end team workflows. 
| 2026-04-22 |\n| P1 | [Speeding up agentic workflows with WebSockets in the Responses API](https://openai.com/index/speeding-up-agentic-workflows-with-websockets/) | OpenAI | Agents; Latency; Responses API | Optimize latency by treating agentic rollouts as long-lived connections/tasks; learn production agent transport and caching. | 2026-05-01 |\n| P1 | [Agents for financial services](https://www.anthropic.com/news/finance-agents) | Anthropic | Agents; Finance; MCP | Ten ready-to-run agent templates, Claude Code/Cowork plugins, Managed Agents cookbooks, MCP app. | 2026-05-05 |\n| P1 | [Migrate from the Claude Agent SDK to the OpenAI Agents SDK](https://developers.openai.com/cookbook/examples/agents_sdk/migrate-from-claude-agent-sdk/readme) | OpenAI | Agents SDK; Migration | Compare Claude Agent SDK and OpenAI Agents SDK from a migration perspective; ideal for dual-stack learning. | 2026-05-07 |\n| P1 | [Building a safe, effective sandbox to enable Codex on Windows](https://openai.com/index/building-codex-windows-sandbox/) | OpenAI | Safety; Sandbox; Codex | Coding agent sandbox design on Windows: file access, network restrictions, approval tradeoffs. | 2026-05-13 |\n| P1 | [Building self-improving tax agents with Codex](https://openai.com/index/building-self-improving-tax-agents-with-codex/) | OpenAI | Agents; Evals; Self-improvement | Combine production traces, expert feedback, Codex loop, and eval infrastructure into self-improving business agents. | 2026-05-27 |\n| P1 | [SchemaFlow: Agentic Database Change Impact Analysis, SQL Generation, and Eval Guardrails](https://developers.openai.com/cookbook/topic/agents) | OpenAI | Evals; SQL; Agent guardrails | Guardrail and eval-guardrail examples for data/SQL agents. 
| 2026-06-05 |\n| P1 | [Agents SDK quickstart](https://developers.openai.com/api/docs/guides/agents/quickstart) | OpenAI | Agents; SDK | Quickly build a minimal agent; understand the code patterns of run, tool, and handoff. | Current docs |\n| P1 | [MCP Apps compatibility in ChatGPT](https://developers.openai.com/apps-sdk/mcp-apps-in-chatgpt) | OpenAI | MCP; Apps SDK; UI | Understand MCP Apps UI standards, iframe/bridge, and compatibility between ChatGPT and other hosts. | Current docs |\n| P1 | [Use Codex with the Agents SDK](https://developers.openai.com/codex/guides/agents-sdk) | OpenAI | MCP; Codex; Agents SDK | Use Codex as an MCP server for other agents to call; ideal for multi-agent dev workflows. | Current docs |\n| P1 | [Agent approvals and security - Codex](https://developers.openai.com/codex/agent-approvals-security) | OpenAI | Safety; Approvals; Codex | Official reference for Codex approval modes, sandbox, network access; read alongside OpenAI/Anthropic safety articles. | Current docs |\n| P1 | [Agent Skills - Codex](https://developers.openai.com/codex/skills) | OpenAI | Codex; Skills; Plugins | Skills/Plugins as reusable workflow packages; compare with Anthropic Agent Skills. | Current docs |\n| P1 | [Custom instructions with AGENTS.md - Codex](https://developers.openai.com/codex/guides/agents-md) | OpenAI | AGENTS.md; Context | How AGENTS.md provides persistent project specifications for agents; establish repo-level agent contracts. | Current docs |\n| P1 | [Agents SDK integrations and observability](https://developers.openai.com/api/docs/guides/agents/integrations-observability) | OpenAI | Observability; MCP; Tracing | Tracing, MCP integration, provider/observability; essential for production agent debugging. 
| Current docs |\n| P1 | [Secure MCP Tunnel](https://developers.openai.com/api/docs/guides/secure-mcp-tunnels) | OpenAI | MCP; Security; Private tools | Securely expose private/intranet MCP servers to supported OpenAI surfaces; ideal for enterprise deployment. | Current docs |\n| P1 | [How Claude Code works](https://code.claude.com/docs/en/how-claude-code-works) | Anthropic | Claude Code; Agentic loop; Harness | Under-the-hood architecture of Claude Code: the agentic loop (gather context → act → verify), built-in tool categories, context window management, and extension points. | Current docs |\n| P0 | [learn-claude-code](https://github.com/shareAI-lab/learn-claude-code) | Community | Harness; Agent loop; Tools; Context | Hands-on 20-lesson tutorial building a Claude Code–like agent harness from scratch: agent loop, tool integration, context compaction, multi-agent coordination, permissions, MCP plugins. | 2026 |\n| P0 | [Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems](https://arxiv.org/abs/2604.14228) | Academic | Agent architecture; Claude Code; Design space | Deep technical analysis of Claude Code's architecture: agentic loop, permission system, context compaction, extensibility (MCP/plugins/skills/hooks), subagent delegation, and comparison with open-source alternatives. | 2026-04-14 |\n| P0 | [Function Calling](https://developers.openai.com/api/docs/guides/function-calling) | OpenAI | Tools; Function calling; API | Official guide to function/tool calling: define functions with JSON schemas, handle model tool calls, execute and return results. | Current docs |\n| P0 | [Tool use overview](https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview) | Anthropic | Tools; Tool use; API | Connect Claude to external tools and APIs: client vs server tools, the agentic loop, strict schema conformance, and when Claude decides to call tools. 
| Current docs |\n| P0 | [Function calling - Gemini API](https://ai.google.dev/gemini-api/docs/function-calling) | Google | Tools; Function calling; API | Enable Gemini models to connect with external tools via function calling: single-turn, multi-turn, parallel, and sequential function chains. | Current docs |\n| P2 | [Introducing Contextual Retrieval](https://www.anthropic.com/news/contextual-retrieval) | Anthropic | Context; Retrieval; RAG | Not agent-specific, but important for agent RAG/context: prepend context to chunks before retrieval to improve recall. | 2024-09-19 |\n| P2 | [Developing a computer use model](https://www.anthropic.com/news/developing-computer-use) | Anthropic | Computer use; Agents | More technical explanation of how the computer-use model moves the mouse, clicks, types, and reads screen feedback. | 2024-10-22 |\n| P2 | [Introducing Claude 4](https://www.anthropic.com/news/claude-4) | Anthropic | Agents; Coding; Long-running | Overview of Claude Opus/Sonnet 4 capabilities: coding, advanced reasoning, agent workflows. | 2025-05-22 |\n| P2 | [Claude for Financial Services](https://www.anthropic.com/news/claude-for-financial-services) | Anthropic | Agents; Connectors; Finance | Vertical industry agent/connector productization case; understand data, permissions, and tool integration in finance. | 2025-07-15 |\n| P2 | [Advancing Claude for Financial Services](https://www.anthropic.com/news/advancing-claude-for-financial-services) | Anthropic | Agents; Skills; Finance | Claude for Excel, real-time data connectors, pre-built Agent Skills for vertical industry productization. | 2025-10-27 |\n| P2 | [Introducing GPT-5.3-Codex](https://openai.com/index/introducing-gpt-5-3-codex/) | OpenAI | Agents; Coding model; Evals | Codex-native model and long-running coding/terminal/agentic benchmarks; understand how model capabilities serve the harness. 
| 2026-02-05 |\n| P2 | [Introducing OpenAI Frontier](https://openai.com/index/introducing-openai-frontier/) | OpenAI | Agents; Enterprise; Governance | Enterprise AI coworker/agent platform: shared context, onboarding, permissions, guardrails, governance. | 2026-02-10 |\n| P2 | [Introducing Claude Sonnet 4.6](https://www.anthropic.com/news/claude-sonnet-4-6) | Anthropic | Agents; Planning; Computer use | Sonnet 4.6 emphasizes coding, computer use, long-context reasoning, agent planning. | 2026-02-17 |\n| P2 | [Introducing Claude Opus 4.6](https://www.anthropic.com/news/claude-opus-4-6) | Anthropic | Agents; Long-running; Tool use | Model release perspective on long-running tasks, agentic harness, subagents, and tool call capabilities. | 2026-02-25 |\n| P2 | [Introducing Claude Opus 4.7](https://www.anthropic.com/news/claude-opus-4-7) | Anthropic | Agents; Long-running; Coding | Stronger software engineering and long-running task performance; track how model capabilities impact agent workloads. | 2026-04-16 |\n| P2 | [An update on recent Claude Code quality reports](https://www.anthropic.com/engineering/april-23-postmortem) | Anthropic | Reliability; Claude Code; Agent SDK | Postmortem on Claude Code/Agent SDK quality regression; learn agent product operations and regression control. | 2026-04-23 |\n| P2 | [Introducing Claude Opus 4.8](https://www.anthropic.com/news/claude-opus-4-8) | Anthropic | Agents; Dynamic workflows; Long-running | Dynamic workflows, hundreds of parallel subagents, long-running agentic tasks; the latest model/product direction. | 2026-05-28 |\n| P2 | [Codex for every role, tool, and workflow](https://openai.com/index/codex-for-every-role-tool-workflow/) | OpenAI | Agents; Codex; Plugins | Codex expands from development to knowledge work: role-specific plugins, Sites, annotations, parallel workflows. 
| 2026-06-02 |\n| P2 | [Codex is becoming a productivity tool for everyone](https://openai.com/index/codex-for-knowledge-work/) | OpenAI | Agents; Knowledge work | Usage data shows how non-developers use Codex for reports, spreadsheets, research, automation, and lightweight tools. | 2026-06-02 |\n| P2 | [OpenAI Docs MCP](https://developers.openai.com/learn/docs-mcp) | OpenAI | MCP; Docs; Context | Official OpenAI docs MCP server; connect docs directly to local agents/IDEs. | Current docs |\n| P2 | [Codex SDK](https://developers.openai.com/codex/sdk) | OpenAI | Codex SDK; Automation | Programmatically control Codex in CI/CD or internal tools; embed coding agents into existing workflows. | Current docs |\n| P2 | [When AI builds itself](https://www.anthropic.com/institute/recursive-self-improvement) | Anthropic | Agents; Recursive self-improvement; Safety | How AI systems accelerate their own development through recursive self-improvement; three possible futures and the need for verifiable coordination. | 2026-05 |", "url": "https://wpnews.pro/news/agentic-engineering-handbook", "canonical_source": "https://github.com/keyuchen21/agentic-engineering-handbook", "published_at": "2026-06-13 02:35:51+00:00", "updated_at": "2026-06-13 02:49:52.098358+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "large-language-models", "ai-products", "ai-infrastructure"], "entities": ["OpenAI", "Anthropic", "Responses API", "Agents SDK", "AgentKit", "Codex", "Model Context Protocol", "ChatGPT"], "alternates": {"html": "https://wpnews.pro/news/agentic-engineering-handbook", "markdown": "https://wpnews.pro/news/agentic-engineering-handbook.md", "text": "https://wpnews.pro/news/agentic-engineering-handbook.txt", "jsonld": "https://wpnews.pro/news/agentic-engineering-handbook.jsonld"}}