{"slug": "datadogs-5-tips-for-building-ai-agents", "title": "Datadog’s 5 Tips for Building AI Agents", "summary": "Datadog Director of Engineering Diamond Bishop outlined five key lessons for building production-ready AI agents at the MCP Dev Summit North America 2026, based on the company's experience deploying its first 100 agents across SRE, code generation, and security investigation. Bishop emphasized that the hardest challenges were not related to model quality but to observability, treating agents as a distinct user class, running them in background event-driven systems, and building evaluative feedback loops before launch. The company now runs three general availability agents—Bits.ai SRE, Bits.ai Dev, and a Security Analyst—that operate without human intervention, using Temporal for durability and requiring organizations to design interfaces that agents can use directly.", "body_md": "*A talk by Diamond Bishop at MCP Dev Summit North America 2026*\n\n**TLDR:** Getting an AI agent to work is a different problem from getting it to run in production without someone watching it. Datadog built its first hundred agents across SRE, code generation, and security investigation. The hardest parts had nothing to do with model quality.\n\nDatadog’s whole business is watching systems fail in real time. Building agents it couldn’t observe was not a comfortable position to be in. Diamond Bishop, Director of Eng/AI, walked through how they got from 0 to 100 production agents and what has to be accomplished before the next thousand.\n\nThese are the five lessons that stuck.\n\n## 1. Treat agents as your first customers\n\nDatadog already has an [MCP server](https://github.com/modelcontextprotocol/servers) that lets external agents query its platform directly. That’s table stakes now. The more interesting point was about what teams aren’t doing.\n\nMost product and UX teams still design entirely for human users. 
Meanwhile, developers are quietly building workarounds to make those same interfaces machine-readable. The gap is real and it’s widening. The fix is organizational: **get design teams thinking about agents as a user class before engineers have to compensate for them**.\n\nThe framing Datadog uses internally is a riff on the old Bezos API mandate. Every interface inside your company should be something an agent can use. If a task can only be completed through a UI built for humans, that’s a gap in your platform, not a feature.\n\n## 2. Run agents in the background, not on your laptop\n\nThe agents that are actually earning their keep at Datadog aren’t the chat-based ones. They’re the ones running quietly in the background on real events.\n\nDatadog has three agents in GA today.\n\n- **Bits.ai SRE**: autonomous alert investigator. It fires when something breaks and traces the issue before the engineer gets to their desk in the morning.\n- **Bits.ai Dev**: watches for errors and latency problems in live services and proposes code fixes without waiting to be asked.\n- **Security Analyst**: works through investigation checklists on concerning alerts automatically, handling the repetitive triage work that humans were doing by hand.\n\nAll three share the same architectural requirement: they run without a human in the loop. That means they need to be event-driven, containerized, and durable. Datadog runs these with [Temporal](https://temporal.io) for durability. Running long-lived agents on local machines is a fast path to fragile systems.\n\n## 3. Don’t ship an agent you can’t measure\n\nIf you don’t have an eval system before you launch, don’t launch.\n\nDatadog runs offline eval, online eval, and a living eval system that updates as behavior drifts. 
Models change and data drifts, so an agent that performed well in testing will eventually diverge from production reality unless a feedback loop catches it.\n\nThe practical extension of this is making your eval system itself agent-accessible. Expose it through an MCP server, let an agent work the improvement loop, and you get a system that can get better on its own over time.\n\n## 4. Build to rewrite, not to preserve\n\nModel rankings flip faster than most teams expect. A few months ago the conventional wisdom in some circles was that Anthropic’s models had plateaued. Then Claude came back. Now Codex is getting attention again. No one knows where it stabilizes.\n\nThe practical response is to build agent harnesses that don’t assume a specific model or framework. Keep them simple enough that swapping a model out isn’t a refactor project. The thing worth preserving isn’t the harness itself; it’s the memory system holding the accumulated knowledge your agents have gathered. That’s what lets you carry learnings forward when the underlying model changes.\n\n## 5. Multiplayer means more than it used to\n\n“Multiplayer” used to mean multiple humans working in the same space at the same time. That’s no longer the only configuration that matters. At Datadog, production agent systems run three distinct pairings:\n\n- Human working with human\n- Human working with agent\n- Agent working with agent\n\nEach one needs its own communication patterns. You can’t design for one and assume the others work.\n\nThe “who watches the watchman” problem is real. An agent that monitors another agent still needs oversight somewhere. Designing that layer in explicitly, rather than hoping it works out, is what separates a proof of concept from a production system.\n\n*Diamond Bishop is Director of Eng/AI at Datadog. The Agentic AI Foundation is the home of open agentic standards and open source infrastructure. 
To learn more about MCP and connect with engineers thinking through these problems, visit aaif.io, join the conversation in the AAIF Discord, or join us at an upcoming AAIF event.*", "url": "https://wpnews.pro/news/datadogs-5-tips-for-building-ai-agents", "canonical_source": "https://aaif.io/blog/datadogs-5-tips-for-building-ai-agents/", "published_at": "2026-05-28 19:50:04+00:00", "updated_at": "2026-06-05 04:43:11.454045+00:00", "lang": "en", "topics": ["ai-agents", "ai-infrastructure", "ai-tools", "ai-products", "mlops"], "entities": ["Datadog", "Diamond Bishop", "MCP Dev Summit North America 2026", "MCP server", "Amazon"], "alternates": {"html": "https://wpnews.pro/news/datadogs-5-tips-for-building-ai-agents", "markdown": "https://wpnews.pro/news/datadogs-5-tips-for-building-ai-agents.md", "text": "https://wpnews.pro/news/datadogs-5-tips-for-building-ai-agents.txt", "jsonld": "https://wpnews.pro/news/datadogs-5-tips-for-building-ai-agents.jsonld"}}