Three checks that separate an agent demo from a production agent


[ARTICLE · art-23251] src=dev.to ↗ pub=2026-06-06T09:05Z topic=ai-agents verified=true sentiment=· neutral

An open-source Agentic Product Standard v2.0 has been released, turning three critical production safeguards into enforceable code rather than advisory principles. The standard addresses the "lethal trifecta" of data exfiltration risk by requiring a CI gate that breaks at least one leg of the access-to-private-data, exposure-to-untrusted-content, and external-communication triad. It also introduces MCP supply chain security through tool-definition hashing, hard per-run cost ceilings enforced in code, and a binary maturity scorecard that replaces subjective readiness assessments with a pass/fail checklist.

Published Jun 6, 2026 · 4 min read

Shipping an agent demo takes an afternoon. Shipping one that survives a quarter in production is a different job — and the gap is almost never the model. It's three boring things that are usually missing entirely.

I maintain an open, MIT-licensed Agentic Product Standard, and v2.0 was mostly about turning those three things from advice into code you can run. Here they are, with the actual code.

Real safety comes from architecture. The check I reach for first is Simon Willison's lethal trifecta: an agent becomes an exfiltration tool the moment it has all three of —

access to private data,

exposure to untrusted content, and

the ability to communicate externally.

Any one is fine. All three together form a data-exfiltration channel waiting for a payload hidden in a retrieved document. The fix is never a "better filter"; it is to break a leg: gate egress, quarantine untrusted input, or scope the data.

Here's the gate, as a CI step. You declare what your agent can touch; the build fails if the trifecta is unmitigated:

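LEGS = ("private_data", "untrusted_content", "external_comms")

def evaluate(spec):
    """Return (exit_code, message) for a declared agent capability spec."""
    # A leg counts as present only when it is explicitly declared True.
    present = [leg for leg in LEGS if spec.get(leg) is True]
    if len(present) < 3:
        return 0, f"OK: only {len(present)}/3 legs present"
    # A leg is broken when a mitigation names it and declares a control.
    # Unknown leg names are ignored; sorting keeps the message deterministic.
    broken = sorted(
        {m.get("leg") for m in spec.get("mitigations", []) if m.get("control")}
        & set(LEGS)
    )
    if broken:
        return 0, f"OK: trifecta present but broken at {', '.join(broken)}"
    return 1, "FAIL: lethal trifecta, no leg broken"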
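Of the three ways to break a leg, gating egress is probably the easiest to show in a few lines. The following is a minimal sketch, not part of the standard: the host names, `egress_allowed`, and `guarded_fetch` are all hypothetical, and it assumes every outbound request from the agent passes through one wrapper.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; these hosts are placeholders, not from the standard.
ALLOWED_HOSTS = frozenset({"api.internal.example", "status.example.com"})


def egress_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for https URLs whose exact host is on the allowlist."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in allowed_hosts


def guarded_fetch(url, fetch):
    """Call fetch(url) only if the destination passes the allowlist.

    `fetch` is whatever HTTP callable the agent already uses (assumed).
    """
    if not egress_allowed(url):
        raise PermissionError(f"egress blocked: {url!r}")
    return fetch(url)
```

Matching the exact host, rather than a suffix or substring, is deliberate here: a payload cannot pass by appending an allowed name to an attacker-controlled domain.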

The second structural control is the MCP supply chain. Community MCP servers are untrusted code, and a server can hand you a benign tool description at approval time, then mutate it later (a "rug pull," or tool-definition poisoning). So: pin tool definitions by hash and alert on change. A few lines of bash in CI catch it:

# Hash each tool's canonical JSON (name, description, inputSchema) and diff against the pinned lock.
jq -cS '(.tools)|sort_by(.name)[]|{name,description,inputSchema}' tools.json \
  | while read -r t; do
      printf '%s %s\n' "$(printf '%s' "$t" | sha256sum | cut -d' ' -f1)" \
        "$(printf '%s' "$t" | jq -r .name)"
    done > current.lock
diff -u tools.lock current.lock || { echo "::error::MCP tool def changed (possible rug pull)"; exit 1; }
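The same pinning idea can be sketched in Python for teams that would rather run the check inside a test suite. This is an illustration under assumptions: `tool_fingerprints` and `changed_tools` are hypothetical names, the input is assumed to be the same list of tool objects as `tools.json`, and the hashes are not guaranteed to match the ones the bash pipeline produces, because the two serializers can differ in escaping.

```python
import hashlib
import json

PINNED_FIELDS = ("name", "description", "inputSchema")


def tool_fingerprints(tools):
    """Map tool name -> SHA-256 of its canonical (sorted-key, compact) JSON."""
    prints = {}
    for tool in tools:
        # Missing fields hash as null, so adding one later still shows up.
        pinned = {field: tool.get(field) for field in PINNED_FIELDS}
        canonical = json.dumps(pinned, sort_keys=True, separators=(",", ":"))
        prints[tool["name"]] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return prints


def changed_tools(locked, current):
    """Return sorted names that were added, removed, or whose hash differs."""
    names = set(locked) | set(current)
    return sorted(n for n in names if locked.get(n) != current.get(n))
```

A CI step would store the output of `tool_fingerprints` at approval time and fail the build whenever `changed_tools` returns a non-empty list.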

v2.0 makes cost a hard control:

a per-run token/cost ceiling enforced in code (a breaker, checked before each model call), not a bill you read later;

prompt/KV caching on stable prefixes (system prompt, tool schemas);

model routing — small model for classification, flagship for reasoning;

and the economics rule people learn the expensive way: only pay the roughly 15× token cost of multi-agent when the task value justifies it. If one agent clears the bar, the orchestra is waste.

The point is that "cost" stops being a surprise the moment it's a number your code enforces and your traces record per task.
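A per-run ceiling of the kind described above might look like the following. This is a minimal sketch under assumptions, not the standard's own breaker: `CostBreaker`, `BudgetExceeded`, and `run_step` are hypothetical names, costs are in USD, and `call_model` stands in for whatever client returns the response text together with its metered cost.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the next call would push a run past its cost ceiling."""


class CostBreaker:
    """Tracks spend for one agent run and refuses calls past the ceiling."""

    def __init__(self, ceiling_usd):
        self.ceiling_usd = ceiling_usd
        self.spent_usd = 0.0

    def check(self, estimated_usd):
        """Call before each model call; raises instead of overspending."""
        if self.spent_usd + estimated_usd > self.ceiling_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd:.4f} + estimated {estimated_usd:.4f} "
                f"> ceiling {self.ceiling_usd:.4f} USD"
            )

    def record(self, actual_usd):
        """Call after each model call with the metered cost."""
        self.spent_usd += actual_usd


def run_step(breaker, call_model, prompt, estimated_usd):
    """Guard one model call. `call_model(prompt)` is assumed to return (text, cost)."""
    breaker.check(estimated_usd)
    text, actual_usd = call_model(prompt)
    breaker.record(actual_usd)
    return text
```

Checking the estimate before the call, and recording the metered cost after it, keeps the ceiling a hard stop even when estimates run low; `spent_usd` is also the per-task number a trace can record.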

The standard also adds a maintenance doctrine (adapted, with credit, from Daniel Miessler's PAI): audit every instruction on a cadence with one question:

Would a smarter model make this rule unnecessary?

If yes, it's scaffolding, not architecture: cut it. Tag rules anti-fragile (eval sets, verification harnesses, tool contracts, real failure gotchas → keep) vs fragile (chain-of-thought orchestrators, output parsers, retry cascades → cut or re-test on the next model upgrade). A standard that only grows is one that rots.

Making it checkable

Principles are easy to agree with and impossible to audit. So v2.0 ships a self-assessment scorecard — a binary Yes/No maturity check (M0 Prototype → M1 Shippable → M2 Production → M3 Autonomous-ready), mapped to an autonomy ladder. Your level is the highest band where every gate passes; the first "No" is your next task. "Is it production-ready?" becomes a checklist instead of a vibe.
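The scoring rule is mechanical enough to sketch. The version below assumes the bands are cumulative, so a failing gate in a lower band caps the level even if higher bands pass; the gate names in the usage note are placeholders, not the standard's actual checklist, and `maturity` is a hypothetical helper.

```python
BANDS = ("M0", "M1", "M2", "M3")  # Prototype, Shippable, Production, Autonomous-ready


def maturity(scorecard):
    """Return (level, next_task) from {band: [(gate_name, passed), ...]}.

    level is the highest band reached with every gate passing so far,
    or None if M0 itself has a failing gate. next_task names the first "No".
    """
    level = None
    for band in BANDS:
        gates = scorecard.get(band)
        if not gates:
            # An unanswered band cannot count as passed.
            return level, f"{band}: no gates answered"
        for gate, passed in gates:
            if not passed:
                return level, f"{band}: {gate}"
        level = band
    return level, None
```

With a card whose M0 gates all pass and whose second M1 gate is "No", the function returns `"M0"` together with that M1 gate as the next task.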

Alongside it: the red-team kit above plus a CI workflow template that blocks merges when the eval pass-rate slips, and a 2026 refresh (MCP + OAuth 2.1, A2A at the Linux Foundation, OpenTelemetry GenAI tracing, trajectory + pass^k eval metrics).
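As a rough illustration of a merge-blocking eval gate, the sketch below scores a run with pass^k and compares it with a stored baseline. It assumes pass^k means the probability that k independent trials of a task all succeed, estimated per task as C(c, k) / C(n, k) from n recorded trials with c successes. The function names, the baseline, and the tolerance are hypothetical, not taken from the standard's workflow template.

```python
from math import comb


def pass_hat_k(successes, trials, k):
    """Estimate the chance that k trials of one task all succeed."""
    if not 1 <= k <= trials:
        raise ValueError("need 1 <= k <= trials")
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    # comb(successes, k) is 0 when successes < k, so the estimate is 0.0 there.
    return comb(successes, k) / comb(trials, k)


def eval_gate(results, k, baseline, tolerance=0.0):
    """Return (exit_code, score) for a list of (successes, trials) per task.

    exit_code is 1, which would block the merge, when the mean pass^k
    falls below baseline - tolerance.
    """
    if not results:
        raise ValueError("no eval results to score")
    score = sum(pass_hat_k(c, n, k) for c, n in results) / len(results)
    return (0 if score >= baseline - tolerance else 1), score
```

Because pass^k rewards consistency rather than a single lucky run, a task that passes 3 of 4 trials scores only 0.5 at k=2, which is the kind of slip a plain pass-rate can hide.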

The one rule under all of it

The model is the variable. The harness is the constant. Invest proportionally.

…with the v2.0 twist: the harness isn't something you accumulate forever. You curate it — growing the parts that compound, deleting the parts that only propped up a weaker model.

It's MIT and vendor-neutral (deliberately not a framework), with an optional Claude Code skill set. I'd rather be told what's wrong with it than collect stars — the deferred list is where I'm least sure.

Repo: https://github.com/Moai-Team-LLC/agentic-product-standard


Read original on dev.to → dev.to/alex_duch/three-checks-that-separate-an-a…
