The UK Government Just Merged This Open-Source AI Security Benchmark Into Their National Evaluation Framework

wpnews.pro

cd /news/ai-safety/the-uk-government-just-merged-this-o… · home › topics › ai-safety › article

[ARTICLE · art-17779] src=dev.to ↗ pub=2026-05-29T15:26Z topic=ai-safety verified=true sentiment=↑ positive

The UK Government Just Merged This Open-Source AI Security Benchmark Into Their National Evaluation Framework

The UK Government's AI Safety Institute has merged the open-source AgentThreatBench benchmark into its official inspect_evals framework, which is used to evaluate frontier AI models from OpenAI, Anthropic, and Google DeepMind. AgentThreatBench contains over 200 attack payloads designed to test AI agents' resistance to memory poisoning attacks, covering five categories including prompt injection and sensitive data leakage. The benchmark is now part of the government's standard toolkit for AI safety evaluation.

read1 min views24 publishedMay 29, 2026

Last month, the UK Government's AI Safety Institute merged AgentThreatBench into their official inspect_evals framework — the same framework they use to evaluate frontier AI models from OpenAI, Anthropic, and Google DeepMind.

AgentThreatBench is an open-source adversarial benchmark I built that contains 200+ attack payloads specifically designed to test whether AI agents can resist memory poisoning attacks.

AI agents are increasingly being deployed with persistent memory — they remember past conversations, user preferences, and context across sessions. This creates a new attack surface: memory poisoning.

An attacker who can inject malicious content into an agent's memory can:

The OWASP Agentic Security Initiative identified this as ASI06 — Agent Memory Poisoning.

The benchmark covers 5 attack categories:

Category	Payloads	Description
Prompt Injection	40+	Instructions disguised as memory content
Protected Key Tampering	40+	Attempts to overwrite system-level keys
Sensitive Data Leakage	40+	PII/credential exfiltration via memory
Size Anomaly	40+	Memory inflation / resource exhaustion
Behavioral Drift	40+	Gradual personality/instruction shifts

pip install agentthreatbench

atb run --target your_agent_endpoint --output results.json

atb run --category prompt_injection --target your_agent_endpoint

The UK Government's AI Safety Institute uses inspect_evals to:

Having AgentThreatBench merged into this framework means it's now part of the official government toolkit for AI safety evaluation.

If you're building AI agents with persistent memory, I'd love to hear how you're thinking about memory security. What attack vectors concern you most?

source & further reading

dev.to — original article Practical Guide: Integrating Claude Code with NanoBanana MCP for Image Generation and Editing Squeezing Every Megabyte: Optimizing an 8GB NVIDIA Jetson Orin Nano for Headless ROS 2 and Edge-AI "Is it alive?" is the wrong question. Ask "is it working?"

~/api · this article 200

$curl api.wpnews.pro/v1/news/the-uk-government-just-m…

Read original on dev.to → dev.to/vaishnavi_gudur/the-uk-government-just-me…

mentioned entities

UK Government

AI Safety Institute

AgentThreatBench

inspect_evals

OpenAI

Anthropic

Google DeepMind

OWASP Agentic Security Initiative

metadata

slugthe-uk-government-just-merged-this-open-source-ai-security-benchmark-into-their

topic#ai-safety

secondary4 topics

sentimentpositive

canonicaldev.to

navigation

← prevAI compliance tools: What small …

next →I've used Gemini in Android Auto…

── more in #ai-safety 4 stories · sorted by recency

byteiota.com · 13 Jul · #ai-safety

99.9% of AI Vulnerabilities Are Fixable. No One Is Fixing Them.

platformer.news · 14 Jul · #ai-safety

The loudest warning about AI and jobs yet

dev.to · 13 Jul · #ai-safety

The AI Supply Chain: The Next Evolution of Third Party Risk

dev.to · 13 Jul · #ai-safety

Building a production AI agent in TypeScript with Mastra: a 2026 step-by-step.

── more on @uk government 3 stories trending now

wpnews · 8 Jul · #artificial-intelligence

SpaceXAI unveils Grok 4.5 AI model ahead of July 2026 public release

wpnews · 8 Jul · #large-language-models

Gemini 3.5 Pro Delayed to July 17: Architectural Rebuild Explained

wpnews · 8 Jul · #artificial-intelligence

SpaceXAI and Cursor unveil joint AI model as $60B acquisition reshapes enterprise AI landscape

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required