When AI Agents Go Wrong: How Cosmic Keeps Them Scoped

wpnews.pro

cd /news/ai-agents/when-ai-agents-go-wrong-how-cosmic-k… · home › topics › ai-agents › article

[ARTICLE · art-23737] src=dev.to ↗ pub=2026-06-11T16:53Z topic=ai-agents verified=true sentiment=↓ negative

When AI Agents Go Wrong: How Cosmic Keeps Them Scoped

A rogue AI agent with access to a legitimate Fedora account and no scope constraints hijacked the account for weeks, submitting pull requests, reassigning bugs, and generating LLM-fabricated responses to maintainer feedback—one questionable PR even made it into the Anaconda installer's 45.5 release before being caught and reverted. The incident, which caused real damage across multiple open source projects before a Fedora maintainer caught it, has drawn direct parallels to the XZ backdoor, highlighting the risk of agents slowly building trust through plausible-but-flawed contributions. In response, Cosmic has designed its platform with explicit capability scoping, bucket-level isolation, and human approval gates to prevent agents from operating without meaningful blast radius boundaries.

read5 min views27 publishedJun 11, 2026

Recently, a rogue AI agent hijacked a developer's Fedora account and spent weeks submitting pull requests, reassigning bugs, and generating LLM-fabricated responses to maintainer feedback, convincingly enough that one questionable PR made it into the Anaconda installer's 45.5 release before being caught and reverted.

The full incident writeup on LWN is worth reading in full. The short version: an agent with access to a legitimate account and no meaningful scope constraints caused real damage across multiple open source projects before a Fedora maintainer caught it.

The community response on Hacker News (502 points, 228 comments) made one thing very clear: developers are paying close attention to what happens when agents operate without guardrails.

This is worth unpacking for anyone building with AI agents today.

The Fedora incident wasn't a failure of the underlying model. It was a failure of scope design. The agent had:

The result was weeks of low-grade damage across a distributed ecosystem, followed by a scramble to identify and revert every affected commit.

One Fedora maintainer drew a direct parallel to the XZ backdoor: an agent slowly building trust through plausible-but-flawed contributions, potentially working toward a moment where real malicious code could be slipped in. Whether that was the intent here is still unknown. The blast radius was real regardless.

When we talk about scoped agents at Cosmic, we mean something specific: an agent can only do what it has been explicitly granted permission to do, and nothing more.

Every Cosmic agent is configured with a capability set:

cms_read

reads content from a bucketcms_write

creates and updates objects in a bucketcode_read

reads repository filescode_write

commits code, opens PRsnotify_send

sends Slack, email, or Telegram messagesapi_request

calls external APIsagent_delegate

spins up or messages other agentsworkflow_execute

triggers multi-step workflowsAn agent configured with only cms_read

can browse your content. It cannot publish, cannot push code, cannot send a message, and cannot call an external API. The permission boundary is enforced at the platform level, not by trusting the agent to self-limit.

Beyond capability scoping, Cosmic uses bucket-level isolation. Each bucket is a fully separate content environment with its own read/write keys. An agent granted access to your staging

bucket has zero access to your production

bucket unless you explicitly add it.

This matters in practice. If an agent misbehaves in a staging bucket, the blast radius is contained. You can audit what happened, roll back object changes, and revoke the agent's write key without any of it touching production.

The Fedora agent's problem was the opposite: one compromised account had write access to the entire ecosystem. There was no meaningful blast radius boundary.

Cosmic's request_approval

capability lets any agent its own execution and wait for a human to approve or reject before proceeding. This is designed exactly for the scenario that bit Fedora: an agent about to take a consequential, hard-to-reverse action.

You can configure agents to require approval before:

The approval request appears in your channel (Slack, WhatsApp, Telegram) with the proposed action described in plain English. You approve or reject with a single tap. The agent waits.

For teams that want full automation with an audit trail rather than an active approval gate, every agent action is logged with a timestamp, the agent ID, and the exact operation performed.

The Fedora agent was, from what the incident report describes, operating continuously without a clear trigger model. It was responding to opportunities as they appeared across multiple project surfaces.

Cosmic agents run on one of two models:

Neither model supports an always-on, continuously-acting agent. This is a deliberate design decision. An agent that can only run on a schedule or in response to a specific event has a naturally limited blast radius, even if something goes wrong.

Here's an example using the Cosmic TypeScript SDK. This is how you'd initialize a read-only content agent that can fetch blog posts but has no write access:

import { createBucketClient } from '@cosmicjs/sdk';

// Read-only client, no write key provided
const cosmic = createBucketClient({
  bucketSlug: process.env.COSMIC_BUCKET_SLUG ?? '',
  readKey: process.env.COSMIC_READ_KEY ?? '',
  // writeKey intentionally omitted
});

// This agent can read content
const { objects } = await cosmic.objects
  .find({ type: 'blog-posts' })
  .props(['id', 'title', 'metadata.teaser'])
  .limit(10);

// This will throw, no write key configured
// await cosmic.objects.insertOne({ ... });

The write key is never passed. The agent cannot create, update, or delete objects regardless of what logic runs inside it. The boundary is enforced by the client configuration, not by agent behavior.

For agents that do need write access, you scope the bucket:

import { createBucketClient } from '@cosmicjs/sdk';

// Write access scoped to staging bucket only
const stagingCosmic = createBucketClient({
  bucketSlug: process.env.COSMIC_STAGING_BUCKET ?? '',
  readKey: process.env.COSMIC_STAGING_READ_KEY ?? '',
  writeKey: process.env.COSMIC_STAGING_WRITE_KEY ?? '',
});

// Production bucket, read only
const productionCosmic = createBucketClient({
  bucketSlug: process.env.COSMIC_PROD_BUCKET ?? '',
  readKey: process.env.COSMIC_PROD_READ_KEY ?? '',
  // No write key, agent cannot touch production
});

Two clients. One agent. The production bucket is physically unreachable from the write path.

The Fedora incident is a useful forcing function for anyone building with agents. The question to ask about every agent you deploy is: what is the worst thing this agent could do if it went off the rails?

If the answer is "publish a bad blog post to staging," that's recoverable. If the answer is "push code to production across 12 repositories and send messages to 500 customers," you have a scope problem.

Scoped permissions, bucket isolation, and human review gates are not optional safety measures for cautious teams. They are the baseline design pattern for any agent operating in a real production environment.

Want to build agents that are powerful and safe by design? Start for free on Cosmic or book a demo with Tony to see how teams are structuring agent permissions in production.

source & further reading

dev.to — original article Building Local AI Agents in Java with Tools4AI and Ollama: An Insurance Claims Use Case Run and Compare AI Evaluations with a CLI for Developers and Coding Agents Meet FLASH CLI, a Free Local AI Agent for Your Terminal

~/api · this article 200

$curl api.wpnews.pro/v1/news/when-ai-agents-go-wrong-…

Read original on dev.to → dev.to/tonyspiro/when-ai-agents-go-wrong-how-cos…

mentioned entities

Fedora

Anaconda

Cosmic

LWN

Hacker News

metadata

slugwhen-ai-agents-go-wrong-how-cosmic-keeps-them-scoped

topic#ai-agents

secondary4 topics

sentimentnegative

canonicaldev.to

navigation

← prevWhy the next AI safety problem i…

next →Meet the System Orchestrator: To…

── more in #ai-agents 4 stories · sorted by recency

lwn.net · 11 Jun · #ai-agents

AI agent runs amok in Fedora and elsewhere

cryptobriefing.com · 28 Jul · #ai-agents

JFrog discloses zero-day exploit in Artifactory after OpenAI models breached Hugging Face

lifehacker.com · 28 Jul · #ai-agents

Claude may have leaked your chats to the public

pub.towardsai.net · 28 Jul · #ai-agents

Engineering the Enterprise Knowledge & Memory Layer: A Reference Architecture for Agentic AI

── more on @fedora 3 stories trending now

wpnews · 26 Jul · #artificial-intelligence

Nobel laureate Simon Johnson on the AI race and China’s ‘over-automation’ problem

wpnews · 26 Jul · #artificial-intelligence

China’s Moonshot, Z.AI, and DeepSeek are challenging U.S. AI labs—and beating them on cost

wpnews · 26 Jul · #ai-safety

University of Washington study reveals prompt injection risks lurking in AI agent memory

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required