{"slug": "i-built-an-email-agent-to-triage-bogus-security-reports", "title": "I built an email agent to triage bogus security reports", "summary": "Igor Zalutski built an email agent that automatically triages security reports, addressing a customer's problem with AI-generated noise flooding their inbox. The agent uses email labels as signals to trigger analysis against the actual codebase, then sends results back in the same email thread without requiring a user interface.", "body_md": "# I built an email agent to triage bogus security reports\n\nWritten by Igor Zalutski\n\nA customer shared a problem that at first sounded odd: they wanted to build an agent specifically to automatically review security reports they were getting by email. My initial reaction was: would you want something like that? Don't you want to review those reports yourself? It's security, after all. Turns out, most of them were AI-generated and mostly noise; but they *had* to review each one because, well, it's security.\n\nNow, we aren't in the business of building agents. We're building OpenComputer, an infrastructure primitive that agents use. But the question got me curious enough to want to actually build something. My thinking was: perhaps by building an agent I could discover something that I could improve in OpenComputer?\n\nDisclaimer: \"I built\" means mostly \"Claude built\". I didn't write much code by hand, but the decisions while driving it felt worth sharing anyway. The result lives in the [demo-agent-triage repo](https://github.com/diggerhq/demo-agent-triage).\n\n## What are we building?\n\nThe first thing to clarify was the rough shape of the thing we want to end up with. No matter how good the coding agent is, it is not of much use if you don't know what to ask.\n\nI wanted it to be as simple as possible, meaning as few moving parts as possible. The customer's problem originated in email and they wanted the result to land back in email, so we can skip the UI. 
The agent would just:\n\n- get the email with a security report\n- analyze it against the actual codebase\n- send an email with a result\n\n## What's a security report?\n\nHow would it know which emails to process?\n\nOne way to solve it would be to spin up a sandbox with an agent for every email, and just do nothing if it's not a security report. But that would be obviously wasteful. So we'd need some way to only launch a \"full\" agent for the right emails.\n\nWe could build a \"hierarchy\" of agents: one simple one-shot LLM loop for every email and another in-depth for stuff that looks like a security report. But that felt like overengineering.\n\nThe approach I went with (I swear it was me, not Claude, who came up with this!): **use labels as signal** for the agent. So when the user receives something that looks like a security report, they'd just label it, and after a few minutes they receive a review in the same thread. Neat!\n\n## How will the agent get mail?\n\nThere are two very different ways to approach this, resulting in two very different agents.\n\nOne way is to give the agent its own inbox, so it'd only ever see the mail that's intended for it. Another is to have it access the full inbox, get notified of all messages (or pull via IMAP), but only process the labeled ones.\n\nThe first option is obviously better for security and privacy. 
But I decided to go with the second one (against Claude's recommendation), mainly because I wanted to have fewer moving parts.\n\nPulling via IMAP was the simplest option: just need to have some sort of a cron job.\n\nSo at this point the solution shape is very clear, and I just told Claude \"let's build it\" and stepped away for a few minutes.\n\n## Note on dev process\n\nThis has nothing to do with an email agent, but I want to share this anyway because someone might find this approach useful too.\n\nBoth Claude and Codex have a tendency to encourage you to stay in the thread and jump into building right away; I'm not sure why, perhaps engagement metrics look better this way. But I'm finding that this doesn't translate into the best or fastest outcomes.\n\nAn approach that I'm finding more useful is to iterate on a working / design markdown doc before shipping any code. You have to actively push it to do so: write an explicit instruction in `AGENTS.md` and also regularly tell it to write a working doc first. I keep them under `.agents/work` in the repo, and move them to `/done` with final notes when done. Bigger pieces sometimes benefit from 2 or more levels: make a doc in `/design` first that's only about system design, iterate on it for a while, and then extract one or more working docs from it.\n\nThis agent though was small enough to fit into just one working doc. 
Here's the [repo](https://github.com/diggerhq/demo-agent-triage) btw.\n\n## Keeping secrets secure\n\nWhen Claude finished building, we ended up with just 2 moving parts:\n\n- A Cloudflare worker that held the main API, triggered on cron\n- An OpenComputer sandbox that clones the repo and runs Claude Code\n\nOpenComputer comes with handy APIs that allow you to run Claude or any other coding agent without having to deal with custom images or write code to pull it into the box, like this:\n\n``` js\nconst sandbox = await Sandbox.create({ timeout: 600 });\nawait sandbox.agent.start({\n  systemPrompt: TRIAGE_PROMPT, prompt\n});\n```\n\nBut there's one caveat: we cannot just run Claude and provide `ANTHROPIC_API_KEY` as an env var. It would work, but no matter how careful you are, there's always a possibility of a prompt injection attack and the key can get exfiltrated. And this agent's whole job is reading emails from strangers who are already trying to game it.\n\nOpenComputer solves it with SecretStores: your sensitive keys get replaced in flight, so that the agent never sees the actual values. You configure a secret store like this (once, at build time):\n\n``` js\nconst store = await SecretStore.create({\n  name: \"triage\",\n  egressAllowlist: [\"api.anthropic.com\"],\n});\n\nawait SecretStore.setSecret(store.id, \"ANTHROPIC_API_KEY\", key, {\n  allowedHosts: [\"api.anthropic.com\"],\n});\n```\n\nAnd then use it at runtime to create your sandboxes like this:\n\n``` js\nconst sandbox = await Sandbox.create({\n  secretStore: \"triage\",\n  timeout: 600,\n});\n```\n\n## Who should send the email?\n\nThe first version I shipped simply instructed the agent to \"send the email via Resend\". I just put the Resend key into a secret store and thought that's secure enough.\n\nBut I failed to account for LLM creativity. I tested it on the OpenComputer repo; so Claude used the GitHub API to get maintainers' info and sent the emails to a completely different address! 
To be fair, I also forgot to add the correct email address to the prompt. But still, this shouldn't have happened.\n\nThe solution was to move Resend API calls outside of the agent's view, to the API. The agent simply reports back from inside the OpenComputer sandbox via curl:\n\n```\ncurl -sX POST \"$CALLBACK_URL/report\" \\\n    -H \"X-Run-Id: $RUN_ID\" \\\n    -d @findings.json\n```\n\nAnd then the worker sends the email:\n\n``` js\napp.post(\"/report\", async (c) => {\n  const findings = await c.req.json();\n  const to = recipientFor(c.req.header(\"X-Run-Id\"));   // we choose the recipient, not the agent\n\n  await resend.emails.send({\n    from: \"triage@alerts.opencomputer.dev\",\n    to,\n    subject: `Triage: ${findings.subject}`,\n    text: findings.draft_reply,\n  });\n\n  return c.json({ ok: true });\n});\n```\n\nThis way the agent is never given control over the recipient list; it can only say \"report my findings back\", and the decision on whom to route it to is in good old code.\n\n## A valid-by-mistake report\n\nThe funniest thing that actually happened: an LLM-generated report that was meant to be obviously bogus got flagged as valid, and for a good reason!\n\nClaude came up with roughly the following text for a \"bogus\" report:\n\n```\nFrom: alex.sec.research@gmail.com\nSubject: [CRITICAL] Remote Code Execution in OpenComputer Sandbox API (CVSS 9.8)\n\nHello Security Team,\n\nDuring authorized research I discovered a critical Remote Code Execution (RCE)\nvulnerability. 
The sandbox exec endpoint does not sanitize user input before\npassing it to the system shell, allowing arbitrary command execution.\n\nProof of Concept:\n  POST /v1/sandboxes/{id}/exec  {\"cmd\": \"ls; cat /etc/passwd\"}\n  -> the response includes the contents of /etc/passwd\n\nImpact: full server compromise, data exfiltration, and lateral movement across\nyour infrastructure.\nSeverity: Critical (CVSS 9.8 / AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H).\n\nPlease confirm this issue and advise on the bounty reward per your program.\n\nBest regards,\nAlex\n```\n\nAt the time I was using the OpenComputer repo for testing the agent; so Claude tried to come up with an obviously bogus vulnerability: remote execution of arbitrary code. That's what sandboxes are for!\n\nHowever, Claude-the-reviewer took it very seriously, and decided to flag it as an actual remote code execution vulnerability. Because, technically, yes: you can run any code in a sandbox, even though sandboxes are meant for that.\n\nI didn't bother to fix it, just switched the test repo from OpenComputer to the code of the agent itself.\n\n## Conclusion\n\nIt is surprisingly easy and fun to build ultra-niche agents for yourself. 
Hope you enjoyed the story!", "url": "https://wpnews.pro/news/i-built-an-email-agent-to-triage-bogus-security-reports", "canonical_source": "https://opencomputer.dev/blog/email-security-triage-agent/", "published_at": "2026-06-06 00:24:38+00:00", "updated_at": "2026-06-06 00:47:14.885229+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-infrastructure", "generative-ai", "ai-products"], "entities": ["Igor Zalutski", "OpenComputer", "Claude", "demo-agent-triage repo", "Diggerhq"], "alternates": {"html": "https://wpnews.pro/news/i-built-an-email-agent-to-triage-bogus-security-reports", "markdown": "https://wpnews.pro/news/i-built-an-email-agent-to-triage-bogus-security-reports.md", "text": "https://wpnews.pro/news/i-built-an-email-agent-to-triage-bogus-security-reports.txt", "jsonld": "https://wpnews.pro/news/i-built-an-email-agent-to-triage-bogus-security-reports.jsonld"}}