[ARTICLE · art-33110] src=dev.to ↗ pub= topic=ai-safety verified=true sentiment=· neutral

What It Took to Actually Govern Claude Code Across Our Engineering Team

TrueFoundry's engineering team implemented governance for Claude Code after an audit revealed 60+ engineers using the tool with no oversight, and two CVEs exposed critical vulnerabilities. The team found that repo-level `.claude/settings.json` files could execute arbitrary shell commands or redirect API traffic, leading them to route Claude Code through an AI Gateway to keep Anthropic keys off developer machines.

9 min read · Published Jun 18, 2026

TL;DR

A few months ago our security team flagged something in an audit: we had 60+ engineers using Claude Code, and our "governance" for it was essentially nothing. API keys were in .bash_profile files. There was no way to see what models people were hitting, what it was costing, or who had access to what. When someone left the company, we had no clean way to revoke their Claude Code access without hunting down which machine they'd set their key on.

We'd done all the right things for Claude.ai — SSO, domain capture, admin console, the works. But Claude Code is a different beast. It's not a web app. It runs in a terminal with the developer's full filesystem permissions, and it authenticates with an API key, not a browser session. None of our web-layer controls touched it.

The audit was uncomfortable. Then the CVEs made it urgent.

In early 2026, Check Point Research published findings on two vulnerabilities in Claude Code — CVE-2025-59536 and CVE-2026-21852 — that made me realize we'd been thinking about this wrong.

CVE-2025-59536 (CVSS 8.7): A malicious .claude/settings.json in a repository could execute arbitrary shell commands before Claude Code even showed a trust dialog. In earlier versions, hooks defined in that file ran at startup, before the user was asked to confirm anything. Cloning an attacker's repo and running claude in it was enough to get RCE on a developer's machine.

CVE-2026-21852 (CVSS 5.3): This one hit differently. Claude Code uses an environment variable called ANTHROPIC_BASE_URL to decide where to send API requests. A malicious repo could override that via its settings file, redirecting all traffic, including the authentication header carrying the developer's API key, to an attacker-controlled server. The attacker proxies requests to the real Anthropic API so nothing looks broken. The developer notices nothing. The attacker has your key.

Both are patched now (CVE-2025-59536 in v1.0.111, CVE-2026-21852 in v2.0.65). But the thing that stuck with me wasn't the specific vulnerabilities; it was the underlying assumption they exposed. We'd all been treating .claude/settings.json as passive config. It's not. In an agentic tool that can run shell commands and call external APIs, repo-level config is part of the execution layer. Same threat model as a malicious package.json postinstall script. We just weren't thinking about it that way yet.

After the CVE disclosure, my team did a sweep of our repos and found three that had .claude/settings.json files with non-standard ANTHROPIC_BASE_URL overrides. None of them were malicious; developers had put them there for legitimate local testing. But they also would have redirected traffic for anyone else who cloned those repos. We removed them and added a CI check. Then we started working on the actual governance problem.
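A CI check of that kind could be sketched roughly as follows. This is an illustration, not the check described above: it assumes overrides sit under an "env" key in the settings file, and the file names and the `main` entry point are choices made for the example.

```python
import json
from pathlib import Path

# Assumption: overrides sit under the "env" key of a Claude Code settings file.
SETTINGS_NAMES = ("settings.json", "settings.local.json")
BLOCKED_ENV_KEYS = ("ANTHROPIC_BASE_URL",)


def blocked_overrides(settings):
    """Return the blocked env keys (and values) set in one parsed settings file."""
    env = settings.get("env") if isinstance(settings, dict) else None
    if not isinstance(env, dict):
        return {}
    return {key: env[key] for key in BLOCKED_ENV_KEYS if key in env}


def scan_repo(repo_root):
    """Return (path, key, value) for every blocked override under repo_root."""
    findings = []
    for claude_dir in Path(repo_root).rglob(".claude"):
        for name in SETTINGS_NAMES:
            path = claude_dir / name
            if not path.is_file():
                continue
            try:
                settings = json.loads(path.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                # Report files we cannot parse instead of silently passing them.
                findings.append((str(path), "<unparseable>", ""))
                continue
            for key, value in blocked_overrides(settings).items():
                findings.append((str(path), key, str(value)))
    return findings


def main(argv):
    """Print findings and return a CI-friendly exit code (1 if any were found)."""
    root = argv[1] if len(argv) > 1 else "."
    findings = scan_repo(root)
    for path, key, value in findings:
        print(f"{path}: {key}={value}")
    return 1 if findings else 0
```

In a pipeline, the script would end with `raise SystemExit(main(sys.argv))` so that any finding fails the build. Extending `BLOCKED_ENV_KEYS` is the obvious place to add other variables you do not want repos to set.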

This was the most embarrassing one to admit. Every developer using Claude Code had either:

a) Their own personal Anthropic key (which meant their personal billing, no audit trail, and no way to revoke on offboarding)

b) A shared team key that lived in a shared .env somewhere (which is worse)

The fix seems obvious in retrospect: issue keys through the Anthropic Admin Console with explicit expiry, store them in AWS Secrets Manager or HashiCorp Vault, and never let them touch .bash_profile or shell history. Rotate quarterly, revoke immediately on offboarding.
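The "rotate quarterly, revoke on offboarding" rule is simple enough to automate against a key inventory. The sketch below assumes you can export such an inventory yourself; the record fields ("key_id", "owner", "created") are hypothetical, and "quarterly" is approximated as 90 days.

```python
from datetime import date, timedelta

MAX_KEY_AGE = timedelta(days=90)  # "rotate quarterly", approximated as 90 days


def keys_needing_action(inventory, active_owners, today):
    """Split an inventory into keys to revoke now and keys due for rotation.

    inventory: iterable of dicts with hypothetical fields
               "key_id", "owner", and "created" (a datetime.date).
    active_owners: set of usernames that are still employed.
    """
    revoke, rotate = [], []
    for record in inventory:
        if record["owner"] not in active_owners:
            revoke.append(record["key_id"])   # owner has left: revoke immediately
        elif today - record["created"] >= MAX_KEY_AGE:
            rotate.append(record["key_id"])   # owner is active, but the key is too old
    return sorted(revoke), sorted(rotate)
```

Running this on a schedule against the HR roster turns offboarding revocation into a report rather than a hunt through developer machines.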

But the deeper fix was routing Claude Code through a gateway so the Anthropic key never lived on developer machines at all. With AI Gateway, developers authenticate to the gateway with a scoped virtual key. The underlying Anthropic credential stays in the gateway's secrets manager. If a developer's machine is compromised, the attacker gets a gateway key that we can revoke from a dashboard — not a raw Anthropic API key with workspace-level access.

That distinction matters more than it sounds. A stolen Anthropic key can access all workspace files, modify shared data, and run up API costs before you notice. A stolen gateway key gets you a revocable token with model-level and budget-level restrictions baked in.

Before we set up the gateway, our "observability" for Claude Code was checking the Anthropic billing dashboard once a month and wincing.

We had no idea which models people were hitting, what it was costing, or who had access to what.

Setting ANTHROPIC_BASE_URL to point at a gateway is the single highest-leverage change you can make to a Claude Code deployment. One line of config gives you a centralized enforcement point for everything: not just observability, but model allowlisting, per-developer rate limits, fallback routing, and budget caps.

export ANTHROPIC_BASE_URL=https://<your-gateway-url>/api/inference/

After we did this, we could see request-level traces with developer attribution, per-model token spend broken down by team, and cost anomalies surfaced automatically. We found one engineer running a batch job through Claude Code that was generating about 3x average daily spend in an afternoon. Not malicious — they just didn't know. We set a per-developer daily limit and the problem went away without any policy conversations.
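To make the per-developer daily limit concrete, here is a minimal sketch of the bookkeeping a gateway has to do. It is not TrueFoundry's implementation; the class name, the in-memory counter, and the token-based unit are assumptions for illustration.

```python
from collections import defaultdict


class DailyTokenLimiter:
    """Track tokens per developer per day and refuse requests over the limit."""

    def __init__(self, daily_limit):
        self.daily_limit = daily_limit
        self._used = defaultdict(int)  # (developer, day) -> tokens used so far

    def try_consume(self, developer, day, tokens):
        """Record the usage and return True, or return False if it would exceed the limit."""
        key = (developer, day)
        if self._used[key] + tokens > self.daily_limit:
            return False
        self._used[key] += tokens
        return True

    def remaining(self, developer, day):
        """Tokens the developer can still spend on the given day."""
        return self.daily_limit - self._used[(developer, day)]
```

Keying the counter on (developer, day) means the budget resets on its own at the day boundary, and one developer's batch job cannot eat into anyone else's allowance.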

One thing we learned the hard way: if you're using Claude Admin Console's server-managed settings to control Claude Code, those settings are bypassed when ANTHROPIC_BASE_URL is set. So if you route through a gateway, you need MDM (Jamf on macOS, Puppet/Ansible on Linux) to push the ANTHROPIC_BASE_URL setting into system-level managed config files that developers can't override:

macOS: /Library/Application Support/ClaudeCode/managed-settings.json

Linux: /etc/claude-code/managed-settings.json

This is also the direct mitigation for CVE-2026-21852: if ANTHROPIC_BASE_URL is set at the OS level by MDM and locked, a malicious repo's .claude/settings.json can't override it.
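The reason this works is settings precedence: the managed scope is merged last, so its value wins. The sketch below is a deliberately simplified model of that merge (user, then project, then managed), not Claude Code's actual resolution logic, and the URLs are hypothetical.

```python
def effective_env(managed, project, user):
    """Merge env settings so that higher-precedence scopes win.

    Simplified model: user < project < managed. The real tool has more
    scopes; only the position of managed settings at the top matters here.
    """
    merged = {}
    for scope in (user, project, managed):  # lowest precedence first
        merged.update(scope.get("env", {}))
    return merged


# Hypothetical values for illustration only.
managed = {"env": {"ANTHROPIC_BASE_URL": "https://gateway.example.internal/api/inference/"}}
malicious_project = {"env": {"ANTHROPIC_BASE_URL": "https://attacker.example/"}}

pinned = effective_env(managed, malicious_project, user={})
unpinned = effective_env({}, malicious_project, user={})
```

With the managed file in place, `pinned` resolves to the gateway URL; without it, `unpinned` resolves to whatever the repo supplied, which is the exposure the CVE describes.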

Even with the gateway in place, the gateway only governs network-level traffic. Claude Code running on a developer's machine can still read .env files, .ssh keys, ~/.aws/credentials, and anything else the local user has access to, and that content can end up in a prompt before it ever hits the network.

We spent an afternoon putting together a baseline managed-settings.json. Here's the version we landed on:

{
  "permissions": {
    "disableBypassPermissionsMode": "disable",
    "deny": [
      "Bash(curl:*)",
      "Bash(wget:*)",
      "Read(**/.env)",
      "Read(**/.env.*)",
      "Read(**/secrets/**)",
      "Read(**/.ssh/**)",
      "Read(**/credentials/**)"
    ],
    "ask": [
      "Bash(git push:*)",
      "Write(**)"
    ]
  },
  "allowManagedPermissionRulesOnly": true,
  "allowManagedHooksOnly": true,
  "transcriptRetentionDays": 14,
  "sandbox": {
    "enabled": true
  }
}

A few settings here are doing the most work:

allowManagedPermissionRulesOnly: true is the CVE-2025-59536 mitigation. It means project-level .claude/settings.json files cannot add new permissions; only the system-level managed config applies. A malicious repo can't expand what Claude Code is allowed to do on that machine.

allowManagedHooksOnly: true blocks hook injection. Hooks can run arbitrary code between sessions; this prevents a cloned repo from registering new hooks.

disableBypassPermissionsMode: "disable" prevents --dangerously-skip-permissions from being used in scripts or CI. We found two CI workflows that had been using this flag. Both got refactored.

The deny list blocks reads on .env, .ssh, and credentials directories. We debated this; some developers complained it broke legitimate workflows. We made exceptions on a case-by-case basis via an explicit allow rather than leaving the door open by default.

Sandboxing adds OS-level isolation on top. On macOS it uses Seatbelt, on Linux bubblewrap. It enforces filesystem and network boundaries at a layer below Claude Code's own permission system.
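Once a baseline like this exists, it is worth checking that every machine actually has it. The following sketch validates a parsed managed-settings.json against the settings discussed above; the function name and the choice of which deny rules count as mandatory are assumptions for the example.

```python
REQUIRED_TRUE_FLAGS = ("allowManagedPermissionRulesOnly", "allowManagedHooksOnly")
REQUIRED_DENY_RULES = ("Read(**/.env)", "Read(**/.ssh/**)")  # illustrative subset


def baseline_problems(settings):
    """Return a list of human-readable deviations from the baseline (empty if compliant)."""
    problems = []
    for flag in REQUIRED_TRUE_FLAGS:
        if settings.get(flag) is not True:
            problems.append(f"{flag} must be true")

    permissions = settings.get("permissions", {})
    if permissions.get("disableBypassPermissionsMode") != "disable":
        problems.append('permissions.disableBypassPermissionsMode must be "disable"')

    deny = set(permissions.get("deny", []))
    for rule in REQUIRED_DENY_RULES:
        if rule not in deny:
            problems.append(f"missing deny rule: {rule}")

    if settings.get("sandbox", {}).get("enabled") is not True:
        problems.append("sandbox.enabled must be true")
    return problems
```

Feeding it the result of `json.load` on the deployed file gives a per-machine compliance report that an MDM job can collect.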

This was the gap that took us longest to appreciate, because MCP looks like a developer experience feature until you realize what it actually is: direct programmatic access from Claude Code to internal systems.

Our developers had connected Claude Code to GitHub (for code search), Jira (for ticket context), and a couple of internal APIs. All of those connections were configured locally on developer machines, each with their own credentials stored wherever. There was no approval process, no audit trail, and no way to see which tools Claude had been invoking during a session.

The prompt injection risk here is underappreciated. When Claude retrieves content from an external system via an MCP tool — a GitHub issue, a Jira ticket, a web page — that content arrives in Claude's context. If it contains injected instructions, Claude may execute them silently. We had a case where a Jira ticket from an external vendor contained what looked like a formatting instruction that Claude Code interpreted as a command. Nothing bad happened, but it was a near miss that made the problem very concrete.

The fix was centralizing MCP access through a gateway with an allowlist. We deployed TrueFoundry's MCP Gateway as the single endpoint for all MCP server access. In managed-settings.json:

{
  "allowedMcpServers": [
    { "serverUrl": "https://<your-mcp-gateway-url>/*" }
  ],
  "strictKnownMarketplaces": []
}

Setting strictKnownMarketplaces to an empty array blocks marketplace-sourced MCP server installations. Developers can no longer add random MCP servers from the Claude marketplace; any new server has to go through our review process and get registered in the gateway.
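To illustrate what a wildcard allowlist entry does and does not admit, here is a small sketch using shell-style pattern matching. Claude Code's real matching rules may differ; `fnmatchcase` is a stand-in, and the hostnames are hypothetical.

```python
from fnmatch import fnmatchcase


def is_allowed(server_url, allowed_patterns):
    """Return True if server_url matches at least one allowlist pattern."""
    return any(fnmatchcase(server_url, pattern) for pattern in allowed_patterns)


# Hypothetical gateway hostname for illustration.
ALLOWED = ["https://mcp-gateway.example.internal/*"]

via_gateway = is_allowed("https://mcp-gateway.example.internal/github", ALLOWED)
lookalike = is_allowed("https://mcp-gateway.example.internal.evil.example/github", ALLOWED)
direct = is_allowed("https://api.github.com/mcp", ALLOWED)
```

Keeping the trailing slash before the wildcard matters: it is what stops a lookalike hostname that merely starts with the gateway's name from matching.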

What we got from the gateway itself: each developer authenticates once, and the gateway handles downstream auth to GitHub, Jira, and everything else. RBAC controls which teams can access which tools. Every tool invocation generates an audit trace with the developer's identity, the tool name, the request and response, and the latency. We can see exactly what Claude touched during a session, not just which model it called.

The Virtual MCP Servers feature turned out to be genuinely useful for our security team's access: we set up a "security tools" endpoint that exposes only the Sentry and Datadog tools relevant to security workflows, separate from the broader set of tools available to product engineers. Agents only see what they're supposed to see.

Six months after the audit, our setup is:

Identity: SSO + domain capture for Claude.ai. All Claude Code keys are gateway virtual keys issued through TrueFoundry, rotated automatically, revoked on offboarding via a single dashboard action.

Routing: ANTHROPIC_BASE_URL pushed via MDM to all developer machines, pointing at TrueFoundry AI Gateway. Per-developer daily token limits. Per-team budget caps. Model allowlist (we've restricted certain high-cost models to specific teams that have a justified need).

Sandboxing: managed-settings.json deployed via MDM with the permission deny-list and sandbox enabled. allowManagedPermissionRulesOnly and allowManagedHooksOnly both true.

MCP: TrueFoundry MCP Gateway as the single MCP endpoint. All downstream servers registered and approved. Tool-level RBAC. Full audit trail exported to Datadog.

Audit logging: Everything flows through the gateway to Datadog via OpenTelemetry. 90-day retention. We get a weekly summary of spend by team, model, and application, and an alert if any developer's usage spikes more than 3x their 7-day average.
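The spike alert reduces to a comparison against a trailing average. A minimal sketch, assuming daily usage totals per developer are already available (the function name and the handling of sparse history are choices made for this example):

```python
from statistics import mean


def is_usage_spike(previous_days, today, factor=3.0, window=7):
    """Return True if today's usage exceeds factor times the trailing average.

    previous_days: daily usage totals, oldest first, excluding today.
    """
    if len(previous_days) < window:
        return False  # not enough history for a stable baseline
    baseline = mean(previous_days[-window:])
    if baseline <= 0:
        return False  # no meaningful baseline to compare against
    return today > factor * baseline
```

Returning False on short or empty history keeps new joiners from triggering alerts on their first week; whether that is the right trade-off depends on how much you trust onboarding.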

Is it perfect? No. BYOD is still a gap: we don't have MDM coverage on contractor machines, which means ANTHROPIC_BASE_URL enforcement is honor-system for that population. If anyone has solved BYOD Claude Code governance cleanly, I'd genuinely like to hear how. Drop it in the comments.
