Claude Code models (Opus 4.5, 4.6, 4.7) systematically ignore CLAUDE.md instructions to research before executing infrastructure commands. This has been documented across multiple issues:
- #72651: Model ignores CLAUDE.md behavioral instructions
- #59515: Model skips research before infrastructure actions
- #28469: Regression from Opus 4.5 to 4.6
- #41217: Systematic failure to follow explicit behavioral constraints
Prompt-level enforcement (CLAUDE.md rules, memory files, repeated instructions) fails because enforcement lives in the model's context window, where the model can override it. The model acknowledges the rule, then ignores it.
The fix is three hooks that move enforcement to the harness layer, where the model cannot override it:
- research-gate.sh (PreToolUse on Bash): Blocks infrastructure commands (gcloud deploy, aws ec2, kubectl apply, terraform, etc.) unless the model has performed at least 2 research actions in the current session.
- research-logger.sh (PostToolUse on WebSearch|WebFetch|Read): Logs each research tool call to a session ledger file.
- research-session-reset.sh (SessionStart): Clears the research ledger so each session starts at zero.
The model must earn the right to run infrastructure commands by researching first.
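The interaction between the three hooks can be sketched in a few lines of Python. This is a minimal illustration of the ledger idea, not the contents of the actual scripts: the function names, the one-JSON-line-per-call ledger format, and the ledger location are all assumptions. It also assumes the documented Claude Code convention that a PreToolUse hook exiting with code 2 blocks the tool call and returns stderr to the model.

```python
import json
import sys
import tempfile
from pathlib import Path

MIN_RESEARCH = 2  # threshold named in the source; everything else here is hypothetical


def log_research(ledger: Path, tool_name: str) -> None:
    """PostToolUse step: append one line per research tool call."""
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"tool": tool_name}) + "\n")


def reset_ledger(ledger: Path) -> None:
    """SessionStart step: truncate the ledger so the count restarts at zero."""
    ledger.write_text("", encoding="utf-8")


def research_count(ledger: Path) -> int:
    """Count the non-empty lines in the ledger (0 if it does not exist yet)."""
    if not ledger.exists():
        return 0
    lines = ledger.read_text(encoding="utf-8").splitlines()
    return sum(1 for line in lines if line.strip())


def gate(ledger: Path, is_infra: bool) -> int:
    """PreToolUse step: return the hook exit code (0 = allow, 2 = block)."""
    count = research_count(ledger)
    if is_infra and count < MIN_RESEARCH:
        print(
            f"Blocked: {count}/{MIN_RESEARCH} research actions this session.",
            file=sys.stderr,
        )
        return 2
    return 0


if __name__ == "__main__":
    # Throwaway ledger in a temp directory, purely for demonstration.
    ledger = Path(tempfile.mkdtemp()) / "research-ledger.jsonl"
    reset_ledger(ledger)
    print("before research:", gate(ledger, is_infra=True))
    log_research(ledger, "WebSearch")
    log_research(ledger, "Read")
    print("after research:", gate(ledger, is_infra=True))
```

Because the count lives in a file the harness writes, the model cannot raise it by claiming to have done research; only real WebSearch, WebFetch, or Read calls add lines.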
The gate blocks infrastructure commands that modify state:
- gcloud run deploy, gcloud compute instances create, gcloud container clusters create
- aws ec2 run-instances, aws ecs create-service, aws lambda create-function
- kubectl apply, kubectl create, kubectl delete
- terraform apply, terraform destroy
- docker push, docker build
- bash deploy*.sh, python deploy*.py, python launch*.py
Read-only and safe commands pass through:
- gcloud storage ls, gcloud auth, gcloud config
- aws s3 ls, aws s3 cp, aws sts
- gh issue, gh pr
- docker ps, docker images, docker logs
- nvidia-smi
- Any non-infrastructure bash command (ls, grep, cat, etc.)
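One way to express the two lists above as regex matching is sketched here in Python. The patterns are hypothetical and cover only the listed commands; the actual infra_patterns and safe_patterns in research-gate.sh may be written differently. Checking the safe list first is also an assumption, based on the description that safe patterns always pass.

```python
import re

# Hypothetical patterns for the blocked commands listed above.
INFRA_PATTERNS = [
    r"\bgcloud\s+run\s+deploy\b",
    r"\bgcloud\s+(compute\s+instances|container\s+clusters)\s+create\b",
    r"\baws\s+(ec2\s+run-instances|ecs\s+create-service|lambda\s+create-function)\b",
    r"\bkubectl\s+(apply|create|delete)\b",
    r"\bterraform\s+(apply|destroy)\b",
    r"\bdocker\s+(push|build)\b",
    # Loose match: also catches paths such as scripts/deploy_prod.sh.
    r"\bbash\s+\S*deploy\S*\.sh\b",
    r"\bpython3?\s+\S*(deploy|launch)\S*\.py\b",
]

# Hypothetical patterns for the pass-through commands listed above.
SAFE_PATTERNS = [
    r"^\s*gcloud\s+(storage\s+ls|auth|config)\b",
    r"^\s*aws\s+(s3\s+(ls|cp)|sts)\b",
    r"^\s*gh\s+(issue|pr)\b",
    r"^\s*docker\s+(ps|images|logs)\b",
    r"^\s*nvidia-smi\b",
]


def classify(command: str) -> str:
    """Return 'safe', 'infra', or 'other' for a shell command string."""
    # Assumption: safe patterns win, so they are tested before infra patterns.
    if any(re.search(p, command) for p in SAFE_PATTERNS):
        return "safe"
    if any(re.search(p, command) for p in INFRA_PATTERNS):
        return "infra"
    return "other"


if __name__ == "__main__":
    for cmd in ["kubectl apply -f app.yaml", "aws s3 ls", "ls -la"]:
        print(f"{cmd!r} -> {classify(cmd)}")
```

A known limitation of this sketch: a chained command such as a safe command followed by `&& terraform apply` is classified by its safe prefix. A stricter gate would split on shell operators and classify each segment.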
mkdir -p ~/.claude/hooks ~/.claude/receipts
chmod +x ~/.claude/hooks/research-gate.sh
chmod +x ~/.claude/hooks/research-logger.sh
chmod +x ~/.claude/hooks/research-session-reset.sh
Add these entries to the hooks object:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/research-gate.sh"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "WebSearch|WebFetch|Read",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/research-logger.sh"
          }
        ]
      }
    ],
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/research-session-reset.sh"
          }
        ]
      }
    ]
  }
}
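If the settings file already defines other hooks, pasting the block above over them would drop those entries. The following Python sketch shows one possible way to merge the entries instead. The helper name merge_hooks is hypothetical, and the sketch works on in-memory dictionaries only; reading and writing the actual settings file is left out because its location depends on the installation.

```python
import copy
import json


def _command_hook(path: str) -> dict:
    """Build one command-type hook entry, matching the JSON above."""
    return {"type": "command", "command": path}


# The three entries from the configuration above.
NEW_HOOKS = {
    "PreToolUse": [
        {"matcher": "Bash",
         "hooks": [_command_hook("~/.claude/hooks/research-gate.sh")]},
    ],
    "PostToolUse": [
        {"matcher": "WebSearch|WebFetch|Read",
         "hooks": [_command_hook("~/.claude/hooks/research-logger.sh")]},
    ],
    "SessionStart": [
        {"hooks": [_command_hook("~/.claude/hooks/research-session-reset.sh")]},
    ],
}


def merge_hooks(settings: dict, new_hooks: dict) -> dict:
    """Return a copy of settings with new_hooks appended per event.

    Existing entries are kept, and an entry that is already present is
    not added twice, so running the merge repeatedly is harmless.
    """
    merged = copy.deepcopy(settings)
    hooks = merged.setdefault("hooks", {})
    for event, entries in new_hooks.items():
        existing = hooks.setdefault(event, [])
        for entry in entries:
            if entry not in existing:
                existing.append(copy.deepcopy(entry))
    return merged


if __name__ == "__main__":
    # Hypothetical pre-existing settings with one unrelated hook.
    current = {"hooks": {"PreToolUse": [
        {"matcher": "Write", "hooks": [_command_hook("~/lint.sh")]}]}}
    print(json.dumps(merge_hooks(current, NEW_HOOKS), indent=2))
```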
Set CC_RESEARCH_BYPASS=1 as an environment variable to allow a single infrastructure command without research. The bypass is logged to ~/.claude/receipts/research-gate-YYYY-MM-DD.jsonl for audit.
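The audit trail could be written roughly as follows. This Python sketch only illustrates the dated JSONL receipt described above. The record fields (ts, event, command) and the function names are assumptions, and the sketch does not show how the bypass is limited to a single command.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def bypass_requested(env: dict) -> bool:
    """True when CC_RESEARCH_BYPASS is set to exactly "1"."""
    return env.get("CC_RESEARCH_BYPASS") == "1"


def receipt_path(receipts_dir: Path, now: datetime) -> Path:
    """Build the per-day file name research-gate-YYYY-MM-DD.jsonl."""
    return receipts_dir / f"research-gate-{now:%Y-%m-%d}.jsonl"


def log_bypass(receipts_dir: Path, command: str, now: datetime) -> Path:
    """Append one JSON line describing the bypassed command."""
    receipts_dir.mkdir(parents=True, exist_ok=True)
    path = receipt_path(receipts_dir, now)
    # Hypothetical record layout; the real receipts may carry other fields.
    record = {"ts": now.isoformat(), "event": "bypass", "command": command}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path


if __name__ == "__main__":
    # Temp directory stands in for ~/.claude/receipts in this demo.
    demo_dir = Path(tempfile.mkdtemp())
    if bypass_requested({"CC_RESEARCH_BYPASS": "1"}):
        written = log_bypass(demo_dir, "terraform apply", datetime.now(timezone.utc))
        print(written.read_text(encoding="utf-8"))
```

Appending one line per event keeps each day's receipt file easy to grep and makes every bypass leave a record.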
Requirements:
- python3 (for regex pattern matching)
- jq (optional, used in some patterns)
Edit research-gate.sh to customize:
- MIN_RESEARCH=2: minimum research actions required (default: 2)
- infra_patterns: list of regex patterns that trigger the gate
- safe_patterns: list of regex patterns that always pass
"Prose rules cannot be the load-bearing element of architectural guarantees. The load-bearing element has to live in the harness at the point of action."
This is the same principle behind seatbelt interlocks, type systems, and database constraints. If the model can choose to ignore a rule, it will. The enforcement must be external to the model.
If you have infrastructure patterns to add, false positives to report, or improvements to the matching logic, please comment. The goal is a community-maintained hook package that compensates for the model's inability to follow its own instructions.
MIT. Use it, fork it, improve it.