I Made My AI Rules Self-Verifiable — Here's How


Source: dev.to · Published 2026-06-30T17:44Z

A developer built a self-verifying rule system for an AI assistant after discovering that one of their 15 rules, "think before acting," had been followed zero times in 30 sessions. The system embeds markers like [✓THINK] in the rule text, which can be counted with grep on transcripts, turning vague requirements into verifiable constraints. A 350-line Python script, config-health.py, runs as a Claude Code Stop hook to track rule compliance and surfaces pending verifications in a file that is read on the next startup.

Published Jun 30, 2026 · 3 min read

The problem wasn't that my rules were bad. The problem was I had no idea if they were being followed.

I had 15 rules for my AI assistant. "Think before acting." "Check context before asking." "Capture learning after complex tasks." Good rules. Important rules.

After 30 sessions, I finally checked: my "think before acting" rule had been followed exactly zero times.

Not because the AI was ignoring it. Because the rule was a wish — not a verifiable constraint. It had no teeth.

Test-Driven Development taught us a hard lesson: requirements without assertions aren't requirements. A test that always passes is useless. But a requirement without a test is just a wish.

Every software team learned this. CI pipelines enforce it. Linters check it. But AI configuration rules? We write them like sticky notes on a monitor — good intentions with no verification mechanism.

I realized my rules were all wishes. Every single one.

Here's the before and after:

Before: Think before acting on any request.

After: Think before acting → mark [✓THINK] when done.

That [✓THINK] isn't decoration. It does two things simultaneously. First, it gives the AI a concrete thing to output when the step is done. Second, grep -c '\[✓THINK\]' transcript.jsonl becomes a health metric. No AI required.

The marker is the assertion. The grep is the test runner. And the rule text itself is the specification: self-contained, self-verifying.
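For readers who would rather run the count from Python than from the shell, here is a minimal sketch of the same check. It is not taken from config-health.py. The function name `count_marker` is hypothetical, and it assumes the transcript is a JSONL file that stores the marker either literally or as the JSON escape `\u2713`. Note that `grep -c` reports matching lines rather than total occurrences, so the sketch returns both numbers.

```python
import re

# Matches the marker whether the check mark is stored literally
# or as the JSON escape sequence \u2713 (an assumption about the transcript).
THINK = re.compile(r"\[(?:✓|\\u2713)THINK\]")


def count_marker(transcript_path: str, pattern: re.Pattern = THINK) -> tuple[int, int]:
    """Return (matching_lines, total_occurrences) for one transcript file."""
    matching_lines = 0
    occurrences = 0
    with open(transcript_path, encoding="utf-8") as fh:
        for line in fh:
            hits = len(pattern.findall(line))
            if hits:
                matching_lines += 1  # what `grep -c` would report
                occurrences += hits  # every marker, even several on one line
    return matching_lines, occurrences
```

Calling `count_marker("transcript.jsonl")` gives the line count that `grep -c` would print as the first value and the raw number of markers as the second.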

The whole system boils down to three lines of logic:

Rule text     → contains its own success criteria ([✓MARKER])
Transcript    → machine-readable evidence (regex-countable)
Health script → mechanical verification (no AI inference)
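Those three layers can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's script: the helper names `markers_in_rules` and `verify` are hypothetical, and pairing the "Check context before asking" rule with [✓CONTEXT] is a guess made for the example.

```python
import re

# A marker is a check mark plus an upper-case name inside square brackets.
MARKER_RE = re.compile(r"\[✓([A-Z_]+)\]")


def markers_in_rules(rules_text: str) -> list[str]:
    """Collect each distinct marker name declared in the rule text, in order."""
    seen: list[str] = []
    for name in MARKER_RE.findall(rules_text):
        if name not in seen:
            seen.append(name)
    return seen


def verify(rules_text: str, transcript_text: str) -> dict[str, int]:
    """Map every declared marker to how often it appears in the transcript."""
    counts = {name: 0 for name in markers_in_rules(rules_text)}
    for name in MARKER_RE.findall(transcript_text):
        if name in counts:
            counts[name] += 1
    return counts


# Rule text carries its own success criteria (second pairing is assumed).
rules = (
    "Think before acting → mark [✓THINK] when done.\n"
    "Check context before asking → mark [✓CONTEXT] when done."
)
# Stand-in for transcript content.
transcript = "[✓THINK] plan: read the file first. Later: [✓THINK] re-check."

print(verify(rules, transcript))  # {'THINK': 2, 'CONTEXT': 0}
```

A zero count is the interesting output: it names a rule that was declared but never evidenced.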

I wrote config-health.py: 350 lines of Python, zero dependencies, stdlib only. It runs as a Claude Code Stop hook. Every session, it counts the verification markers ([✓THINK], [✓CONTEXT], [✓DELIVERY]) in the transcript and writes a pending-verifications.md file, where rules that haven't been followed show up as pending items.

The script never blocks. It only reports. Exit code 0, always. But the data it produces changes behavior, because I read pending-verifications.md on next startup.
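The report-only behavior described here might look roughly like the sketch below. It is not the real config-health.py. It assumes the Stop hook receives a JSON payload on stdin with a `transcript_path` field (worth confirming against the Claude Code hooks documentation), and the function names and the layout of the pending file are hypothetical.

```python
import json
import re
from pathlib import Path

# Marker names taken from the article.
MARKERS = ["THINK", "CONTEXT", "DELIVERY"]


def count_markers(transcript_text: str) -> dict[str, int]:
    """Count each known marker in the raw transcript text."""
    return {m: len(re.findall(r"\[✓" + m + r"\]", transcript_text)) for m in MARKERS}


def write_pending(counts: dict[str, int], out_path: str) -> list[str]:
    """Write markers with a zero count as pending items; return their names."""
    pending = [m for m, n in counts.items() if n == 0]
    lines = ["# Pending verifications", ""]
    if pending:
        lines += [f"- [✓{m}] not seen in the last session" for m in pending]
    else:
        lines.append("- none")
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return pending


def main(stdin_text: str, out_path: str = "pending-verifications.md") -> int:
    """Report only: any failure is swallowed and the exit code stays 0."""
    try:
        payload = json.loads(stdin_text)  # assumed hook payload shape
        text = Path(payload["transcript_path"]).read_text(encoding="utf-8")
        write_pending(count_markers(text), out_path)
    except Exception:
        pass  # never block the session because the health check failed
    return 0
```

Wired up as a hook, the entry point would be something like `sys.exit(main(sys.stdin.read()))`. Returning 0 on every path is what keeps the check advisory rather than a gate.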

After deploying this: my rule execution rate went from "I have no idea" to tracked visibility across every session. I still miss rules sometimes. But now I know when I miss them. The dashboard surfaces exactly which rules need attention.
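One way to turn "tracked visibility" into a number is to compute, per rule, the share of sessions in which its marker appeared at least once. The sketch below assumes one .jsonl transcript file per session in a single directory. The function name `execution_rate` is hypothetical and is not claimed to match what config-health.py reports.

```python
import re
from pathlib import Path


def execution_rate(transcript_dir: str, marker: str) -> tuple[int, int, float]:
    """Return (sessions_with_marker, total_sessions, rate) for one marker name."""
    pattern = re.compile(r"\[✓" + re.escape(marker) + r"\]")
    files = sorted(Path(transcript_dir).glob("*.jsonl"))  # one file per session (assumed)
    followed = sum(
        1 for f in files if pattern.search(f.read_text(encoding="utf-8"))
    )
    total = len(files)
    rate = followed / total if total else 0.0  # avoid dividing by zero
    return followed, total, rate
```

For example, `execution_rate("transcripts", "THINK")` returning `(0, 30, 0.0)` would correspond to the "zero times in 30 sessions" finding that started this.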

The surprising part: the markers themselves changed how the AI behaves. [✓THINK] is a concrete action: not a vague "consider the problem," but a specific thing to output. The AI hits it more often because it's clearer what "done" looks like.

Any rule without an embedded success criterion is a wish. Wishes don't execute.

This applies beyond AI config. Your team's code review checklist? If "check for security issues" doesn't have a verifiable step, it's not happening. Your personal productivity system? If "review weekly goals" isn't measurable, you're not reviewing them.

Verifiability isn't about distrust. It's about closing the loop between intention and evidence.

Pick one rule in your AI config. Add a verification marker. Write 5 lines of grep to count it:

grep -c '\[✓YOUR_MARKER\]' ~/.claude/transcripts/*.jsonl

You'll learn more from that grep output than from 10 new rules you never verify.

The config-health.py script is part of delivery-gate — a dual-layer mechanical gate for Claude Code. MIT licensed. The methodology behind it lives at checkgrow.


Read original on dev.to → dev.to/yuhaolin2005/i-made-my-ai-rules-self-veri…
