Last week I audited my travel site's core dataset β legal-status data for 213 countries and every US state β and found the same fact stored in four places, drifting independently. I fixed all four copies, fixed the silent join that kept corrections from rendering, and made the headline counts derive from the data at build time. Audit closed. Confidence high.
Then I asked the agent to check one blog post β the flagship guide that covers the same topic as the dataset. Just to confirm it matched.
It didn't. And the ways it didn't match turned out to be worse than anything in the original audit, because this time the wrong facts weren't sitting in a data file waiting for a diff. They were sitting in sentences.
The flagship guide is half dynamic, half hand-written. The card grids and comparison tables pull from the data file I'd just fixed β so when the audit shipped, those healed automatically. A Central European country that fully legalized in January showed up in the "Legal" cards the moment the corrected data deployed.
Three scrolls down, the hand-written FAQ on the same page said that same country was "decriminalized β not legal," and used it as the textbook example of the difference.
One page. Two answers. The derived half was right, the asserted half was months behind, and no test could ever catch it because both halves were rendering exactly what they were told to render. There was also a link reading "See all 82 countries" pointing at a page that now lists 155.
A page that's half-derived and half-asserted will eventually contradict itself. It's not a risk, it's a schedule β the same schedule the four data copies were on, just running through paragraphs instead of rows.
That was the warm-up. The real find was a Southeast Asian country that famously opened up in 2022 and has been walking it back since.
My blog post about it, written in March, said: legal grey zone, walk into any shop, no paperwork needed, here's what to buy.
My country-profile page about it, updated later, said: fully re-criminalized, every shop closed, zero tolerance, tourists face arrest and four-digit fines β treat it like the strictest country on Earth.
Same site. Same country. Same month. Opposite advice β and here's the part that matters: both were wrong.
The verified reality (current primary sources, not model memory) sat exactly between them: the country had gone medical-only in mid-2025 β thousands of licensed shops still open, but buying now requires a prescription any tourist can get in a fifteen-minute consultation at the shop.
So one page was encouraging illegal behaviour. The other was denying a legal pathway that actually existed. The specifics of the law are just the evidence; either page, followed faithfully, failed the reader.
If I had noticed the contradiction without checking externally, the obvious move would have been to pick the newer page and sync the older one to it. That would have replaced a dangerous lie with a merely costly one and called it consistency. When two of your own copies disagree, neither one is the tiebreaker. Your copies can be wrong in opposite directions, and averaging them or trusting the newer one just launders the error. The only tiebreaker is outside the building.
Internal consensus isn't verification.
After that, I had the agent sweep every post on the site β about fifty-five of them β for anything shaped like a legal claim: statuses, ages, quantities, penalties, years.
Most survived. One didn't: a travel-safety post claimed a neighboring state had "decriminalized possession under three ounces in 2023," complete with a specific first-offense fine.
That state never decriminalized anything. It is, per my own freshly-audited dataset, in the strictest tier in the region β any possession is a criminal offense. The claim wasn't stale; there was no year in which it was true. It was invented, presumably by whatever wrote the first draft, and it survived every read-through since because it had everything plausibility needs: a number, a year, a dollar amount.
In the dataset audit, this failure mode β confident filler for unknown values β was at least constrained by a schema. A status column can only hold one of five values, and thirty identical suspicious values in a row is a visible pattern. Prose has no schema. A fabricated statute sits in a paragraph looking exactly like a researched one, and no uniformity tell will ever expose it. The same sweep also caught a legal age wrong by three years β duplicated, consistently, in two places, which is exactly why duplication felt like confirmation.
The last audit's lesson was "one fact, four copies." That count was wrong.
Every sentence on your site that states a fact is another copy of that fact. Unversioned, untyped, invisible to schema checks, excluded from every reconciliation you'll ever write β and, unlike the database, it's the copy your readers and the search engines actually consume. My four-copy inventory missed the largest and least maintained store of facts on the entire site: the writing.
What actually worked, if you want to reproduce the sweep:
The dataset audit took one long session. This follow-up took another β and found the errors with the highest actual cost to a reader, in the layer the first audit had implicitly certified as fine. The data was never the product. The sentences were.