[ARTICLE · art-25909] src=dev.to pub= topic=artificial-intelligence verified=true sentiment=· neutral

From Theory to the Floor: What Happens When "Specificity-as-Integrity" Meets a Real Restaurant

A developer's Komiru project, a framework for AI integrity in local businesses, has moved from theory to a live pilot with a restaurant in Nagano, Japan. The pilot revealed that the problem is not just infrastructure but also habit-formation, as staff must maintain weekly data updates. Ongoing discussions with researcher Cheng highlight two unresolved challenges: the ease of fabricating hyper-specific data via automation, and the 'Yelp problem' where established platforms have more domain authority than new, authentic sources.

4 min read · Published Jun 13, 2026

A few weeks ago I wrote about the information gap between what AI search engines confidently tell people and what is actually happening inside a local business right now. The response to that post, especially one exchange with a researcher named Cheng, pushed me somewhere I didn't expect to go this fast: off the whiteboard and into an actual kitchen.

This is an update on where things stand, and on two open questions I still don't have good answers to.

Komiru is no longer just a framework on paper. We've started a live pilot with a real, operating local business in Nagano: not a demo environment, not a mockup, but a place with actual customers, actual staff, and an actual weekly rhythm of writing down what came in, what's running low, and what changed since last week.

I won't go into the mechanics of how the system works under the hood. That's deliberate. What I can say is that the shift from "this should work in theory" to "a person has to actually do this every week, in between serving customers" has been the most clarifying part of the whole project so far.

A few things became obvious almost immediately that no amount of whiteboarding surfaced:

None of this invalidates the original thesis. If anything, watching it happen in a real space made the thesis feel more urgent, not less. But it also reframed the problem: this isn't purely an infrastructure problem anymore. It's an infrastructure problem and a habit-formation problem, running on the same clock.

The other thing that's happened since the last post is an ongoing exchange with a researcher in Ireland who works on production-scale LLM deployment. I'm going to call him by his first name, Cheng, since that's how the conversation has felt — less like a review and more like an ongoing argument I'm grateful for.

Cheng raised two points that I haven't been able to stop thinking about, and I want to be honest that I don't think either is fully resolved.

The first is about fabrication. My original framing leaned on the idea that sustaining 52 weeks of internally consistent, hyper-specific false data would be too costly for a bad actor to bother with. Cheng's pushback was direct: that assumption is already out of date. Generating a year's worth of plausible, weather-adjusted, internally consistent "facts" via automation is not hard anymore. If specificity was supposed to be the integrity mechanism, it isn't enough on its own.
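
To make that pushback concrete, here is a minimal Python sketch of how little it takes. It is purely illustrative and not part of Komiru or the pilot: the `WeeklyRecord` fields, the weather categories, and every number in it are assumptions invented for the example. It fabricates 52 weekly stock records whose usage varies with a made-up weather value, then runs a naive consistency check over them.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class WeeklyRecord:
    # Hypothetical fields; the real pilot's data format is not described.
    week: int
    weather: str
    opening_stock: int
    delivered: int
    used: int
    closing_stock: int


def fabricate_year(seed: int = 0, weeks: int = 52) -> List[WeeklyRecord]:
    """Generate invented weekly records that agree with each other."""
    rng = random.Random(seed)
    typical_use = {"sunny": 30, "rain": 22, "snow": 15}  # made-up numbers
    records = []
    stock = 40
    for week in range(1, weeks + 1):
        weather = rng.choice(list(typical_use))
        delivered = rng.randint(20, 35)
        # "Weather-adjusted": usage drops in bad weather, capped by what is on hand.
        used = min(stock + delivered, typical_use[weather] + rng.randint(-3, 3))
        closing = stock + delivered - used
        records.append(WeeklyRecord(week, weather, stock, delivered, used, closing))
        stock = closing  # next week opens where this week closed
    return records


def is_internally_consistent(records: List[WeeklyRecord]) -> bool:
    """Naive check: each week's arithmetic holds and weeks chain together."""
    for r in records:
        if r.used < 0 or r.closing_stock < 0:
            return False
        if r.closing_stock != r.opening_stock + r.delivered - r.used:
            return False
    return all(
        cur.opening_stock == prev.closing_stock
        for prev, cur in zip(records, records[1:])
    )


fake_year = fabricate_year()
print(len(fake_year), is_internally_consistent(fake_year))
```

The fabricated year passes the check by construction, which is the uncomfortable part: a check that only looks at whether the data agrees with itself cannot distinguish this from a year of real entries.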

I think he's right, and I think the honest answer is that specificity was never meant to be a wall — it was meant to be a cost. The question I'm sitting with now is: what raises the cost further, without turning the whole system into a verification bureaucracy that defeats the purpose? I don't have a clean answer. I have some directions I'm exploring, but nothing I'd call a solution yet, and I'd rather say that plainly than pretend otherwise.

The second point is about trust — what Cheng called the "Yelp problem." Even with perfectly authentic, perfectly structured data, why would an LLM (or the retrieval system underneath it) prefer a small, newly-published source over the accumulated authority of an established platform? Domain authority isn't just a search ranking artifact — it's baked into how these systems reason about what's worth citing at all.

This one stings a bit more, because it's not something a better data format can fix. It's closer to a chicken-and-egg problem: the corpus needs time and consistency to earn trust, but trust is exactly what determines whether anyone — human or AI — ever encounters the corpus in the first place.

I don't have a tidy resolution to either of these, and I think that's the honest state of the project right now. What I do have is a live pilot that's forcing both questions to stop being abstract. Every week that passes is either evidence for the thesis or evidence against it, and for the first time that evidence is coming from a real place with real stakes, not from my own assumptions about how busy people behave.

If you've worked on problems at the intersection of provenance, trust calibration in retrieval systems, or getting non-technical people to sustain a data-producing habit over months, I'd genuinely like to hear from you. Cheng's questions opened up more than they closed, and I suspect the people who can help me think through them aren't all in one field. More updates as the weeks accumulate. That's rather the point.
