This a16z-backed startup says the fix for AI errors is a weaker model, not a smarter one

[ARTICLE · art-29577] src=thenextweb.com ↗ pub=2026-06-16T13:37Z topic=large-language-models verified=true sentiment=· neutral

A startup called Probably has raised $9 million in a seed round co-led by Andreessen Horowitz and Accel to catch AI factual errors using a weaker model instead of a smarter one. The company's 'verifiable data agent' runs answers through a deterministic validator that checks against actual data, aiming for 99.99% accuracy while reducing costs and running locally on open-source database DuckDB.

Published Jun 16, 2026 · 3 min read

Most of the AI industry is trying to fix hallucinations by building bigger, smarter models. A startup called Probably is betting on the opposite.

The company has raised $9m in a seed round co-led by Andreessen Horowitz and Accel, with Tokyo Black and Vermilion Cliffs Ventures, to catch AI’s factual errors before they ever reach a user. It is aiming for the 99.99% accuracy that ordinary software takes for granted but large language models rarely hit.

Its trick is to lean on the model less, not more. Probably’s first product, a local ‘verifiable data agent’ that answers questions from messy datasets, runs each answer through what founder Peter Elias calls a ‘data science mech suit’.

A harness, not a bigger brain

The model takes a first pass, then a separate, deterministic validator checks the answer against the actual data and bounces anything that does not match. The model is trained against that validator, and every result ships with a citation and an audit trail.
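The propose-then-validate loop described above can be sketched in a few lines. This is an illustration only, not Probably's implementation: the names `validate`, `answer`, `Verified` and `demo` are hypothetical, the "model" is a scripted stand-in, and Python's built-in `sqlite3` is used in place of DuckDB so the example runs with no extra installs.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A candidate is (sql_query, claimed_value). In a real system a model would
# produce it; here it is supplied by a scripted stand-in (an assumption).
Candidate = Tuple[str, float]


@dataclass
class Verified:
    value: float
    citation: str                                   # query that reproduces the value
    audit: List[str] = field(default_factory=list)  # one entry per attempt


def validate(conn: sqlite3.Connection, candidate: Candidate, tol: float = 1e-9) -> bool:
    """Deterministic check: re-run the cited query and compare it with the claim."""
    sql, claimed = candidate
    row = conn.execute(sql).fetchone()
    return row is not None and row[0] is not None and abs(row[0] - claimed) <= tol


def answer(conn: sqlite3.Connection,
           propose: Callable[[int], Candidate],
           max_attempts: int = 3) -> Optional[Verified]:
    """Ask for candidates until one passes validation or attempts run out."""
    audit: List[str] = []
    for attempt in range(max_attempts):
        sql, claimed = propose(attempt)
        ok = validate(conn, (sql, claimed))
        verdict = "accepted" if ok else "bounced"
        audit.append(f"attempt {attempt}: {sql!r} claimed {claimed} -> {verdict}")
        if ok:
            return Verified(value=claimed, citation=sql, audit=audit)
    return None  # nothing verifiable: return no answer rather than a guess


def demo() -> Optional[Verified]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("north", 10.0), ("south", 32.0)])
    # Scripted "model": the first claim is wrong, the second matches the data.
    guesses = [("SELECT SUM(amount) FROM sales", 40.0),
               ("SELECT SUM(amount) FROM sales", 42.0)]
    return answer(conn, lambda i: guesses[i])
```

Calling `demo()` bounces the first claim and accepts the second, returning the value together with the query that reproduces it and a two-entry audit log. Note that this toy check only confirms that the claimed number matches the cited query; whether the query matches the user's question is a separate problem the sketch does not address.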

‘The better your harness engineering is, the weaker the model can be,’ Elias says. Reduce the ambiguity enough, the argument goes, and the AI barely has to think.

That has a striking consequence for cost. Probably’s tool runs on a model Elias describes as ‘four classes weaker’ than the frontier, small enough to run on a desktop rather than a data centre, which strips out most of the token bill.

It also doubles as a privacy pitch. The whole thing runs locally on the open-source database DuckDB, and the company says the model only ever sees metadata and statistics, never the raw data, which stays on your machine.
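One way to picture a model that sees "metadata and statistics, never the raw data" is a local profiling step whose output is the only thing handed to the model. The sketch below is a guess at the general shape, not the company's code: `profile_table` and `demo_profile` are hypothetical names, and the standard-library `sqlite3` module again stands in for DuckDB.

```python
import sqlite3
from typing import Any, Dict


def profile_table(conn: sqlite3.Connection, table: str) -> Dict[str, Any]:
    """Summarise a table as schema plus aggregates; no row values are copied."""
    if not table.isidentifier():
        raise ValueError("unexpected table name")
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    profile: Dict[str, Any] = {"table": table, "rows": rows, "columns": {}}
    for _cid, name, decl_type, *_rest in columns:
        distinct, nulls = conn.execute(
            f'SELECT COUNT(DISTINCT "{name}"), SUM("{name}" IS NULL) FROM {table}'
        ).fetchone()
        stats: Dict[str, Any] = {"type": decl_type, "distinct": distinct, "nulls": nulls}
        # Range and mean are reported for numeric columns only, so free-text
        # values never appear in the summary.
        if decl_type.upper() in ("INTEGER", "REAL"):
            lo, hi, mean = conn.execute(
                f'SELECT MIN("{name}"), MAX("{name}"), AVG("{name}") FROM {table}'
            ).fetchone()
            stats.update({"min": lo, "max": hi, "mean": mean})
        profile["columns"][name] = stats
    return profile


def demo_profile() -> Dict[str, Any]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("north", 10.0), ("south", 32.0), ("south", None)])
    return profile_table(conn, "sales")
```

Here the dictionary returned by `demo_profile()` is what a model would receive: column names, declared types, row and null counts, and numeric ranges. The rows themselves stay in the local database. Aggregates such as minimum and maximum can still reveal individual values in small tables, so what counts as safe "statistics" is a design decision in its own right.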

Aimed at the token-cost backlash

The timing is pointed. Companies are watching AI bills balloon even as per-token prices collapse, and a tool that delivers accuracy on cheap, local hardware speaks directly to that anxiety.

It also lands where errors hurt most. Probably says the same engine could extend to accounting, medical work, or any other ‘precision-sensitive’ job where a confident wrong answer is the whole problem, a risk that researchers warning about hallucinations in science keep pointing out.

A provocative claim, and the catch

Elias goes further, arguing the big labs have not built this because ‘they make money the more times you have to correct the model’. It is a tidy sales line, and a contestable one: the major labs pour resources into cutting hallucinations, and a smaller player has every reason to cast itself as the honest broker.

The bigger caveat is scope. A validator only works when there is a hard ground truth to check against, such as a dataset, which is why Probably started with data rather than open-ended writing. It is a $9m seed, the product is in public preview at version 0.1, and the 99.99% figure is still a goal, not a result. But in a market crowded with attempts to tame hallucinations, betting on smaller models is at least a refreshingly different wager, and one a16z and Accel were willing to fund.


Read original on thenextweb.com → thenextweb.com/news/probably-9m-seed-a16z-ai-hal…
