Your agent's memory should compute confidence, not store it

Source: dev.to · Published: 2026-06-18T19:54Z · Topic: artificial-intelligence

A developer proposes a new approach to agent memory where confidence scores are recomputed from a graph structure rather than stored statically. The system uses an asymmetric formula that penalizes contradictions more heavily than it rewards corroboration, allowing confidence to dynamically reflect new information without model reruns. This design aims to prevent stale confidence scores that ignore later contradictions.

Most agent memory stores a confidence score the way it stores everything else. You write it once and it sits there. The agent decides a fact is worth 0.9, the store keeps 0.9, and three weeks later, after something has contradicted that fact, the store still hands back 0.9. Confidence was a number written at one moment and never looked at again. It is stale, and nothing in the system knows it.

That is the quiet failure of pull memory. You query, it returns the closest matches with whatever score they were saved at, and noticing that a fact has gone soft is on you.

Recall takes the other path. Effective confidence is not a stored field. It is recomputed from the graph every time you read, so a contradiction landing anywhere in the graph drops the contradicted claim's confidence on the next query, with no model rerun and no human in the loop.

The formula

It is plain arithmetic, on purpose. For a cell, the effective confidence is:

effective = clamp01( stated × calibration + support − challenge )

Support and challenge are not raw sums. Each is squashed through a saturation curve with a different ceiling:

support = 0.15 × tanh(supportMass)

challenge = 0.60 × tanh(challengeMass)
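
As a minimal sketch, the two curves and the clamp translate into a few lines of Python. The function and constant names are illustrative assumptions, not Recall's actual API; only the arithmetic comes from the formulas above.

```python
import math

# Ceilings taken from the formulas above.
SUPPORT_CEILING = 0.15
CHALLENGE_CEILING = 0.60


def clamp01(x: float) -> float:
    """Clamp a value into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def effective_confidence(
    stated: float,
    calibration: float = 1.0,
    support_mass: float = 0.0,
    challenge_mass: float = 0.0,
) -> float:
    """Hypothetical helper: stated * calibration + support - challenge, clamped."""
    support = SUPPORT_CEILING * math.tanh(support_mass)
    challenge = CHALLENGE_CEILING * math.tanh(challenge_mass)
    return clamp01(stated * calibration + support - challenge)
```

Because tanh never exceeds 1, neither term can pass its ceiling no matter how large the mass grows, and the clamp keeps the result inside [0, 1].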

The asymmetry is the whole point. Corroboration is cheap to manufacture, so support saturates fast under a low ceiling: stack ten agreeing cells and you add at most 0.15. Real contradiction is rare and informative, so challenge runs to a 0.6 ceiling. One honest contradiction can move a claim further than a pile of agreement.

A worked example you can check

A fresh claim, stated 0.9, author with no track record yet (so calibration is 1), no support, no challenge:

effective = clamp01(0.9 × 1 + 0 − 0) = 0.90

One contradiction lands from a source stated at 1.0, a challengeMass of 1.0:

challenge = 0.60 × tanh(1.0) = 0.457

effective = clamp01(0.90 − 0.457) = 0.44

The same claim now reads 0.44. Nobody edited it. A second contradiction pushes the mass to 2.0:

challenge = 0.60 × tanh(2.0) = 0.578

effective = clamp01(0.90 − 0.578) = 0.32

Down to 0.32, and the original 0.9 is still on record, just demoted. Ten supporting cells would have added at most 0.15. Cheap agreement barely moves it; a real challenge moves it a lot.
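
If you would rather check the numbers than trust them, a short script reproduces the example. The helper name is hypothetical, and the support case assumes ten agreeing cells add up to a supportMass of 10, which the text does not spell out.

```python
import math


def effective(stated, calibration, support_mass, challenge_mass):
    """Same arithmetic as the formula section, written inline."""
    support = 0.15 * math.tanh(support_mass)
    challenge = 0.60 * math.tanh(challenge_mass)
    return max(0.0, min(1.0, stated * calibration + support - challenge))


# Challenge mass 0, 1, 2: the fresh claim, then one and two contradictions.
by_challenge = [round(effective(0.9, 1.0, 0.0, m), 2) for m in (0.0, 1.0, 2.0)]
print(by_challenge)  # [0.9, 0.44, 0.32]

# Assumed support mass of 10 on a claim stated at 0.5: the bonus stays
# just under the 0.15 ceiling.
bonus = effective(0.5, 1.0, 10.0, 0.0) - 0.5
print(round(bonus, 3))  # 0.15
```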

Calibration, and one honest choice in it

Before support and challenge apply, the author's stated number is multiplied by a calibration factor. An author who has been contradicted before gets discounted in proportion to how often they were wrong times how confident they were when wrong. The factor is floored at 0.5 so it never zeroes anyone out.

The honest detail is what it is not. It is not raw Brier scoring. Raw Brier also punishes a humble author who hedges low on claims that turn out fine, and punishing humility is the opposite of the incentive a memory system should create. So the discount keys on overconfidence specifically, being wrong while sure. Hedge honestly and you are not penalized. Claim 0.95 and get contradicted and you are.
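
The text gives the shape of the discount but not its exact formula, so the sketch below is one possible reading, not Recall's implementation. It assumes the factor is 1 minus (fraction of past claims contradicted) times (mean stated confidence on those contradicted claims), floored at 0.5. Every name here is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PastClaim:
    stated: float        # confidence the author stated at write time
    contradicted: bool   # whether the claim was later contradicted


CALIBRATION_FLOOR = 0.5  # the floor named in the text


def calibration(history: list[PastClaim]) -> float:
    """Assumed form: 1 - (fraction wrong) * (mean confidence when wrong)."""
    if not history:
        return 1.0  # no track record yet, so no discount
    wrong = [c for c in history if c.contradicted]
    if not wrong:
        return 1.0  # hedging low on claims that held up costs nothing
    wrong_rate = len(wrong) / len(history)
    mean_conf_when_wrong = sum(c.stated for c in wrong) / len(wrong)
    return max(CALIBRATION_FLOOR, 1.0 - wrong_rate * mean_conf_when_wrong)


humble = [PastClaim(0.3, False)] * 4
overconfident = [
    PastClaim(0.95, True), PastClaim(0.95, True),
    PastClaim(0.9, False), PastClaim(0.9, False),
]
print(calibration(humble))                   # 1.0
print(round(calibration(overconfident), 3))  # 0.525
```

Under this reading, only contradicted claims enter the discount, so a low stated number on a claim that held up never counts against the author, and the discount grows with how sure the author was on the claims that failed.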

Why this beats a stored score

A vector store returns the score a chunk was embedded with. A flat notes file returns whatever it says. Neither knows the fact was contradicted last Tuesday, because the contradiction is not part of how the score is computed. The score and the conflict live in different places.

In Recall they live in the same place. The contradiction is an edge on the graph, and the score is computed from the graph, so the moment the edge exists the score reflects it, on the next read, deterministically. The reader is the same agent that wrote the memory, working from fresh context, and the substrate reprices what it knows underneath it.
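
To make "computed from the graph on read" concrete, here is a toy version on in-memory SQLite. The schema, the table and column names, and the sample claims are invented for illustration and are not Recall's. It also assumes an edge's mass is the stated confidence of the cell it comes from, which matches the worked example but is not confirmed by the text.

```python
import math
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE cells (id INTEGER PRIMARY KEY, claim TEXT, stated REAL);
    CREATE TABLE edges (
        src INTEGER REFERENCES cells(id),   -- the cell making the judgement
        dst INTEGER REFERENCES cells(id),   -- the cell being judged
        kind TEXT CHECK (kind IN ('supports', 'contradicts'))
    );
""")


def read_confidence(cell_id: int, calibration: float = 1.0) -> float:
    """Recompute effective confidence from the edges at read time."""
    stated = db.execute(
        "SELECT stated FROM cells WHERE id = ?", (cell_id,)
    ).fetchone()[0]
    # Assumption: an edge's mass is the stated confidence of its source cell.
    masses = dict(db.execute(
        """SELECT e.kind, SUM(c.stated) FROM edges e
           JOIN cells c ON c.id = e.src
           WHERE e.dst = ? GROUP BY e.kind""",
        (cell_id,),
    ).fetchall())
    support = 0.15 * math.tanh(masses.get("supports", 0.0))
    challenge = 0.60 * math.tanh(masses.get("contradicts", 0.0))
    return max(0.0, min(1.0, stated * calibration + support - challenge))


db.execute("INSERT INTO cells VALUES (1, 'API rate limit is 100/min', 0.9)")
before = read_confidence(1)

# A contradiction is written as a new cell plus an edge; cell 1 is untouched.
db.execute("INSERT INTO cells VALUES (2, 'API rate limit is 60/min', 1.0)")
db.execute("INSERT INTO edges VALUES (2, 1, 'contradicts')")
after = read_confidence(1)

print(round(before, 2), round(after, 2))  # 0.9 0.44
```

The row for cell 1 still says 0.9 after the contradiction arrives. Only the read changes, because the read is where the number is made.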

What it is not

This is a ranking signal, not a verdict on truth. A low effective confidence means a claim is contested or comes from an author who has been wrong while sure, not that it is false. The ceilings and curves are tunable defaults. And it is deliberately deterministic arithmetic over the graph, not a model second-guessing itself, which is what makes it inspectable: open any cell and see why its number is what it is, term by term.
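
One way to picture "term by term" is a small record that keeps each term next to the result. This is a hypothetical sketch of what such a breakdown could look like, not Recall's output format.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakdown:
    stated: float
    calibration: float
    support: float
    challenge: float

    @property
    def effective(self) -> float:
        raw = self.stated * self.calibration + self.support - self.challenge
        return max(0.0, min(1.0, raw))

    def __str__(self) -> str:
        return (
            f"{self.stated:.2f} x {self.calibration:.2f} "
            f"+ {self.support:.3f} - {self.challenge:.3f} "
            f"= {self.effective:.2f}"
        )


def explain(stated, calibration, support_mass, challenge_mass) -> Breakdown:
    """Return every term of the formula, not just the final number."""
    return Breakdown(
        stated=stated,
        calibration=calibration,
        support=0.15 * math.tanh(support_mass),
        challenge=0.60 * math.tanh(challenge_mass),
    )


print(explain(0.9, 1.0, 0.0, 2.0))
# 0.90 x 1.00 + 0.000 - 0.578 = 0.32
```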

That is the trade. You give up a number that looks stable and never moves. You get one you can recompute, that demotes a stale claim the instant the evidence turns, and that you can read the reasons for. For an agent that has to act on what it remembers, the second is worth more.

Recall is local-first, runs on SQLite, and sets up with one command. The code and the formula above are open: github.com/H-XX-D/recall-memory-substrate


Read original on dev.to → dev.to/hendrixxcnc/your-agents-memory-should-com…
