{"slug": "your-agent-s-memory-should-compute-confidence-not-store-it", "title": "Your agent's memory should compute confidence, not store it", "summary": "A developer proposes a new approach to agent memory where confidence scores are recomputed from a graph structure rather than stored statically. The system uses an asymmetric formula that penalizes contradictions more heavily than it rewards corroboration, allowing confidence to dynamically reflect new information without model reruns. This design aims to prevent stale confidence scores that ignore later contradictions.", "body_md": "Most agent memory stores a confidence score the way it stores everything else. You write it once and it sits there. The agent decides a fact is worth 0.9, the store keeps 0.9, and three weeks later, after something has contradicted that fact, the store still hands back 0.9. Confidence was a number written at one moment and never looked at again. It is stale, and nothing in the system knows it.\n\nThat is the quiet failure of pull memory. You query, it returns the closest matches with whatever score they were saved at, and noticing that a fact has gone soft is on you.\n\nRecall takes the other path. Effective confidence is not a stored field. It is recomputed from the graph every time you read, so a contradiction landing anywhere drops the claim's confidence on the next query, with no model rerun and no human in the loop.\n\nThe formula\n\nIt is plain arithmetic, on purpose. For a cell, the effective confidence is:\n\neffective = clamp01( stated × calibration + support − challenge )\n\nSupport and challenge are not raw sums. Each is squashed through a saturation curve with a different ceiling:\n\nsupport = 0.15 × tanh(supportMass)\n\nchallenge = 0.60 × tanh(challengeMass)\n\nThe asymmetry is the whole point. Corroboration is cheap to manufacture, so support saturates fast under a low ceiling: stack ten agreeing cells and you add at most 0.15. 
Real contradiction is rare and informative, so challenge runs to a 0.6 ceiling. One honest contradiction can move a claim further than a pile of agreement.\n\nA worked example you can check\n\nA fresh claim, stated 0.9, author with no track record yet, no support, no challenge:\n\neffective = clamp01(0.9 × 1 + 0 − 0) = 0.90\n\nOne contradiction lands from a source stated at 1.0, a challengeMass of 1.0:\n\nchallenge = 0.60 × tanh(1.0) = 0.457\n\neffective = clamp01(0.90 − 0.457) = 0.44\n\nThe same claim now reads 0.44. Nobody edited it. A second contradiction pushes the mass to 2.0:\n\nchallenge = 0.60 × tanh(2.0) = 0.578\n\neffective = clamp01(0.90 − 0.578) = 0.32\n\nDown to 0.32, and the original 0.9 is still on record, just demoted. Ten supporting cells would have added at most 0.15. Cheap agreement barely moves it; a real challenge moves it a lot.\n\nCalibration, and one honest choice in it\n\nBefore support and challenge apply, the author's stated number is multiplied by a calibration factor. An author contradicted before gets discounted, by how often they were wrong times how confident they were when wrong, floored at 0.5 so it never zeroes anyone out.\n\nThe honest detail is what it is not. It is not raw Brier scoring. Raw Brier also punishes a humble author who hedges low on claims that turn out fine, and punishing humility is the opposite of the incentive a memory system should create. So the discount keys on overconfidence specifically, being wrong while sure. Hedge honestly and you are not penalized. Claim 0.95 and get contradicted and you are.\n\nWhy this beats a stored score\n\nA vector store returns the score a chunk was embedded with. A flat notes file returns whatever it says. Neither knows the fact was contradicted last Tuesday, because the contradiction is not part of how the score is computed. The score and the conflict live in different places.\n\nIn Recall they live in the same place. 
The contradiction is an edge on the graph, and the score is computed from the graph, so the moment the edge exists the score reflects it, on the next read, deterministically. The reader is the same agent that wrote the memory, working from fresh context, and the substrate reprices what it knows underneath it.\n\nWhat it is not\n\nThis is a ranking signal, not a verdict on truth. A low effective confidence means a claim is contested or comes from an author who has been wrong while sure, not that it is false. The ceilings and curves are tunable defaults. And it is deliberately deterministic arithmetic over the graph, not a model second-guessing itself, which is what makes it inspectable: open any cell and see why its number is what it is, term by term.\n\nThat is the trade. You give up a number that looks stable and never moves. You get one you can recompute, that demotes a stale claim the instant the evidence turns, and that you can read the reasons for. For an agent that has to act on what it remembers, the second is worth more.\n\nRecall is local-first, runs on SQLite, and sets up with one command. 
The code and the formula above are open: github.com/H-XX-D/recall-memory-substrate", "url": "https://wpnews.pro/news/your-agent-s-memory-should-compute-confidence-not-store-it", "canonical_source": "https://dev.to/hendrixxcnc/your-agents-memory-should-compute-confidence-not-store-it-c2a", "published_at": "2026-06-18 19:54:13+00:00", "updated_at": "2026-06-18 20:29:38.263464+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-agents", "machine-learning", "ai-research"], "entities": ["Recall"], "alternates": {"html": "https://wpnews.pro/news/your-agent-s-memory-should-compute-confidence-not-store-it", "markdown": "https://wpnews.pro/news/your-agent-s-memory-should-compute-confidence-not-store-it.md", "text": "https://wpnews.pro/news/your-agent-s-memory-should-compute-confidence-not-store-it.txt", "jsonld": "https://wpnews.pro/news/your-agent-s-memory-should-compute-confidence-not-store-it.jsonld"}}