AI Memory Needs an Authority Policy, Not Just More Context

An AI agent developer has proposed that long-running AI systems need an explicit "authority policy" to resolve conflicts between retrieved memories, rather than relying on larger context windows or implicit heuristics. The developer argues that without a governance hierarchy—where live checks outrank stored summaries, corrections override preferences, and unresolved questions gate decisions—agents will default to recency or semantic similarity, allowing stale facts to masquerade as current truth. The proposed solution is a configurable memory authority order that classifies each retrieved record by source type and epistemic status before assigning an allowed action such as "answer," "verify first," or "block.

When records conflict, the agent needs explicit rules for which one is allowed to steer the answer. Long-running AI systems eventually retrieve conflicting but individually valid memories: Without an authority policy, the agent falls back to implicit heuristics: recency, semantic similarity, confidence of wording, or whatever fits the current prompt. That is how stale facts start wearing the mask of current truth. The solution is not just more memory. It is governed memory: a rule for which record wins when retrieved memories disagree. I do not think the exact hierarchy should be universal. A coding agent, research assistant, legal agent, BI agent, and solo writing workflow should not all use the same authority order. But the policy should follow a few principles. Live checks outrank stored memory when the claim depends on current reality: server status, API reachability, latest logs, current price, CI status, or local model availability. Memory can say what to check. It should not replace the check. Summaries are useful because they are fast. They are also lossy. If a current source file conflicts with a summary, the source file wins. If a raw query result conflicts with a dashboard summary, the query/provenance path deserves inspection before the summary becomes organizational memory. Compressed memory should orient. Primary records should decide. Preferences say what usually works. Corrections say what must not be repeated. If memory says: The user likes vivid language. but a correction says: Do not use mythic language in local-business copy. the correction wins in that project. Corrections are not permanent truth. They are active constraints until reviewed or superseded. "We planned to do X" is not the same as "X is next." Old plans are useful history. They should not drive current action unless the current state reactivates them. Unresolved memory is not low-value memory. It is a gate. If one record says: The agent is wired to the current brain. and another says: Local model availability still needs a live check. the agent can say the wiring exists. It cannot say the full local fallback is available until verified. The unresolved record does not replace the decision. It limits what certainty is allowed. For my current file-based workflow, this is the v0.1 order: 1. Live external checks and current reality 2. Current source files 3. Explicit corrections 4. Current state files 5. Active decisions and priorities 6. Unresolved questions and verification gates 7. Recent summaries 8. Preferences and style guidance 9. Archive / historical memory This is not a universal law. It is a default for one class of work: multi-session solo building, writing, and agent operation. For other domains, I would change the order. Coding agent: 1. tests / compiler / runtime output 2. current source files 3. issue or PR requirements 4. explicit corrections 5. recent implementation notes 6. summaries 7. archive Research agent: 1. primary sources / source documents 2. current notes with citations 3. explicit corrections 4. unresolved hypotheses 5. summaries 6. preferences 7. archive Production incident: 1. live system state 2. active incident document 3. current logs / metrics 4. runbook 5. normal roadmap / plans 6. summaries 7. archive The point is not to worship one ordering. The point is to make the ordering explicit enough to inspect, test, and correct. A hierarchy written in prose is easy for a model to ignore. The agent needs an application step before it writes the answer: For each retrieved memory: 1. classify source type 2. classify epistemic status 3. check whether it is current, archived, superseded, unresolved, or verification-required 4. compare it against higher-authority records 5. assign one allowed action: - answer - answer as context - warn - verify first - block - archive only The key field is allowed action . A memory should not automatically steer the answer just because it was retrieved. Example: summary says: Article 01 needs a CTA current source file says: CTA was intentionally removed correction says: do not re-add sales CTA to Article 01 policy result: - summary = context only - source file = answer - correction = block CTA reintroduction The agent should not average those records into "maybe add a softer CTA." The correction blocks the action. For higher-risk tasks, I would require the model to produce a small authority trace before the final answer: claim: "Article 01 should not include a sales CTA" winning source: "current source file + active correction" conflicting sources: - "old summary saying CTA was needed" allowed action: "answer" blocked actions: - "re-add Gumroad CTA" verification required: false That trace can be hidden from the user in a product, but it should exist somewhere for audit. I would not rely on prompt text alone. Agents can ignore or reinterpret subtle instructions. In a file-based setup, index.md can make the same policy visible: Memory Authority Policy Order 1. live checks 2. source/ 3. corrections.md 4. current state.md 5. decisions.md 6. unresolved.md 7. summaries.md 8. preferences.md 9. archive/ Enforcement Notes - Prefer current source files over summaries. - Corrections override preferences in their scoped context. - Verification-required items must be checked before being used as settled facts. - Archive supports historical answers, not current-state claims. For structured memory, the same idea becomes metadata: source type: source file | correction | current state | decision | unresolved | summary | preference | archive epistemic status: settled | inferred | contested | unresolved | verification required | superseded status: active | ready | parked | blocked | archived | superseded priority: next | active | later | none verification required: true | false allowed action: computed at runtime allowed action should be computed at retrieval time. A memory that could answer yesterday may need verification today. Sometimes a lower layer should temporarily outrank a higher one. During a production incident, an active incident file may outrank the normal roadmap. During legal review, approved policy language may outrank product preference. During a live deployment, current logs may outrank yesterday's state file. Temporary authority needs scope: temporary authority: mode: production incident elevated source: incident current.md outranks: roadmap.md scope: payment outage response expires when: incident closed If elevation has no scope or expiry, it becomes a loophole. Authority policies are not free. They add: A flatter memory setup may be better for short projects, creative exploration, low-stakes drafting, live debugging, or pair programming where the human is actively supervising. The hierarchy earns its cost when work is long-running, multi-session, expensive to correct, or easy to corrupt with stale context. This is not benchmark-grade. But I did run a small deterministic calibration pass against the authority-policy logic: The result: strict: 29/35 correct, 0 false-certainty errors, 6 overblocking errors balanced: 34/35 correct, 0 false-certainty errors, 1 overblocking error permissive: 35/35 correct, 0 false-certainty errors, 0 overblocking errors balanced risk adjusted: 35/35 correct, 0 false-certainty errors, 0 overblocking errors The useful finding was not "this is proven." The useful finding was the tradeoff: strict gating prevented false certainty but overblocked legitimate low-risk answers. A risk-adjusted threshold fixed that edge case on this scenario set. The next test is harder: external scenarios, external scoring, and an actual retrieval-plus-generation loop. An authority policy can fail in both directions. Too loose: stale summaries beat current records. Example: process health is mistaken for full local-model availability because a stale summary says "agent healthy." Fix: split process health from model availability and mark model availability verification required . Too rigid: the agent overblocks low-risk settled tasks. Example: a simple design choice is downgraded to a warning because the score is barely below the answer threshold. Fix: allow low-risk, settled, verification-free memories to answer under a lower threshold. Dynamic override abuse: temporary elevation becomes permanent. Fix: require mode, scope, elevated source, and expiry. The authority policy is also memory. It needs versioning. authority policy: version: "0.2" default profile: "solo agent work" active since: "2026-05-25" last changed because: "balanced threshold overblocked low-risk settled memory" review trigger: "two repeated hierarchy failures or one high-risk failure" If the same class of failure repeats, update the policy. If the hierarchy blocks too many correct answers, loosen the relevant threshold or add a scoped exception. The policy itself should be corrected by the same system it governs. Before acting on memory: 1. What claim am I making? 2. What source supports it? 3. Is there a conflicting higher-authority source? 4. Is there an active correction? 5. Is the claim unresolved, stale, contested, or verification-required? 6. Is this memory allowed to answer, warn, block, verify first, or only provide history? 7. What would change this answer? That is the core habit. Not more memory. Governed memory. This is part of a short series on AI memory as judgment infrastructure: the zero-budget foundation, correction memory, preserving unresolved questions, and three failures that tested the system.