The Improvement Loop: How akm Keeps Your Agent Sharp

A developer has built a multi-phase pipeline called `akm improve` that automatically audits and cleans AI coding agent memory stashes, which accumulate stale, redundant, or contradictory information over time. The pipeline runs on a schedule without manual intervention, evaluating asset quality, consolidating scattered memories, extracting structured facts, and mapping entity relationships, and it produces reviewable proposals before any changes are made to the stash. Performance benchmarks show the reflect phase can run as a direct LLM HTTP call, cutting per-asset latency from approximately 30 seconds to 6–10 seconds.

This is part ten in a series about managing the growing pile of skills, scripts, and context that AI coding agents depend on. [Part nine](https://dev.to/itlackey/agents-that-remember-where-they-were-1koe) covered workflow assets, vault assets, and the writable git stash. [Part eight](https://dev.to/itlackey/building-agent-knowledge-bases-that-actually-scale-23pb) tackled multi-wiki support for structured research. Earlier parts addressed teams, distributed stashes, feedback scoring, and community knowledge.

This one is about entropy.

You ship a feature. Your agent writes several memories during the session: partial findings, a workaround, a note about the build step that kept failing. Those memories are accurate when written. Three sprints later, the workaround is no longer needed, two of the memories say slightly different things about the same subsystem, and the note about the build step refers to a CI config that was replaced.

None of this is catastrophic. But it accumulates. After six months, a significant fraction of your stash is stale, redundant, or quietly wrong.

You could audit it manually. In practice, you won't: the stash is too large, the relevance of any given memory is hard to assess without the context where it was created, and the judgment calls (merge these two? promote this? delete that?)
are exactly the kind of work that's tedious for a human and tractable for an LLM.

`akm improve` is the answer to that problem. It is a multi-phase pipeline that reads your stash, evaluates asset quality, consolidates scattered memories, extracts structured facts, and maps entity relationships. It does this on a schedule, without manual intervention, producing proposals you can review before anything changes.

`akm improve` is not a single LLM call. It is a sequenced pipeline where each phase produces inputs for the next.

Reflect evaluates asset quality. For each asset in scope, the reflect pass reviews the content against usage signals (search hits, retrieval counts, feedback) and produces a quality assessment. Low-quality assets are flagged as candidates for improvement. Since 0.8.0, reflect can run as a direct LLM HTTP call instead of spawning an agent subprocess, which cuts per-call latency from ~30 seconds to ~6–10 seconds:

| Reflect mode | Time per call | 69-ref run |
|---|---|---|
| agent (CLI subprocess) | ~30s | ~35 min |
| sdk (in-process) | ~10–15s | ~12–17 min |
| llm (direct HTTP) | ~6–10s | ~8–10 min |

Distill turns observations from reflect into lesson proposals. Where reflect says "this skill is incomplete and frequently retrieved with poor satisfaction," distill produces a draft improvement: a new version of the skill, a supplementary lesson, or a deprecation proposal. These proposals go into the queue; nothing is written to your stash until you accept them.

Consolidate handles the memory pool specifically. Your memory pool accumulates entries from agent sessions: `akm remember` calls, auto-captured observations, and task agent outputs. Consolidation groups related memories into chunks, sends each chunk to the LLM for a curation plan (merge near-duplicates, promote high-signal items, delete redundant entries, surface contradictions), and executes those plans. The result is a smaller, cleaner memory pool and new stash promotions.

Memory inference runs after consolidation.
It takes the post-consolidation state and runs a lightweight factual extraction pass, pulling out atomic facts that did not make it into explicit memory entries. These become additional promotion candidates. In steady-state operation, memory inference yields around 60–70% (69.3% in a recent 24-hour window) usable atomic facts on each pass.

Graph extraction runs last, against the final post-improve state. It builds the entity-relation index that powers `akm graph` commands: which stash entries mention a given entity, which entities co-occur, and which assets produced zero entities (quality-triage candidates via `akm graph orphans`). As of 0.8.0, extraction is incremental: only assets that changed during the improve run are re-extracted, and batches of four run in parallel by default.

Each phase is independently enabled or disabled per profile. A quick profile runs reflect only. A memory-focus profile runs reflect and memory inference on memory and lesson types. A thorough profile runs all five phases and auto-syncs the result to your git-backed stash.

The basic invocation:

`akm improve`

That runs all enabled phases on the full stash, scoped by your default improve profile. Before running for the first time, use `--dry-run` to see what would be processed without writing anything:

`akm improve --dry-run`

The dry-run output shows which assets are selected, in what order, and which phases would run. Nothing is written to `state.db` from the dry-run path: the improve result is flagged `.dryRun: true` and excluded from health metrics.
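The per-profile phase selection described above can be pictured as a fixed phase order filtered by what each profile enables. The following is a minimal sketch under stated assumptions, not akm's actual implementation: the names `PHASE_ORDER`, `PROFILE_PHASES`, and `phases_for` are hypothetical, and the phase identifiers are invented labels for the five phases named in the text.

```python
# Hypothetical sketch: these names are illustrative and are not akm internals.

# The five phases, in the order the article says they run.
PHASE_ORDER = [
    "reflect",
    "distill",
    "consolidate",
    "memory_inference",
    "graph_extraction",
]

# Which phases each profile enables, per the prose description.
PROFILE_PHASES = {
    "quick": {"reflect"},
    "memory-focus": {"reflect", "memory_inference"},
    "thorough": set(PHASE_ORDER),
}


def phases_for(profile: str) -> list[str]:
    """Return the enabled phases for a profile, preserving pipeline order."""
    enabled = PROFILE_PHASES[profile]
    return [phase for phase in PHASE_ORDER if phase in enabled]


if __name__ == "__main__":
    for name in PROFILE_PHASES:
        print(name, "->", ", ".join(phases_for(name)))
```

Keeping the order in one list and the per-profile choices in sets means a profile can only switch phases on or off; it cannot reorder them, which matches the idea that each phase feeds the next.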
To scope the pass to a specific asset type:

- `akm improve memory` (memory pool only)
- `akm improve skill` (skills only)
- `akm improve skill:deploy` (one specific asset)

To add extra guidance for the pass, which is useful when you know a particular focus area is relevant:

`akm improve --task "focus on deduplication in the build tooling notes"`

To cap the number of assets processed (highest-utility first by default):

`akm improve --limit 20`

The asset selection order is: assets with recent feedback signals first, then high-retrieval-count assets with no feedback, then everything else. Use `--require-feedback-signal` to restrict the pass to assets that have received explicit feedback and skip the retrieval fallback entirely.

A profile controls which phases run, which LLM connections are used, whether auto-sync fires at the end of the run, and the confidence threshold for auto-accepting proposals.

Built-in profiles:

| Profile | Phases | Auto-sync | Auto-push |
|---|---|---|---|
| `default` | All five | Yes | Yes |
| `thorough` | All five, larger batches | Yes | Yes |
| `quick` | Reflect only | No | No |
| `memory-focus` | Reflect + memory inference, memory and lesson types only | No | No |

Pass `--profile` to override for a single run:

- `akm improve --profile quick`
- `akm improve --profile memory-focus`

Define custom profiles in your config under `profiles.improve`.
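The asset selection order described earlier (recent feedback first, then retrieved-but-unrated assets, then everything else) amounts to a tiered sort. Here is a rough sketch of that idea, assuming details the article does not specify: the `Asset` fields, the "retrieval count above zero" cutoff for the second tier, and the tie-break by retrieval count within a tier are all assumptions, and `select_assets` is a hypothetical function, not part of akm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Asset:
    # Hypothetical record; akm's real asset model may differ.
    ref: str
    has_recent_feedback: bool
    retrieval_count: int


def selection_tier(asset: Asset) -> int:
    """0 = recent feedback, 1 = retrieved but no feedback, 2 = everything else."""
    if asset.has_recent_feedback:
        return 0
    if asset.retrieval_count > 0:  # assumed cutoff for "high retrieval count"
        return 1
    return 2


def select_assets(
    assets: list[Asset],
    limit: Optional[int] = None,
    require_feedback_signal: bool = False,
) -> list[Asset]:
    """Order assets by tier, then by retrieval count (descending), then cap."""
    if require_feedback_signal:
        # Mirrors the described flag: drop the retrieval fallback entirely.
        assets = [a for a in assets if a.has_recent_feedback]
    ordered = sorted(assets, key=lambda a: (selection_tier(a), -a.retrieval_count))
    return ordered if limit is None else ordered[:limit]


if __name__ == "__main__":
    stash = [
        Asset("memory:ci-note", False, 0),
        Asset("skill:deploy", False, 12),
        Asset("memory:build-workaround", True, 3),
    ]
    for asset in select_assets(stash, limit=2):
        print(asset.ref)
```

In this sketch, a cap like `--limit 20` corresponds to `limit=20`, applied after ordering so the cut always removes the lowest-priority assets.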