Claude Code users know the pain of repetition. Each "cold start" means restating goals, context, and code history—wasting tokens and patience. The Recall plugin for Claude Code durable memory eliminates this friction entirely offline. No API keys, no external calls, no hidden costs. Instead, Recall persistently and privately saves your session context, so every project picks up exactly where you left off. If you're frustrated with throwing away credits and time, Recall provides a zero-cost, privacy-first solution to keep your AI workflow humming—saving tokens and keeping your project discussions out of the cloud.
Recall is a free, offline plugin designed for developers running Claude Code locally. Its sole purpose: provide persistent session memory for Claude Code in a way that’s privacy-respecting and costs nothing extra. How does it work? Recall keeps a local append-only log of all your sessions. Every prompt, reply, file touched, and command run gets captured to disk, never leaving your machine.
Instead of needing to re-explain your goals and project structure every time you open Claude Code, Recall generates a concise summary from this log. That summary—built entirely on your local machine using a classical Python summarizer, not a large language model—gets injected into your next session. Result: your AI assistant understands the state of your project without you having to burn tokens on context dumps.
This approach has no dependency on external APIs, keys, or cloud services. Unlike many “AI memory” tools that upload session logs or use cloud summarization with billable model calls, Recall’s design is fundamentally local-first. Your usage never exposes code, paths, or secrets outside your machine. You can browse the plugin’s code and see the process yourself at the official GitHub repo.
Key features:
Token waste compounds when every session restarts from zero. Recall breaks this cycle and makes your Claude Code subscription go much further. There are two main mechanisms driving these savings:
1. Local summarization spends zero model tokens.
Normally, creating a session summary via an LLM requires sending prompts (and context) to a model, eating up valuable quota or credits—even if the summary isn't needed for user-facing output. Recall circumvents this: your session log is condensed using a classical Python summarization algorithm that runs entirely on your local CPU. Not a single Claude API call, and zero token cost for building or updating the summary.
2. Resuming from a compact summary instead of re-explaining.
Session resumes are expensive when you have to re-feed your project’s goals, context, and progress with every new Claude Code start. Recall provides a compact summary (typically around 1,000–2,000 tokens) that encapsulates your session state: what you’re building, next steps, files edited, and open threads. Instead of hundreds or thousands of tokens per session wasted on recap, you spend only what’s needed to inject the summary—shrinking your overall usage dramatically.
If you’re on a metered Claude Code subscription or API, this translates to real savings. No more burning up your monthly usage limit just to keep the agent up to speed. Instead, you can get several times the mileage out of each credit because memory and recap work is shifted fully offline.
Takeaway: Recall turns session context from a recurring expense into a fixed, zero-cost feature. Your subscription covers just your actual prompts and completions—never redundant context setup.
Setting up Recall is as straightforward as it gets—zero config, no new accounts, no external dependencies. Here’s all it takes:
Step 1: Install the Recall plugin
Clone the project or download the release from the GitHub repo. Place the plugin files in your Claude Code local instance’s plugin directory.
git clone ~/.claude-code/plugins/recall
Step 2: Start Claude Code locally, as usual
Recall is purpose-built for local Claude Code environments you control. Launch your usual Claude Code session—no flags, no API keys, no extra config files.
Step 3: Automatic memory capture begins
The moment the plugin loads:
summary.txt
, condensing your progress and open threads..recall/
in your project directory:
.recall/log.txt # Append-only, high-fidelity log of every session.
.recall/summary.txt # Overwritten each run; compact, session-ready summary.
You’ll never need to manually save, trigger recaps, or curate memory. Recall simply works in the background.
Step 4: smooth session resumes
When you restart Claude Code and load your project:
summary.txt
into the AI context up front.Tip: To confirm Recall is working, open your project’s .recall
directory. You should see both log.txt
and summary.txt
updating as you interact. If the files don’t appear, ensure you’re running a supported local Claude Code build and have the plugin in the correct directory.
Troubleshooting common issues:
.recall
files: double-check the plugin’s placement and permissions.[[DIAGRAM: local Claude Code session with Recall plugin — every action is logged to disk, summary file is generated locally, and both are used to resume sessions without API calls. Nothing leaves the machine.]]
AI session memory is almost always a privacy risk. Most plugins and tools that promise "long-term context" or "durable AI memory" achieve it by up transcripts or workspace data to remote endpoints or cloud LLM providers. Every time you recap your project for the agent, your code, directory structure, and even secrets risk leaking upstream.
Recall is designed to block this risk by default. All memory—session transcripts, code, paths, commands, and even mistakes—is logged and summarized purely on your workstation. Nothing is ever sent to any API, cloud service, or model endpoint. The classical Python summarizer does its work without internet access, never involving a third-party LLM. Your project never leaves your control at any point in the workflow.
For sensitive repositories, client codebases, regulated industries, or anyone simply unwilling to ship their working context to a vendor, this is the only viable AI memory option. Recall’s privacy guarantee is as strong as your local disk permissions—no network, no export, no leak path.
Takeaway: If you need memory but can’t compromise on privacy, Recall is the right tool. Local-first by design, not just marketing.
Before you add Recall to every coding stack, double-check the target environment:
.recall/
in each project. Depending on project size and session history, large logs could grow, though daily use cases (dozens of sessions, multi-week projects) are unlikely to outpace modern storage.If you’re considering scaling this across many machines or projects: the privacy model relies on every machine staying strictly local. Network sync or backup is your responsibility.
Virtually every other persistent AI memory or session summarization tool takes one of two approaches:
| Feature | Recall | Typical AI Memory Plugin |
|---|---|---|
| Session memory offline? | Yes (entirely local) | No (cloud / API) |
| Cost per summary | $0 (classical summarizer) | Metered ($/token) |
| API key/Internet needed? | No | Usually |
| Privacy guarantee | Code stays on disk | Often uploads |
Recall’s niche is clear: if you want persistent, Affordable, frictionless project context—without the privacy and data exposure risks of cloud summarization—there is simply no competition for local Claude Code setups. Alternatives charge per token, bill monthly, or demand trust in a third-party to store your session logs. Recall never touches the internet, never asks for a key, and never requires an account.
It’s the right solution for any developer who values privacy, cost control, and direct ownership of their working context.
If you’re running Claude Code on your own hardware, the Recall plugin for Claude Code durable memory is the missing piece: frictionless, free, and private. Stop burning credits on repetitive context dumps and keep your sessions flowing, securely and efficiently—all while never compromising on privacy or cost. Download Recall and let your coding sessions finally pick up right where you left off—no more cold starts.
For next steps on optimizing your Claude Code setup, see “How to Optimize Token Usage in LLM-based Coding Assistants,” “Local AI Development Environments: Setup and Best Practices,” and “Privacy and Security Best Practices for AI Code Tools.”