{"slug": "claude-tinderbox-search-your-claude-ai-conversation-history-locally-via-mcp", "title": "Claude-tinderbox: Search your Claude.ai conversation history locally via MCP", "summary": "A developer created Tinderbox, a local search system for Claude.ai conversation archives that allows any Claude session to query past conversations via two MCP tools. The system ingests exported conversation ZIP files into a Supabase database with hybrid semantic and full-text retrieval, achieving 68.7% top-1 accuracy on a 150-query test set across 676 conversations and 10,653 messages.", "body_md": "A personal claude.ai conversation archive: schema, ingest, embeddings, hybrid retrieval, and an MCP server that lets any Claude session search your own past conversations.\n\nStatus: working end-to-end. Used daily by the author. Not packaged for general consumption; see Caveats below.\n\nYou export your conversations from claude.ai, drop the ZIP into a watched directory, and within ~15 minutes your full archive is searchable from any Claude session via two MCP tools:\n\n- `tinderbox_search(query, limit=10)`: hybrid semantic + full-text retrieval over every message and artifact\n- `tinderbox_get_conversation(export_id, max_messages=50)`: pull the full thread of any conversation surfaced by a search\n\nEverything is local-ish: you bring your own Supabase free-tier project, your own Ollama install for embeddings, and a Mac (this is targeted at Apple Silicon).\n\nThe author's current archive: 676 conversations, 10,653 messages, 172 artifacts, 10,731 mxbai-embed-large vectors. Hybrid retrieval hits 68.7% top-1 / 88.7% top-10 on a frozen 150-query QA set generated by Haiku from the corpus itself (re-runs weekly via launchd).\n\nTinderbox stores statements, not facts. Every retrieval response renders the provenance inline: never `\"X is true\"`, always `\"on [date], in [conversation], [participant] said [content]\"`. 
The corpus answers what was said when, by whom; never what is true.\n\nThat's [design principle #1 from the schema doc](/luckyrmp/tinderbox-archive/blob/main/docs/STAGE_1_SCHEMA_PROPOSAL.md). Memorial archive, not extraction pipeline. Forward-linked when superseded, never backward-edited.\n\nFor context-window reasons it's also genuinely useful: a Claude session can look up its own past reasoning instead of re-deriving it.\n\nPostgres (Supabase) holds 12 tables under a `tinderbox` schema: schema versioning, ingest runs, conversations, messages, artifacts, attachments, embeddings (`vector(1024)` + hnsw), enrichment, named_instances, query log, and a frozen QA test set. A Python parser stream-reads claude.ai export ZIPs and upserts everything idempotently. An embed worker batches messages and artifacts through Ollama (`mxbai-embed-large`, 1024-dim) and writes vectors back. A server-side Postgres function (`tinderbox.hybrid_search`) ranks results by `(1 - cosine_distance) + 0.5 * ts_rank_cd`. A small from-scratch JSON-RPC 2.0 MCP server exposes two tools over stdio. Three launchd daemons run the whole thing on a schedule: inbox watcher (15min), QA eval (Sundays 03:00), staleness alerter (daily 09:00 with cooldown + debounce).\n\n- **macOS** with Apple Silicon, Python 3.14 (or 3.12+ probably; author runs 3.14.3_1)\n- **Supabase free-tier project**: $0/month for this scale. Optional [$4/mo IPv4 add-on](https://supabase.com/pricing) for proper RLS scoping (stage 5b).\n- **Ollama** running locally with `mxbai-embed-large` pulled (`ollama pull mxbai-embed-large`)\n- A claude.ai data export ZIP (Settings → Account → Export Data)\n\n```\n# 1. Clone\ngit clone <this repo> ~/tinderbox && cd ~/tinderbox\n\n# 2. Create your Supabase project, get URL + service-role key\n# 3. Render config + plists for your $HOME / $USER\n./parser/scripts/setup.sh\n\n# 4. 
Create your env file (path is configurable via TINDERBOX_ENV_FILE)\ncp .env.example ~/.tinderbox.env\n# … and fill in SUPABASE_URL, SUPABASE_SERVICE_KEY, etc.\n\n# 5. Apply the migrations to your Supabase project\n# (each migration file is plain SQL — run them in order via the Supabase\n# SQL editor, or via psql, or via your tool of choice)\nls migrations/\n\n# 6. Pull the embedding model\nollama pull mxbai-embed-large\n\n# 7. Set up the venv (the project uses a .pth bridge to share deps from\n# other venvs on the author's machine; you'll likely want to install\n# fresh — see parser/pyproject.toml for the deps list)\npython3 -m venv parser/venv\nparser/venv/bin/pip install supabase python-dotenv click httpx pydantic anthropic\n\n# 8. Drop your export ZIP into the inbox and watch it ingest\nmkdir -p inbox\nmv ~/Downloads/data-*.zip inbox/\nparser/venv/bin/python -m tinderbox.cli scan-inbox\n\n# 9. Embed everything\nparser/venv/bin/python -m tinderbox.cli embed\n\n# 10. Try a search\nparser/venv/bin/python -m tinderbox.cli search \"your test query\"\n\n# 11. 
Wire to Claude Code / Desktop: see docs/MCP_INSTALL.md\n```\n\nActivate the daemons (optional but recommended):\n\n```\nlaunchctl load ~/Library/LaunchAgents/com.$USER.tinderbox.scan.plist\nlaunchctl load ~/Library/LaunchAgents/com.$USER.tinderbox.qa.plist\nlaunchctl load ~/Library/LaunchAgents/com.$USER.tinderbox.staleness.plist\n```\n\n```\n.\n├── migrations/                    # Numbered SQL; apply to your Supabase project in order\n├── parser/\n│   ├── tinderbox/                 # Python package\n│   │   ├── parser/                # ZIP → typed records (streaming JSON, content-block parsing, artifact versioning)\n│   │   ├── ingest/                # Records → DB (upsert, retry, tombstone sweep, mass-tombstone canary)\n│   │   ├── embed/                 # mxbai-embed-large via Ollama, batched, idempotent, per-row fallback\n│   │   ├── search/                # Hybrid retrieval + query logging\n│   │   ├── qa/                    # Frozen-query-set eval (Haiku-generated, scheduled)\n│   │   ├── mcp/                   # Minimal JSON-RPC 2.0 MCP server (no SDK dep)\n│   │   ├── staleness.py           # Daily check w/ cooldown + debounce\n│   │   ├── cli.py                 # tinderbox <command>\n│   │   └── ...\n│   ├── tests/\n│   ├── scripts/                   # Setup, MCP launcher, surgical recovery\n│   └── launchd/templates/         # Plist templates filled by setup.sh\n├── docs/\n│   ├── STAGE_1_SCHEMA_PROPOSAL.md       # Design principles + table-by-table rationale\n│   ├── STAGE_1_COMPLETION_REPORT.md     # Bugs found + fixed during ingest\n│   ├── STAGE_2_COMPLETION_REPORT.md     # Embed + hybrid retrieval shipped\n│   ├── STAGE_5_COMPLETION_REPORT.md     # MCP server + IPv4 add-on\n│   ├── ACCEPTED_ADVISORIES.md           # Supabase advisor findings + accepted/applied/deferred\n│   ├── MCP_INSTALL.md                   # Claude Code / Desktop config snippets\n│   └── STAGE_2_HANDOFF.md               # Inter-session handback (historical)\n└── README.md (this 
file)\n```\n\n- **Hardcoded paths.** The author runs everything under `~/tinderbox/`. `parser/scripts/setup.sh` renders the launchd plists for your `$HOME` / `$USER`, but the Python defaults still assume that root. `TINDERBOX_*` env vars override every default; set them up in your env file.\n- **No tests for end-to-end MCP from a real client.** The smoke test in `parser/tests/test_mcp_smoke.py` spawns the server as a subprocess and exchanges minimal protocol messages. Real validation is \"does Claude Code surface the tool\" (verified) and \"does the tool return useful results\" (eyeballed).\n- The MCP server still authenticates via `service_role` (RLS bypassed), a holdover from stage 1. Documented in `docs/STAGE_5_COMPLETION_REPORT.md`. This is fine until you start differentiating privacy classes; then the stage 5b auth swap to a direct `tinderbox_owner` connection becomes urgent.\n- **No SQLite option.** Supabase only. The free tier easily handles this scale; if you want 100% local, you'll need to translate the schema and rewrite the DB layer (~5-6 hrs of work; author chose not to).\n- **macOS only.** launchd schedules, paths, and the `.pth` venv bridge are macOS conventions. Linux would need systemd units and a different venv approach. Not difficult, just not done.\n- **Author's archive shape baked into a few decisions.** The 5 MB `RAW_CONTENT_BYTE_LIMIT` was chosen because the author's largest message is 57 MB. The stratified sampler in `qa/sample.py` uses bucket sizes (long ≥20 msgs, short 3-10 msgs) tuned to the author's distribution. Both are easy to retune.\n\nPick one. The author is open to anything that lets people fork and adapt without obligation.\n\nBuilt collaboratively across many Claude sessions over a few days in April 2026. 
Schema design → parser → embed → search → QA → MCP server, mostly autonomous, with the human (Lucky) stepping in at architecture decisions and pushing back when something didn't smell right.\n\nThe system can now query the very conversations that built it.", "url": "https://wpnews.pro/news/claude-tinderbox-search-your-claude-ai-conversation-history-locally-via-mcp", "canonical_source": "https://github.com/luckyrmp/tinderbox-archive", "published_at": "2026-06-06 03:39:45+00:00", "updated_at": "2026-06-06 04:16:26.227607+00:00", "lang": "en", "topics": ["ai-tools", "large-language-models", "natural-language-processing", "ai-infrastructure", "generative-ai"], "entities": ["Claude.ai", "Supabase", "Ollama", "Haiku", "mxbai-embed-large", "launchd", "MCP", "Apple Silicon"], "alternates": {"html": "https://wpnews.pro/news/claude-tinderbox-search-your-claude-ai-conversation-history-locally-via-mcp", "markdown": "https://wpnews.pro/news/claude-tinderbox-search-your-claude-ai-conversation-history-locally-via-mcp.md", "text": "https://wpnews.pro/news/claude-tinderbox-search-your-claude-ai-conversation-history-locally-via-mcp.txt", "jsonld": "https://wpnews.pro/news/claude-tinderbox-search-your-claude-ai-conversation-history-locally-via-mcp.jsonld"}}