A local-first MCP server that grounds LLMs in your own files.
UCP indexes folders on your machine — notes, code, conversation exports — and exposes them to any MCP-compatible client (Claude Desktop, Cursor, LM Studio, and other local-agent runtimes) as a single tool: search_local_context
. Hybrid retrieval (BM25 + vector), tree-sitter-aware code chunking, full citations, content-hash embedding cache. Single binary. No telemetry. No cloud.
Paired with a local model in LM Studio (or Ollama via ucp-local ask
), the whole stack — indexing, embeddings, retrieval, and the chat model — runs fully offline. Works on a plane, in an air-gapped facility, or anywhere a cloud LLM isn't an option.
Conversation memory — make every past Claude chat searchable across every future session.
Air-gap RAG — local Ollama + local index, zero network traffic.
Quick start — install, index, ask, in under a minute.
| If you are… | UCP gives you… |
|---|---|
| A Claude / Cursor / LM Studio power user | |
A searchable archive of every past AI conversation, callable from any future session as the search_local_context tool. |
|
| A software engineer | |
| Code + private docs + sibling repos + past Claude chats unified under one MCP tool — surfaced inside Cursor or Claude Code alongside their native indexers. | |
| A researcher, writer, or academic | |
| A PDF + notes corpus you can ask grounded questions against, with line-level citations, without anything leaving the machine. | |
| In a privacy-regulated workflow (legal, medical, defense, NDA-bound IP) | |
| A single Rust binary with zero telemetry and zero cloud. Pair with LM Studio for a fully offline, end-to-end RAG stack. | |
| A solo founder or consultant | |
Per-folder client isolation via folder_filter — no risk of leaking client A's context into client B's session. |
Full audience analysis, competitive comparison, and the two wedges UCP is explicitly built to win on: see POSITIONING.md.
v0.1, headless. Track scope in ROADMAP.md.
What ships:
- Hybrid search: SQLite FTS5 (BM25) ⨉
sqlite-vec
(ANN) merged via reciprocal-rank fusion. - Tree-sitter chunking for Rust, Python, TypeScript/JavaScript. Heading-aware Markdown. Sentence-bounded prose fallback.
- Conversation memory: ingest your Claude
conversations.json
export and search across past chats. - PII masking on by default — email, OpenAI
sk-
, AWS keys, GitHub PATs, JWT. - Content-hash embedding cache: re-indexing unchanged content makes zero Ollama calls.
- Filesystem watcher: edit a file, the index updates in ~500ms.
What's not in v0.1:
- Desktop UI / tray (deferred — was in original spec, now in ROADMAP tier 2+).
- OS hotkey injector and HTTP proxy interceptor (cut from the original spec).
- OpenAI / Anthropic embedding providers (Ollama only for now).
- Cursor and ChatGPT export formats (Claude only; others later).
UCP needs three things on your machine: Rust (to build), Ollama (to embed and optionally chat), and Poppler (for robust PDF text extraction — recommended).
brew install ollama poppler
ollama serve & # or use the menu-bar app
ollama pull nomic-embed-text
sudo apt install poppler-utils
curl -fsSL https://ollama.com/install.sh | sh
ollama pull nomic-embed-text
sudo dnf install poppler-utils
curl -fsSL https://ollama.com/install.sh | sh
ollama pull nomic-embed-text
choco install poppler ollama # or install each manually
ollama pull nomic-embed-text
Rust (stable, edition 2024) is needed only to build from source. If you install a pre-built UCP binary, skip the Rust install.
Poppler is optional but recommended.Without it, UCP only uses the bundledpdf-extract
for PDFs, which struggles with PDFs whose body fonts lack a ToUnicode CMap (you'll see headings extract but body text go missing). Withpdftotext
from Poppler on PATH, UCP falls back to it automatically.
Note on the name.The crate is published ason crates.io — the bareucp-local
ucp
name was taken. The binary on yourPATH
is alsoucp-local
(that's what you type on the command line), and the library is imported asuse ucp_local::...
.
cargo install ucp-local
git clone <repo-url> ucp-local
cd ucp-local
cargo build --release
cargo install --path . # optional, to put `ucp-local` on your PATH
ucp-local index ~/Documents/notes
ucp-local index ~/Documents/notes ~/code/my-project ~/research
ucp-local watch ~/code/my-project
ucp-local clear
ucp-local clear ~/Documents/notes
ucp-local clear --hard --yes
ucp-local ingest-conversations ~/Downloads/claude-export/conversations.json
ucp-local status
ucp-local serve
ucp-local search "your query here"
ucp-local search "rate limiting" --folder ~/code/my-project --limit 10
ucp-local ask "what does the rate limiter do when a token bucket runs out?"
ucp-local ask "summarize my Q3 plan" --model qwen2.5
UCP speaks MCP over stdio, so any client that launches MCP servers can use it. Same serve
command, different config file per client.
Add to ~/Library/Application Support/Claude/claude_desktop_config.json
on macOS (%APPDATA%\Claude\claude_desktop_config.json
on Windows):
{
"mcpServers": {
"ucp-local": {
"command": "/full/path/to/ucp-local",
"args": ["serve"]
}
}
}
Restart Claude Desktop. The search_local_context
tool will be available — ask something grounded in your indexed files and it'll cite them inline.
Cursor reads MCP servers from ~/.cursor/mcp.json
(or per-project .cursor/mcp.json
):
{
"mcpServers": {
"ucp-local": {
"command": "/full/path/to/ucp-local",
"args": ["serve"]
}
}
}
Reload Cursor. The chat sidebar will surface search_local_context
as a tool — useful for grounding the agent in repos and docs Cursor's own @codebase
indexer can't reach (private notes, conversation history, sibling repos).
LM Studio 0.3.17+ supports MCP. Open the chat settings, find the MCP servers section, and add:
{
"mcpServers": {
"ucp-local": {
"command": "/full/path/to/ucp-local",
"args": ["serve"]
}
}
}
Pair UCP with any local model you've downloaded in LM Studio (Llama, Qwen, Mistral, etc.). Now your indexing, embeddings, retrieval, and chat model all run on the same machine — no cloud, no network — and the LLM can still call search_local_context
to ground its answers in your files.
Any client following the MCP spec (Zed, Continue.dev, Goose, custom Agent SDK apps, etc.) takes the same command
args
shape. If your client expects a JSON-RPC stdio server, point it at ucp-local serve
and you're done.
~/.config/ucp/config.toml
(or the platform equivalent — ucp-local status
prints the resolved path). All fields optional; defaults shown:
[ollama]
host = "http://localhost:11434"
embedding_model = "nomic-embed-text"
[chunking]
max_tokens = 512
overlap_sentences = 1
By extension: md
, markdown
, txt
, rs
, py
, ts
, tsx
, js
, jsx
, mjs
, go
, pdf
.
PDFs:text is extracted viapdf-extract
and chunked as prose. Works well for digitally generated PDFs (papers, docs, exported notes). Falls down on scanned image-only PDFs — those need OCR (v0.2+). Citation line numbers reference the extracted plaintext, not PDF page numbers; page-aware citations are on the v0.2 list.
Skipped directories: .git
, .idea
, .vscode
, target
, node_modules
, __pycache__
, .venv
, venv
, dist
, build
, .next
, .nuxt
, coverage
, .pytest_cache
, .mypy_cache
. Dotfiles are skipped.
| Module | Role |
|---|---|
ingestion |
|
| Masking + per-format chunkers (prose / markdown / code via tree-sitter) + dispatcher | |
storage |
|
rusqlite + sqlite-vec + FTS5; hybrid search via RRF |
|
embeddings |
|
OllamaClient + content-hash cache via EmbeddingCache::hash |
|
indexer |
|
| Walk + read + chunk + embed + insert; single-file and bulk-chunk paths | |
watcher |
|
notify -based debounced re-index |
|
mcp |
|
JSON-RPC 2.0 stdio server, one tool: search_local_context |
See CLAUDE.md for the developer-facing architecture summary, and Universal Context Pipeline Specification.md for the original (now narrower in scope) design doc.
cargo test # full test suite
cargo test --lib ingestion # one module
cargo run -- index <path> # iterate against the dev build
RUST_LOG=ucp_local=info cargo run -- watch <path> # verbose
Release history and notes live in CHANGELOG.md. The current published version is 0.1.0 (crates.io).
Under Apache-2.0.