A self-hosted, two-tier Graphiti knowledge graph that MCP clients (for example Claude Code and Pi) read from and write to over a private Tailscale network. It's offline-first: by default every part, including the LLM that extracts your graph, runs on your own hardware, so nothing leaves the box.
It runs on a single always-on Linux host with Docker and a consumer NVIDIA GPU. Your laptops and other devices are pure clients: they host nothing.
Knowledge-graph ingestion uses an LLM to extract entities and relationships from text. That
extraction is where your data would be exposed to a model, so by default commonplace
does it locally, on your GPU, for both tiers. The two tiers split memory by confidentiality and by whether you're allowed to trade locality for quality:
| Tier | Graph | Extraction (default) | Where it runs | Use for |
|---|---|---|---|---|
| personal | `commonplace_personal` | `mistral:7b-instruct-q4_0` (local) | the host's GPU | your own notes, projects, life; optionally a hosted model for quality |
| client-confidential | `commonplace_client` | `mistral:7b-instruct-q4_0` (local) | the host's GPU | confidential / NDA material that must never leave the machine |
The personal tier is local by default but may be pointed at a hosted model (e.g. Claude Haiku)
for higher-quality graphs on non-confidential data; opt in via `.env`
(see Hosted upgrade? under Setup). The client tier is always local; that's the whole point of it.
Retrieval is cheap and private on both tiers. Search is embeddings + BM25 + graph traversal with no LLM in the query path. The GPU only ever does slow, asynchronous background extraction, so query latency is never affected. Slow local extraction is therefore fine.
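To make the "no LLM in the query path" point concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge ranked lists such as an embedding ranking and a BM25 ranking using arithmetic alone. It is illustrative only: the function name, the sample node IDs, and the constant `k=60` are assumptions, not Graphiti's actual code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the result is stable.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical result lists from two retrievers.
embedding_hits = ["node-a", "node-b", "node-c"]
bm25_hits = ["node-b", "node-d", "node-a"]
fused = reciprocal_rank_fusion([embedding_hits, bm25_hits])
print(fused)
```

Items that rank well in both lists rise to the top, and no model call is involved at query time.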
Both tiers share one embedder (Ollama `nomic-embed-text`, 768-dim) and one FalkorDB holding two separate graphs, so the two memories stay isolated but the infrastructure stays simple.
flowchart TB
CC["Claude Code<br/>(client)"]
PI["Pi<br/>(client)"]
TS{{"Tailscale<br/>MagicDNS · tailnet-only"}}
ANT["Anthropic API<br/>Claude Haiku 4.5 · hosted"]
CC --> TS
PI --> TS
subgraph HOST["your server (Docker)"]
direction TB
GW["<b>gateway</b> :8000 / :8001<br/>per-tier auth · logging · metrics"]
MP["<b>mcp-personal</b><br/>personal tier · internal"]
MC["<b>mcp-client</b><br/>client-confidential · internal"]
GW --> MP
GW --> MC
OL["<b>Ollama</b> :11434<br/>nomic-embed-text · mistral:7b<br/>local GPU"]
subgraph FALKOR["FalkorDB :6379 · browser UI :3000"]
direction LR
GP[("commonplace_personal")]
GC[("commonplace_client")]
end
MP -->|store| GP
MC -->|store| GC
MP -. embed .-> OL
MC -. embed .-> OL
MC -->|extract · local| OL
end
TS -->|Bearer token| GW
MP -->|extract · hosted| ANT
classDef ext fill:#fff3e0,stroke:#e67e22,color:#111;
classDef tier fill:#e8f0fe,stroke:#4285f4,color:#111;
class ANT ext
class MP,MC tier
- One FalkorDB, two graphs selected per-instance by `FALKORDB_DATABASE` (`commonplace_personal` vs `commonplace_client`).
- Two Graphiti MCP instances (`commonplace-mcp:local`, built from `zepai/knowledge-graph-mcp:standalone`; see `Dockerfile`), HTTP transport, served at path `/mcp/` (trailing slash).
- One shared Ollama embedder (`nomic-embed-text`, 768-dim) used by both instances. Do not mix embedders: vectors from different embedders are not comparable.
- A gateway (Caddy) fronts both tiers: it owns the host ports, requires a per-tier bearer token (so a client with only the client token can't reach the personal tier), and emits access logs (audit) + Prometheus metrics. The MCP containers themselves are internal-only.
Replace `your-server.your-tailnet.ts.net` with your host's Tailscale MagicDNS name throughout (run `tailscale status` on the host to find it).
| Tier | Host endpoint (tailnet) | Internal port | Graph (`FALKORDB_DATABASE`) | LLM | `SEMAPHORE_LIMIT` |
|---|---|---|---|---|---|
| personal | `http://your-server.your-tailnet.ts.net:8000/mcp/` | 8000 | `commonplace_personal` | `mistral:7b…` (local, default) | 1 |
| client | `http://your-server.your-tailnet.ts.net:8001/mcp/` | 8000 | `commonplace_client` | `mistral:7b-instruct-q4_0` | 1 |
| FalkorDB | `127.0.0.1:6379` (host-local only) | 6379 | both graphs | - | - |
| FalkorDB UI | `http://your-server.your-tailnet.ts.net:3000` | 3000 | browse either graph | - | - |
| Metrics | `127.0.0.1:9180/metrics` (host-local only) | 9180 | gateway (Prometheus) | - | - |
The personal/client endpoints require `Authorization: Bearer <tier-token>` (set `PERSONAL_TOKEN` / `CLIENT_TOKEN` in `.env`). A request without the right token gets `401`.
On the host:
- Docker with Compose v2.
- Ollama running on the host, serving the shared embedder and the local extraction model. The MCP containers reach it over HTTP; the GPU is used by Ollama, not by the containers, so no GPU passthrough into Docker is required. A consumer NVIDIA GPU with ~8 GB VRAM runs `mistral:7b-instruct-q4_0` comfortably; CPU-only works but local extraction is slow.
- Tailscale: the MCP endpoints are served over the tailnet, not the public internet.
- No API keys required. Both tiers extract locally by default. An Anthropic API key is needed only if you opt the personal tier into a hosted model (see Hosted upgrade? below).

On each client (laptop, etc.): Tailscale, plus an MCP-capable client (Claude Code, Pi, …).
Run on the host, from a clone of this repo (e.g. `~/commonplace`):
ollama pull nomic-embed-text
ollama pull mistral:7b-instruct-q4_0
cp .env.example .env
docker compose up -d
docker compose ps # all services should report healthy
Then point a client at the two endpoints; see Client configuration.
Hosted upgrade? Everything is local by default. To point the personal tier at a hosted model for higher-quality graphs (non-confidential data only), set in `.env`: `PERSONAL_LLM_PROVIDER=anthropic`, `PERSONAL_LLM_MODEL=claude-haiku-4-5`, `PERSONAL_SEMAPHORE_LIMIT=5`, and `ANTHROPIC_API_KEY=…`. The client tier stays local regardless.
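A quick pre-flight check can catch a half-configured hosted upgrade before the stack is restarted. The sketch below is a hypothetical helper, not part of the repo: it parses `.env`-style text and reports a missing `ANTHROPIC_API_KEY` when `PERSONAL_LLM_PROVIDER` is set to `anthropic`. The function names and the sample text are assumptions; only the variable names come from the paragraph above.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def hosted_upgrade_problems(env):
    """Return a list of problems; an empty list means the settings agree."""
    if env.get("PERSONAL_LLM_PROVIDER") != "anthropic":
        return []  # local default: nothing hosted to validate
    problems = []
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is missing")
    return problems


# Sample text standing in for a real .env file (assumed content).
sample = """
# personal tier, hosted
PERSONAL_LLM_PROVIDER=anthropic
PERSONAL_LLM_MODEL=claude-haiku-4-5
PERSONAL_SEMAPHORE_LIMIT=5
"""
print(hosted_upgrade_problems(parse_env(sample)))
```

To check a real file, pass `open(".env").read()` to `parse_env` instead of the sample string.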
Upgrading from a pre-gateway deploy? Add `PERSONAL_TOKEN` / `CLIENT_TOKEN` to `.env`, then `docker compose up -d --build --force-recreate` (the MCP tiers move behind the gateway and the ontology change needs a recreate). Re-add each client with its `Authorization: Bearer` header; existing token-less clients will start getting `401`.
These are the landmines specific to the current (2026) Graphiti MCP server. Several contradict older docs.
1. There is no `openai_generic` provider string. To use Ollama you set `provider: "openai"` and point `api_url` at a non-OpenAI URL; the server then auto-selects its `OpenAIGenericClient` internally. That generic client is what avoids OpenAI's beta `responses.parse()` (which Ollama does not implement). Setting `provider: "openai_generic"` is invalid.
2. There is no `small_model` setting. The MCP server has a single `llm.model`. On the openai path it uses that same model for the "small" slot too. The infamous `gpt-4.1-mini` is only a fallback used when `model` is `None`; pinning `llm.model` is enough to never hit it.
3. `json_schema` structured output is always on for the local path and cannot be disabled, and `instructor` is not used there. Retries are built in (tenacity, 4 attempts). There is no config knob for either. If a small local model produces invalid JSON, the only lever is a more capable model.
4. Ollama must be reachable from inside the containers. Ollama runs on the host, so each MCP service needs `extra_hosts: ["host.docker.internal:host-gateway"]` and an `api_url` of `http://host.docker.internal:11434/v1`. Ollama must listen on `0.0.0.0:11434`; a stock install binds to `127.0.0.1` only, so set `OLLAMA_HOST=0.0.0.0` if the containers cannot reach it.
5. `FALKORDB_DATABASE` selects the graph; `group_id` does not. Two graphs in one FalkorDB = two instances with the same `FALKORDB_URI` and different `FALKORDB_DATABASE`. `group_id` only namespaces nodes within a graph.
6. FalkorDB host/port are parsed from `FALKORDB_URI`; `FALKORDB_HOST` / `FALKORDB_PORT` are ignored. The only env overrides read are `FALKORDB_URI` and `FALKORDB_PASSWORD`.
7. FalkorDB password is set via `REDIS_ARGS=--requirepass …`, an env var, not by overriding the container `command` (that would stop the FalkorDB module from loading).
8. Use the `:standalone` image, not `:latest`. `zepai/knowledge-graph-mcp:latest` bundles its own FalkorDB; `:standalone` expects an external one, which is required to share a single FalkorDB across two instances.
9. The MCP path has a trailing slash: `/mcp/` (FastMCP default; not configurable).
10. Anthropic model id: use the bare alias `claude-haiku-4-5`, not `claude-haiku-4-5-latest`. The `-latest` suffix is an OpenAI-ism; the Anthropic API 404s on it (`not_found_error: model`). The bare alias resolves to the current dated snapshot (`claude-haiku-4-5-20251001`).
11. The Anthropic provider needs an explicit numeric `llm.temperature`. graphiti passes `temperature=config.temperature`; with none set it sends `null` and the API 400s (`temperature: Input should be a valid number`), so every personal-tier episode queues but never processes. The OpenAI/Ollama generic client tolerates `null`, so this bites only the Anthropic tier. Set e.g. `temperature: 0.0`.
12. The `:standalone` image ships WITHOUT the `anthropic` SDK. `provider: anthropic` then fails at startup with "Anthropic client not available in current graphiti-core version" (the factory's `HAS_ANTHROPIC` is False because `import anthropic` raises). The bundled `Dockerfile` adds it (`uv pip install anthropic`).
13. graphiti-core builds a default OpenAI reranker at init that demands `OPENAI_API_KEY` even though the search path uses `NODE_HYBRID_SEARCH_RRF` (no cross-encoder). Give each tier a dummy `OPENAI_API_KEY` so it can construct; point `OPENAI_BASE_URL` at Ollama so even an accidental call stays on-box. In practice it is never called.
14. FastMCP rejects non-localhost Host headers with HTTP 421 "Invalid Host header". It auto-enables DNS-rebinding protection with a localhost-only allow-list at construction and passes that object explicitly into its pydantic Settings, so the `FASTMCP_…` env vars cannot override it (init kwargs beat env). The bundled `patch_transport_security.py` (run in the Dockerfile) disables the protection; this is safe on a tailnet, where the network is the trust boundary and clients are agents, not browsers. To tighten, set explicit `allowed_hosts` instead.
15. The container env var for the OpenAI-compatible base URL is `OPENAI_API_URL` (graphiti's config expansion), not `OPENAI_BASE_URL`. Note the reranker (#13) is the opposite: it reads the OpenAI SDK's `OPENAI_BASE_URL`. Two different names for two different clients.
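Several of the landmines above are mechanical enough to check before a deploy. The sketch below is a hypothetical linter, not something the repo ships: it takes a plain dict assumed to mirror the `llm` section of a tier config (`provider`, `model`, `temperature`, `api_url`) and flags the `openai_generic` provider, an unset model, a `-latest` Anthropic model id, and a missing numeric temperature on the Anthropic path.

```python
def lint_llm_config(llm):
    """Check a dict shaped like the `llm` config section (assumed layout)."""
    problems = []
    provider = llm.get("provider")
    model = llm.get("model")
    temperature = llm.get("temperature")

    if provider == "openai_generic":
        problems.append('provider "openai_generic" is invalid; use "openai" plus api_url')
    if model is None:
        problems.append("model is unset; the server would fall back to gpt-4.1-mini")
    if provider == "anthropic":
        if isinstance(model, str) and model.endswith("-latest"):
            problems.append("Anthropic model ids take no -latest suffix")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
            problems.append("Anthropic needs a numeric temperature, e.g. 0.0")
    return problems


# Example inputs; the dict shapes are illustrative assumptions.
local = {
    "provider": "openai",
    "model": "mistral:7b-instruct-q4_0",
    "api_url": "http://host.docker.internal:11434/v1",
}
hosted = {"provider": "anthropic", "model": "claude-haiku-4-5-latest"}

print(lint_llm_config(local))
for problem in lint_llm_config(hosted):
    print("-", problem)
```

The local config passes cleanly, while the hosted one trips both Anthropic-specific checks.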
Run on the host, from the repo directory (e.g. `~/commonplace`).
Redeploy in one command: `scripts/commonplace` wraps the pull → rebuild → recreate flow (symlink it onto your `PATH`, e.g. `ln -sf "$PWD/scripts/commonplace" ~/.local/bin/commonplace`):
commonplace update # sync repo, rebuild image, recreate config-sensitive services
commonplace update --reset # same, but hard-reset to origin/main (after a force-push)
commonplace status # service health + graph counts
The underlying compose commands, if you'd rather run them by hand:
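docker compose up -d                               # start (or reconcile) the whole stack
docker compose ps                                  # service status and health
docker compose logs -f mcp-personal                # or mcp-client, falkordb
docker compose up -d --force-recreate mcp-client   # recreate one service, e.g. after a config change
docker compose up -d --build                       # rebuild the image, then start
docker compose stop                                # stop containers; containers and data are kept
docker compose start                               # start the stopped containers again
docker compose down                                # remove containers and network; named volumes are kept
docker compose down -v                             # also delete named volumes: this erases the graphs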
Quick MCP health check (from a client, over the tailnet or LAN). Without a token you get `401` (auth working); with the right tier token you get `307`:
curl -s -o /dev/null -w "%{http_code}\n" -H "Authorization: Bearer $PERSONAL_TOKEN" \
http://your-server.your-tailnet.ts.net:8000/mcp/
curl -s -o /dev/null -w "%{http_code}\n" -H "Authorization: Bearer $CLIENT_TOKEN" \
http://your-server.your-tailnet.ts.net:8001/mcp/
./scripts/graph_stats.sh # writes landing per tier
./scripts/mcp_activity.sh # reads/writes per tier from the gateway log
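The curl health check above can also be scripted. The following sketch uses only the standard library; `http.client` does not follow redirects, so the raw `307` stays visible. The helper names `probe` and `interpret` are hypothetical, and the status meanings are taken from the description above rather than from the gateway's source.

```python
import http.client


def probe(host, port, token=None, path="/mcp/", timeout=5):
    """Return the raw HTTP status for GET path; redirects are not followed."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()


def interpret(status):
    """Map a status code to the meaning described for the health check."""
    if status == 401:
        return "gateway reachable; token missing or wrong"
    if status == 307:
        return "token accepted"
    return f"unexpected status {status}"


# Offline demonstration of the mapping; probe() itself needs the tailnet.
for code in (401, 307, 502):
    print(code, "->", interpret(code))
```

On a machine that can reach the host, `interpret(probe("your-server.your-tailnet.ts.net", 8000, token))` gives the personal-tier result; use port `8001` with the client token for the other tier.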
FalkorDB persists to the `falkordb_data` volume, mounted at its actual data dir (`/var/lib/falkordb/data`), with AOF enabled (`--appendonly yes`), so writes are durable to ~1s and survive container recreates. Back up / restore the whole data dir (RDB + AOF) with the scripts:
./scripts/backup.sh # -> ./backups/falkordb-<stamp>.tar.gz
./scripts/restore.sh ./backups/falkordb-<stamp>.tar.gz # overwrites live data (prompts to confirm)
Both read `FALKORDB_PASSWORD` from `.env`. `backup.sh` asks the server for its data dir, so it keeps working even if the path changes.
Earlier revisions mounted the volume at `/data` while FalkorDB wrote to `/var/lib/falkordb/data` on the ephemeral container layer, so data was lost on every `--force-recreate`. The mount path is now fixed; redeploy with `commonplace update` to apply it.
- Default: MagicDNS + port. The gateway binds `:8000` / `:8001` on the host and is reached over the tailnet at `http://your-server.your-tailnet.ts.net:8000/mcp/` and `:8001/mcp/`. This is tailnet-reachable (and LAN-reachable) but not public; do not port-forward these on your router.
- Auth. Every request needs `Authorization: Bearer <tier-token>`; the gateway 401s otherwise. Separate `PERSONAL_TOKEN` / `CLIENT_TOKEN` give each client only the tiers it should touch.
- FalkorDB `:6379` and metrics `:9180` bind to `127.0.0.1` only (host-local), never on the tailnet.
- Keep the host single-homed. The host's primary interface should hold exactly one IPv4. If a second address appears (e.g. a static IP plus a DHCP lease), Tailscale can advertise two WireGuard endpoints and the tunnel flaps, which black-holes TCP over MagicDNS while the LAN and `tailscale ping` still appear to work (disco pings roam across endpoints; real TCP does not). On Ubuntu this most often comes from cloud-init re-enabling DHCP; disable its network management (`echo 'network: {config: disabled}' | sudo tee /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg`). Symptom to watch for: `ip -brief addr show <iface>` listing more than one address on your LAN subnet.
- HTTPS upgrade (optional). To serve the MCP endpoints as tailnet-only HTTPS names instead of raw ports:

tailscale serve --bg --https=8443 http://localhost:8000 # personal
tailscale serve --bg --https=8444 http://localhost:8001 # client

then point clients at `https://your-server.your-tailnet.ts.net:8443/mcp/` etc. MagicDNS:port is the simpler default and is what the client config below uses.
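The single-homed check described above is easy to automate. This is a minimal sketch, assuming the usual `ip -brief addr show <iface>` layout of interface name, state, then one or more `address/prefix` fields; the function names and the sample line are made up for illustration.

```python
import ipaddress


def ipv4_addresses(ip_brief_line):
    """Extract IPv4 interface addresses from one `ip -brief addr show` line.

    Expected shape: "<iface> <state> <addr/prefix> [<addr/prefix> ...]".
    """
    found = []
    for field in ip_brief_line.split()[2:]:
        try:
            iface = ipaddress.ip_interface(field)
        except ValueError:
            continue  # not an address (a stray token); ignore it
        if iface.version == 4:
            found.append(iface)
    return found


def is_single_homed(ip_brief_line):
    """True when the interface holds exactly one IPv4 address."""
    return len(ipv4_addresses(ip_brief_line)) == 1


# Made-up sample: a static IP plus a DHCP lease on the same subnet.
sample = "eth0  UP  192.168.1.10/24 192.168.1.57/24 fe80::1234:5678:9abc:def0/64"
print(is_single_homed(sample))
```

IPv6 link-local addresses are deliberately ignored, since the failure mode described above concerns a second IPv4 address.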
Replace `your-server.your-tailnet.ts.net` with your host's Tailscale MagicDNS name (`tailscale status`). The identical ports/paths are also served on the host's LAN IP, which is a handy fallback if MagicDNS is ever unreachable.
Pass the per-tier bearer token with `--header`. Give a client only the tiers it should reach (e.g. omit the personal server on a machine that handles confidential work):
claude mcp add --scope user --transport http commonplace-personal http://your-server.your-tailnet.ts.net:8000/mcp/ \
--header "Authorization: Bearer $PERSONAL_TOKEN"
claude mcp add --scope user --transport http commonplace-client http://your-server.your-tailnet.ts.net:8001/mcp/ \
--header "Authorization: Bearer $CLIENT_TOKEN"
claude mcp list # both should report ✓ Connected
(New servers load on the next Claude Code start.)
Pi has no native MCP support; add the community bridge, then a global `mcp.json`:
pi install npm:@spences10/pi-mcp # records the bridge in settings.json
Each server entry must include `"type": "http"`; a `url`-only entry triggers an OAuth handshake this server doesn't support. The extension lazy-connects by default; set `MY_PI_MCP_EAGER_CONNECT=1` to connect and discover tools at startup.
{
"mcpServers": {
"commonplace-personal": {
"type": "http",
"url": "http://your-server.your-tailnet.ts.net:8000/mcp/",
"headers": { "Authorization": "Bearer YOUR_PERSONAL_TOKEN" }
},
"commonplace-client": {
"type": "http",
"url": "http://your-server.your-tailnet.ts.net:8001/mcp/",
"headers": { "Authorization": "Bearer YOUR_CLIENT_TOKEN" }
}
}
}
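If you configure several machines, generating this file avoids hand-editing tokens into JSON. The sketch below is a hypothetical helper (not part of the repo or of the Pi bridge) that builds the same structure as the example above and leaves out any tier whose token is not supplied, which matches the advice to give a client only the tiers it should reach.

```python
import json


def build_mcp_config(host, personal_token=None, client_token=None):
    """Build an mcp.json-style dict; a tier without a token is omitted."""
    tiers = (
        ("commonplace-personal", 8000, personal_token),
        ("commonplace-client", 8001, client_token),
    )
    servers = {}
    for name, port, token in tiers:
        if not token:
            continue  # skip tiers this client should not reach
        servers[name] = {
            "type": "http",  # required; a url-only entry triggers OAuth
            "url": f"http://{host}:{port}/mcp/",
            "headers": {"Authorization": f"Bearer {token}"},
        }
    return {"mcpServers": servers}


# Placeholder host and token from this README; substitute real values.
config = build_mcp_config(
    "your-server.your-tailnet.ts.net",
    client_token="YOUR_CLIENT_TOKEN",
)
print(json.dumps(config, indent=2))
```

Here only the client tier is emitted, as it would be for a machine that handles confidential work.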
Any device on the tailnet can use the same two endpoints; there is nothing per-client on the server. To add one:
- Join the device to the tailnet (`tailscale up`) and confirm it can reach the host (`tailscale ping your-server`).
- For Claude Code, run the two `claude mcp add … /mcp/` commands above (user scope).
- For any MCP client, add both servers with `"type": "http"` pointing at `:8000/mcp/` and `:8001/mcp/`.
- Nothing to change on the host: graphs and auth are shared; reads/writes from the new client land in the same two graphs.
- For HTTPS, expose via `tailscale serve` (above) and use the `https://…` URLs instead.
Two things turn this from a memory store into a memory system agents use well:
- Per-tier ontology. Each tier defines `graphiti.entity_types` in its config (personal: Preference, Project, Person, Decision, …; client: Engagement, Stakeholder, Requirement, Risk, …). These type descriptions constrain extraction, which is the single biggest lever on graph quality, and they help the weak local model the most.
- An agent protocol. `docs/memory-protocol.md` is the contract for any client (Claude Code, Pi): search before answering, write durable facts, never cross tiers (no confidential data on the hosted personal tier), and cite what you used. Install it as a skill or system prompt; without it, agents rarely call memory and the graph stays empty.
Is it actually being used? `scripts/graph_stats.sh` shows whether writes are landing; `scripts/mcp_activity.sh` (and the Prometheus endpoint on `:9180`) show whether agents are reading.
Seed an existing corpus with `scripts/ingest_markdown.py`, pull token-budgeted context with `scripts/recall.py`, gate retrieval quality with `eval/run_eval.py`, and review resolved contradictions with `scripts/contradictions.sh`. See docs/ROADMAP.md for what's shipped vs. still open (a local reranker remains the notable deferral).
commonplace/
├── docker-compose.yml # FalkorDB + 2 MCP instances + gateway, restart: unless-stopped
├── Dockerfile # commonplace-mcp:local - standalone image (digest-pinned) + patch
├── patch_transport_security.py # build-time: allow remote Host headers (disable DNS-rebind guard)
├── gateway/
│   └── Caddyfile # per-tier bearer auth + access logging + Prometheus metrics
├── config/
│   ├── personal.yaml # instance A - Anthropic Haiku extraction + personal ontology
│   └── client.yaml # instance B - local Ollama extraction + confidential ontology
├── scripts/
│   ├── commonplace # operate CLI: `commonplace update` redeploys the stack
│   ├── graph_stats.sh # write counts per tier · mcp_activity.sh # read counts (gateway log)
│   ├── recall.py # token-budgeted recall · contradictions.sh # superseded facts
│   ├── backup.sh / restore.sh # FalkorDB dump + restore
│   └── ingest_markdown.py # load a markdown corpus (notes/docs) into a tier
├── eval/
│   ├── queries.yaml # retrieval eval cases (question → expected facts)
│   └── run_eval.py # scores recall against a tier
├── docs/
│   ├── memory-protocol.md # how agents should read/write memory (tier safety, cite-back)
│   └── ROADMAP.md # hardening & maturity plan
├── .env.example # template; copy to .env on the host (gitignored)
├── .dockerignore # keeps .env and other secrets out of the build context
├── CLAUDE.md # guidance for Claude Code working in this repo
├── LICENSE # MIT
└── README.md
Secrets live only in `.env` on the host and are never committed. The repo is the source of truth: edit a clone, push to your fork, `git pull` on the host, `docker compose up -d`.
MIT.