Out of Stealth (Kinda)

Egoist Machines, Inc. launched LodeDB, an open-source embedded vector database for local retrieval-augmented generation (RAG) that runs the same on-disk index on a GPU when one is available, reaching up to 50,000 queries per second on an L40S. The in-process, on-disk database requires no server or API key and sends no raw data as telemetry, and it supports incremental persistence and compact 2/4-bit vector storage.

A fast, exact embedded vector database for local RAG: in-process, on-disk, no server. Built by Egoist Machines, Inc. - efficient full-stack infrastructure for reliable AI systems.

Most embedded vector databases stop at the CPU. LodeDB runs the same on-disk index on the GPU when you have one: batched search hits 24k queries/sec on an A10 and 50k queries/sec on an L40S, 2.8× to 4.8× the all-CPU ceiling, with recall unchanged. It also persists changed rows incrementally, so a commit stays sub-millisecond even at 1M vectors.

Fast on a laptop. Faster on a GPU. Exact every time. Never phones home.

- GPU-resident batch search: an fp16 copy of the index lives on the GPU, scored with a tiled GEMM plus a streaming top-k (`[gpu]` extra, Linux/CUDA). [How it works](#gpu-resident-index).
- O(changed) persistence: commits only the rows that changed, 173× to 1,308× faster than a full rewrite. [How it works](#delta-persistence).
- Compact storage: the MIT [TurboVec](#turbovec) core packs vectors into 2/4-bit codes and scans them with SIMD CPU kernels.
- In-process, on-disk (`.tvim` / `.tvd` / `.jsd`): no daemon, no account, no API key.
- Private by default: text, ids, and vectors stay local; telemetry is metrics-only (counts, bytes, latency), never raw payloads.
- Local embeddings: sentence-transformers on CUDA, MPS, or CPU.
- Batteries included: a `lodedb` CLI, a loopback dev server, an MCP server, and a LangChain `VectorStore` adapter.

🏢 Enterprise

The LodeDB core is Apache-2.0 and free to use. Enterprise licensing is available for commercial support, managed and at-scale serving, and on-prem / BYOC deployment.
Contact sales@egoistmachines.com.

```
pip install lodedb
```

That's it. Prebuilt wheels cover Linux, macOS (Apple Silicon and Intel), and Windows on Python 3.11+, and bundle the TurboVec Rust core, so there's nothing to compile. Confirm the install with `lodedb doctor`.

Optional extras:

```
pip install "lodedb[gpu]"            # GPU-resident scan (Linux/CUDA)
pip install "lodedb[mcp,langchain]"  # MCP server + LangChain adapter
```

Build from source (for contributors, or a platform without a wheel): this needs a Rust toolchain and a CBLAS provider (Accelerate on macOS, `libopenblas-dev` on Linux). [uv](https://docs.astral.sh/uv/) builds and bundles the core for you:

```
git clone https://github.com/Egoist-Machines/LodeDB && cd LodeDB
uv sync                                # builds + bundles the TurboVec core via maturin
uv sync --extra mcp --extra langchain  # + MCP server, LangChain adapter
uv sync --extra gpu                    # + GPU-resident scan (Linux/CUDA)
```

Run with `uv run` (e.g. `uv run lodedb doctor`).

```python
from lodedb import LodeDB

db = LodeDB(path="./data", model="minilm")  # "minilm" (fast) | "bge" (quality)

fox = db.add("the quick brown fox jumps", metadata={"topic": "animals"})
db.add("a lazy dog sleeps all day", metadata={"topic": "animals"})

for score, doc_id, meta in db.search("fox", k=5):
    print(score, doc_id, meta)

for hits in db.search_many(["fox", "dog"], k=5):  # batched; the GPU can serve this
    print([(h.score, h.id, h.metadata) for h in hits])

db.get(fox)   # -> "the quick brown fox jumps" (text retained by default)
db.persist()  # durable .tvim/.tvd/.jsd snapshot; replays on reopen
```

Reopen with `LodeDB(path="./data")`; no migration step. Original text is kept in a `.tvtext` sidecar for `db.get`; pass `store_text=False` to keep none. Presets are `minilm` (384-dim) and `bge` (768-dim), with weights pulled from Hugging Face on first use. More in [examples/](/Egoist-Machines/LodeDB/blob/main/examples).

With the `gpu` extra on a CUDA host, LodeDB reconstructs the compact index into an fp16 matrix resident on the GPU and scores batched `search_many` with a tiled GEMM plus a streaming top-k.
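To make "tiled scoring plus a streaming top-k" concrete, here is a minimal CPU-only sketch in plain Python. It is an illustration of the general technique, not LodeDB's CUDA kernel: there is no fp16 and no GPU, and the names `dot`, `batched_topk`, and `tile_rows` are hypothetical. The idea it shows is that the index is scanned one tile of rows at a time, every query in the batch is scored against that tile, and each query keeps only its best `k` results in a small heap, so the full score matrix never has to be held at once.

```python
import heapq
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def batched_topk(
    queries: List[Sequence[float]],
    index: List[Sequence[float]],
    k: int,
    tile_rows: int = 2,
) -> List[List[Tuple[float, int]]]:
    """Exact top-k per query, scanning the index one tile of rows at a time."""
    # One min-heap per query holds the best k (score, row_id) pairs seen so far.
    heaps: List[List[Tuple[float, int]]] = [[] for _ in queries]
    for start in range(0, len(index), tile_rows):
        tile = index[start:start + tile_rows]
        # Score the whole query batch against this tile (the matrix-multiply step).
        for qi, q in enumerate(queries):
            for offset, row in enumerate(tile):
                item = (dot(q, row), start + offset)
                if len(heaps[qi]) < k:
                    heapq.heappush(heaps[qi], item)
                elif item > heaps[qi][0]:
                    # Better than the current worst kept result: swap it in.
                    heapq.heapreplace(heaps[qi], item)
    # Return each query's results ordered from best to worst score.
    return [sorted(h, reverse=True) for h in heaps]


index = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
queries = [[1.0, 0.0], [0.0, 1.0]]
results = batched_topk(queries, index, k=2)
for query, hits in zip(queries, results):
    print(query, hits)
```

Because every row is scored, the result is exact and does not depend on the tile size; tiling only bounds how much of the score matrix is live at any moment.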
The GPU path is opt-in and lazy: single queries, non-CUDA hosts, and GPU-memory rejection fall back to the CPU scan, which stays the source of truth.

GPU throughput climbs with batch size while the CPU scan is flat. Same 4-bit index (d=1536, 100K vectors), same host, only the scoring step differs. Crossover is around batch 50:

| query batch | A10 GPU (q/s) | L40S GPU (q/s) |
|---|---|---|
| 1 | 261 | 432 |
| 16 | 3,531 | 5,562 |
| 64 | 11,463 | 18,175 |
| 256 | 19,998 | 39,449 |
| 1024 | 24,037 | 50,326 |

Vanilla TurboVec on the CPU (all threads) on the same boxes: 8,497 q/s (A10 host), 10,420 q/s (L40S host). At batch 1024 the GPU is 2.8× / 4.8× that, and it scales with GPU class.

Recall is unchanged: the GPU scores the exact 4-bit reconstruction, so R@1 tracks the CPU scan across datasets and bit-widths, and edges ahead on GloVe-200 where quantization error is largest.

Other in-process vector databases stay CPU-bound. Alibaba's [zvec](https://github.com/alibaba/zvec) reports about 8.4k q/s (VectorDBBench, 16-vCPU CPU, Cohere 768-dim): the same class as the TurboVec CPU scan, and a different regime from ours, so read it as the CPU-class baseline. The GPU-resident path is what clears it.

Scope. GPU search is Linux/CUDA-only and opt-in (`[gpu]`). macOS scans on the CPU (the MPS scan is experimental). See [docs/benchmarks.md](/Egoist-Machines/LodeDB/blob/main/docs/benchmarks.md) and [docs/architecture.md](/Egoist-Machines/LodeDB/blob/main/docs/architecture.md).

Most embedded indexes rewrite the whole file on every change, which is O(N). LodeDB writes only the rows that changed, which is O(changed), so a 1,000-row commit stays sub-millisecond at every corpus size tested:

| corpus | full rewrite | delta export | speedup |
|---|---|---|---|
| 100K | 42.4 ms | 0.25 ms | 173× |
| 500K | 190.4 ms | 0.24 ms | 782× |
| 1M | 404.9 ms | 0.31 ms | 1,308× |

The GPU path makes reads fast; the delta makes writes cheap. The on-disk format stays a plain snapshot that replays on reopen. All artifacts are metrics-only (counts, bytes, latency), never payloads.
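The O(changed) idea can be sketched with a toy append-only log. This is a hedged illustration only: it does not reproduce LodeDB's `.tvim` / `.tvd` / `.jsd` format, and the `DeltaStore` class, its methods, and the JSON-lines file are all hypothetical. What it shows is the mechanism: track which rows are dirty, write just those on commit, and rebuild state by replaying the log on reopen.

```python
import json
import os
import tempfile
from typing import Dict, List, Set


class DeltaStore:
    """Toy row store: persist() appends only rows changed since the last commit."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.rows: Dict[str, List[float]] = {}
        self._dirty: Set[str] = set()
        # Replay the log on reopen; later records override earlier ones.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    self.rows[record["id"]] = record["vec"]

    def upsert(self, row_id: str, vec: List[float]) -> None:
        """Insert or overwrite a row in memory and mark it as changed."""
        self.rows[row_id] = vec
        self._dirty.add(row_id)

    def persist(self) -> int:
        """Append the dirty rows to the log and return how many were written."""
        written = len(self._dirty)
        with open(self.path, "a", encoding="utf-8") as f:
            for row_id in sorted(self._dirty):
                f.write(json.dumps({"id": row_id, "vec": self.rows[row_id]}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the commit durable before returning
        self._dirty.clear()
        return written


log_path = os.path.join(tempfile.mkdtemp(), "rows.jsonl")
store = DeltaStore(log_path)
for i in range(1000):
    store.upsert(f"doc-{i}", [float(i), 0.0])
first_commit = store.persist()   # every row is new, so all of them are written
store.upsert("doc-7", [7.0, 1.0])
second_commit = store.persist()  # only the one changed row is written
reopened = DeltaStore(log_path)  # replays the log; the latest doc-7 record wins
print(first_commit, second_commit, reopened.rows["doc-7"])
```

The cost of the second commit depends on the number of changed rows, not on how many rows the store already holds, which is the property the table above measures. A real format would also need compaction so the log does not grow without bound; that is left out here.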
Full methodology and the complete figure set are in [docs/benchmarks.md](/Egoist-Machines/LodeDB/blob/main/docs/benchmarks.md); each [benchmarks/](/Egoist-Machines/LodeDB/blob/main/benchmarks) folder has a README and a one-line reproduction command.

Local is the common case. On an Apple M1 (MiniLM, 20K docs) the CPU scan is ~0.25 ms p50, and end-to-end single-query latency is 5.7 ms p50.

```
lodedb doctor      # capability report: embedding / GPU / TurboVec backend
lodedb index ...   # build / add to an on-disk index
lodedb query ...   # search
lodedb serve       # loopback dev server (127.0.0.1, no auth)
lodedb mcp         # stdio MCP server for agent memory
lodedb benchmark   # local, metrics-only benchmark
```

- Exact scan, no ANN. Built for small-to-mid corpora where exact recall matters, not billion-scale.
- GPU is Linux/CUDA-only and opt-in (`[gpu]`). macOS scans on the CPU; the MPS scan is experimental and was slower than NEON on the hardware tested.
- Single queries run on the CPU; the GPU serves batched `search_many`.
- Model weights download from Hugging Face on first use, then cache locally.

The compact core is the upstream MIT [TurboVec](https://github.com/RyanCodrai/turbovec) project (© Ryan Codrai), vendored under [third_party/turbovec/](/Egoist-Machines/LodeDB/blob/main/third_party/turbovec) with its license preserved. LodeDB's lifecycle patches (encoded-row export/import, upsert with ids, calibration) are Apache-2.0. See [NOTICE](/Egoist-Machines/LodeDB/blob/main/NOTICE).

Apache-2.0 ([LICENSE](/Egoist-Machines/LodeDB/blob/main/LICENSE)). The bundled TurboVec core is MIT ([third_party/turbovec/LICENSE](/Egoist-Machines/LodeDB/blob/main/third_party/turbovec/LICENSE), [NOTICE](/Egoist-Machines/LodeDB/blob/main/NOTICE)). "LodeDB" and "[Egoist Machines](https://egoistmachines.com)" are trademarks; Apache-2.0 grants no trademark rights (§6).

Enterprise licensing and commercial support are available from [Egoist Machines, Inc.](https://egoistmachines.com): contact [sales@egoistmachines.com](mailto:sales@egoistmachines.com).
PRs welcome; see [CONTRIBUTING.md](/Egoist-Machines/LodeDB/blob/main/CONTRIBUTING.md). Report security issues privately per [SECURITY.md](/Egoist-Machines/LodeDB/blob/main/SECURITY.md), not in public issues. Other bugs and requests go to the [issue tracker](https://github.com/Egoist-Machines/LodeDB/issues).