Local RAG CLI to chat with any folder of documents using Ollama.
cd ~/ragit
python3 -m pip install -e .
If your default Python is 3.14 or newer, install with Python 3.10–3.13 instead (recommended: 3.12), because some vector DB dependencies may not yet publish wheels for very new Python versions.
Make sure Ollama is installed and running:
ollama pull nomic-embed-text
ollama serve
Index a folder:
ragit index ./docs
Start chat:
ragit chat ./docs
List available Ollama models:
ragit models
Clear an index:
ragit clear ./docs
ragit implements Retrieval-Augmented Generation (RAG):
- It loads supported documents (.txt, .md, .pdf, .docx) recursively.
- It splits text into overlapping chunks (about 500 words with 50-word overlap).
- It creates embeddings using Ollama (nomic-embed-text) and stores vectors in local ChromaDB at ~/.ragit/<hash_of_path>/.
- During chat, it embeds each query, retrieves the top relevant chunks, and injects them into a prompt.
- It streams an answer from a local Ollama chat model (prefers llama3.2 if available), then shows the source chunks used.
- All data stays local on your machine (Ollama + Chroma local persistence).
- Indexes are stored under ~/.ragit/<hash_of_path>/.
- Files that cannot be parsed are skipped with a clear error message.
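The chunking step described above (about 500 words with a 50-word overlap) can be illustrated with a minimal sketch. This is not ragit's actual code: the function name `chunk_words` and its parameters are hypothetical, and the real implementation may split text differently.

```python
def chunk_words(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words.

    Hypothetical helper for illustration only, not part of ragit's API.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of already-seen words is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(1000))
    for chunk in chunk_words(sample):
        tokens = chunk.split()
        print(tokens[0], "...", tokens[-1], f"({len(tokens)} words)")
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.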
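The retrieval step (embed the query, pick the most relevant chunks, inject them into a prompt) can be sketched in plain Python as well. In ragit the similarity search is handled by ChromaDB and the vectors come from Ollama; the version below is only a stand-in that assumes the vectors already exist, uses cosine similarity as the ranking metric, and uses hypothetical names (`top_k`, `build_prompt`) and a made-up prompt wording.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float],
          indexed: list[tuple[str, list[float]]],
          k: int = 3) -> list[str]:
    """Return the texts of the k chunks whose vectors are most similar to the query.

    `indexed` is a list of (chunk_text, embedding) pairs (an assumed layout).
    """
    ranked = sorted(indexed,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Inject the retrieved chunks into a prompt (the wording is illustrative)."""
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real embedding vectors are much longer.
    index = [("alpha", [1.0, 0.0]), ("beta", [0.0, 1.0]), ("gamma", [1.0, 1.0])]
    best = top_k([1.0, 0.1], index, k=2)
    print(build_prompt("What is alpha?", best))
```

Numbering the chunks in the prompt makes it easy to show afterwards which source chunks the answer drew on.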