# Chat With Your Documents Using Garudust Agent — No Vector Database Required

> Source: <https://dev.to/garudust/chat-with-your-documents-using-garudust-agent-no-vector-database-required-1m61>
> Published: 2026-05-21 06:47:49+00:00

Most RAG tutorials start the same way: "First, install a vector database…" Then come the embedding models, the chunking strategies, the similarity thresholds. By the time you can ask a question about a PDF, you've deployed three services and written 200 lines of boilerplate.
Garudust Agent takes a different path. RAG is built in — backed by SQLite FTS5 with a trigram tokenizer. No vector database. No embedding API calls. Drop a PDF (or TXT, CSV, Markdown, JSON) into the conversation and start asking questions in seconds.
When you ingest a document, Garudust splits it into chunks and stores them in an FTS5 index created with `tokenize = 'trigram'`.
When you ask a question, `doc_search` runs a full-text query against the index and feeds the top matching chunks to the LLM as context. That's the whole pipeline: one SQLite file at `~/.garudust/state.db`.
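The pipeline described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not Garudust's actual code: the table name `chunks`, the helper names, and the chunk size and overlap are all assumptions, and it requires an SQLite build with FTS5 and the trigram tokenizer (SQLite 3.34 or newer).

```python
import sqlite3


def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def ingest(conn: sqlite3.Connection, path: str, text: str) -> int:
    """Index every chunk of `text` in an FTS5 table that uses the trigram tokenizer."""
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS chunks "
        "USING fts5(path UNINDEXED, body, tokenize = 'trigram')"
    )
    chunks = chunk_text(text)
    conn.executemany(
        "INSERT INTO chunks (path, body) VALUES (?, ?)",
        [(path, chunk) for chunk in chunks],
    )
    conn.commit()
    return len(chunks)


def search(conn: sqlite3.Connection, keywords: str, limit: int = 3) -> list[str]:
    """Return the best-ranked chunks that contain every keyword."""
    # Words shorter than three characters cannot match a trigram index, so skip them.
    terms = [word for word in keywords.split() if len(word) >= 3]
    if not terms:
        return []
    # Quote each term so punctuation is not parsed as FTS5 query syntax.
    match = " ".join('"' + term.replace('"', '""') + '"' for term in terms)
    rows = conn.execute(
        "SELECT body FROM chunks WHERE chunks MATCH ? ORDER BY rank LIMIT ?",
        (match, limit),
    )
    return [row[0] for row in rows]
```

In this sketch the caller passes keywords rather than a full question; in Garudust the agent decides what to send to `doc_search`, and the returned chunks would then be placed in the LLM prompt as context.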
The trigram tokenizer means it works on any language, including Thai, Chinese, and Japanese, without any tokenizer configuration.
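To see why no language-specific configuration is needed, it helps to look at what a trigram is: every run of three consecutive characters, regardless of word boundaries. The hypothetical helper below mimics that idea in plain Python (it is an illustration, not the SQLite implementation), lowercasing the text to mirror the tokenizer's default case-insensitive matching.

```python
def trigrams(text: str) -> list[str]:
    """Return every run of three consecutive characters, lowercased."""
    lowered = text.lower()
    return [lowered[i:i + 3] for i in range(len(lowered) - 2)]


# Text without spaces between words still yields usable index terms.
print(trigrams("東京都庁舎"))  # ['東京都', '京都庁', '都庁舎']
```

One consequence of this design is that a full-text query shorter than three characters produces no trigrams and therefore cannot match anything.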
RAG is enabled by default. The only thing you need to configure is which directories the agent is allowed to read from:

    # ~/.garudust/config.yaml
    security:
      allowed_read_paths:
        - /home/you/documents
        - /data/company-docs

That's it. If you want to turn RAG off entirely:

    disabled_toolsets: [rag]

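Conceptually, `allowed_read_paths` acts as an allow-list of directories. The sketch below shows one way such a check could be written; the function name `is_read_allowed` is hypothetical and the article does not show how Garudust performs the check internally. It resolves the candidate path first so that `..` segments and symlinks cannot escape an allowed directory (requires Python 3.9+ for `Path.is_relative_to`).

```python
from pathlib import Path


def is_read_allowed(candidate: str, allowed_read_paths: list[str]) -> bool:
    """Return True when `candidate` resolves to a location inside an allowed directory."""
    resolved = Path(candidate).expanduser().resolve()
    for root in allowed_read_paths:
        if resolved.is_relative_to(Path(root).expanduser().resolve()):
            return True
    return False
```

With the configuration above, `/home/you/documents/employee-handbook.pdf` would pass this check, while `/home/you/documents/../.ssh/id_rsa` would not.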
Start the CLI:

    garudust

Then tell the agent to ingest a file:
You: ingest /home/you/documents/employee-handbook.pdf
Agent: Indexed employee-handbook.pdf — 47 chunks ready for search.
Preview: "This handbook outlines the policies and procedures for all employees…"
Now ask anything:
You: What is the remote work policy?
Agent: According to the employee handbook, remote work is permitted up to 3 days per week
for roles that do not require on-site presence. Employees must notify their manager
at least 24 hours in advance and maintain availability during core hours (10am–4pm).
You never call the RAG tools (such as `doc_search`) directly; the agent decides when to use them based on your question. But knowing they exist helps you understand what's happening.
If a document changes, just ingest it again. The old index for that path is replaced automatically.
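Replace-on-reingest is straightforward to model on top of an FTS5 table: delete the rows for that path, then insert the new chunks, all in one transaction. The sketch below assumes a table named `chunks` with a `path` column; both names, and the `reingest` helper itself, are illustrative rather than Garudust's actual schema.

```python
import sqlite3


def reingest(conn: sqlite3.Connection, path: str, chunks: list[str]) -> int:
    """Replace all indexed chunks for `path` with the new ones."""
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS chunks "
        "USING fts5(path UNINDEXED, body, tokenize = 'trigram')"
    )
    # `with conn` commits on success and rolls back on error, so a failed
    # ingest never leaves the path half-deleted.
    with conn:
        conn.execute("DELETE FROM chunks WHERE path = ?", (path,))
        conn.executemany(
            "INSERT INTO chunks (path, body) VALUES (?, ?)",
            [(path, chunk) for chunk in chunks],
        )
    return len(chunks)
```

Chunks that belong to other paths are left untouched, so re-ingesting one file does not disturb the rest of the index.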
To remove a single document from the index, just ask:
You: Remove the Q1 report from the index.
Agent: Document removed from index.
Or clear everything:
You: Clear all indexed documents.
Agent: Removed 5 document(s) from index.
`doc_search` supports full FTS5 syntax, such as quoted phrases and the boolean operators `AND`, `OR`, and `NOT`, which the agent uses automatically when your question benefits from it.
You don't need to write FTS5 queries yourself; the agent figures this out. But if you want to guide it:
You: Search for "termination clause" in the contract documents.
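For readers curious what FTS5 syntax looks like underneath, here is a small sketch that runs a few query forms directly against SQLite. The table name `docs` and the three sample sentences are made up for illustration; it assumes an SQLite build with FTS5 and the trigram tokenizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(body, tokenize = 'trigram')")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [
        ("Either party may invoke the termination clause with 30 days notice.",),
        ("The limitation of liability is capped at fees paid.",),
        ("Early termination does not waive liability for prior breaches.",),
    ],
)


def matching_rowids(fts_query: str) -> list[int]:
    """Run an FTS5 query and return the matching row ids in insertion order."""
    rows = conn.execute(
        "SELECT rowid FROM docs WHERE docs MATCH ? ORDER BY rowid", (fts_query,)
    )
    return [row[0] for row in rows]


print(matching_rowids('"termination clause"'))       # exact phrase
print(matching_rowids("termination AND liability"))  # both terms
print(matching_rowids("termination OR liability"))   # either term
print(matching_rowids("termination NOT liability"))  # first term without the second
```

Quoting a phrase, as in the prompt above, is the simplest way to steer the search toward an exact match.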
Ingest your onboarding docs, SOPs, and internal wikis. New team members can ask questions in plain language instead of searching through Confluence.
You: ingest /docs/sop-release-process.md
You: What approvals are needed before a hotfix can go to production?
Ingest a contract and ask about specific clauses:
You: ingest /legal/vendor-agreement-2025.pdf
You: Does this contract include a limitation of liability clause? What is the cap?
Ingest a log file and ask questions without writing grep patterns:
You: ingest /var/log/app/error.log
You: Which service caused the most errors in the last hour?
You: Are there any database connection timeouts?
Ingest API documentation and ask about individual endpoints:
You: ingest /project/docs/api-reference.md
You: What parameters does the /auth/refresh endpoint accept?
If you're running `garudust-server` with a messaging platform, users can send files directly to the bot. Attachments are automatically saved to a temporary path and can be ingested on request:
User sends: quarterly-report.pdf (attached)
Agent: I received your file. Would you like me to index it for search?
User: yes
Agent: Indexed quarterly-report.pdf — 83 chunks ready.
Preview: "Q1 2025 Financial Summary — Total Revenue: $4.2M…"
User: What was the gross margin for Q1?
Agent: According to the report, gross margin for Q1 2025 was 61.3%,
up from 58.9% in Q4 2024.
Platform attachments (files from Telegram, LINE, Discord, etc.) are always allowed regardless of `allowed_read_paths`, since they're written to `/tmp/garudust_*` by the platform adapter.
You can ingest multiple files and search across all of them in the same session:
You: ingest /docs/policy-2024.pdf
You: ingest /docs/policy-2025.pdf
You: What changed in the travel expense policy between 2024 and 2025?
The agent searches both documents and synthesizes the differences.
Check what's indexed at any time:
You: What documents have you indexed?
Agent: 2 documents indexed:
- policy-2024.pdf | 34 chunks | ingested 2025-05-21 09:14
- policy-2025.pdf | 38 chunks | ingested 2025-05-21 09:15
The index is stored in `state.db`, but searches are scoped to the current conversation key. Starting a new session means re-ingesting if you want to query the same files.
Garudust's RAG won't replace a purpose-built vector search pipeline for large-scale production retrieval. But for a developer who wants to ask questions about their documents right now, without running a second service, it's the fastest path from PDF to answer.
