Help with a Local Document RAG System (Storage + Ingestion + Query + Highlighting)


Source: discuss.huggingface.co · Published: 2026-06-20T08:44Z · Topic: large-language-models

A developer is seeking advice on building a local, offline document retrieval and LLM pipeline for RAG systems, focusing on storage, ingestion, querying, and highlighting. The system aims to support PDF, DOCX, XLSX, CSV, and image formats with local LLM answer generation and citation tracking. Key questions involve vector DB vs pgvector, offline GraphRAG feasibility, and implementing document highlighting with citation preview.

Hey folks,

I’m working on designing a local, offline document retrieval + LLM pipeline and would love your input on the architecture. Here’s what I’m aiming for:

STORAGE

Upload PDF, DOCX, XLSX, CSV, tables
All data stored locally (no cloud)

DOCUMENT INGESTION

Watch folder (e.g., Watchdog) → auto-ingest on file add/modify/delete
Nested folder structure → auto-tagging
Supported formats: PDF, scanned PDF, DOCX, XLSX, CSV, JPG/PNG
Version control on re-upload

QUERY & RETRIEVAL

Restrict queries to a single client’s documents (no cross-client leakage)
Structured queries (e.g., “Show invoices > ₹1 lakh”)
Comparative queries (e.g., “Compare FY23 vs FY24 gross profit”)
Keyword fallback
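
To make the client restriction and the structured invoice query concrete, here is a minimal sketch using SQLite from the Python standard library. It assumes invoice fields were already extracted into rows during ingestion. The table `invoices`, its columns, the sample rows, and the helper `invoices_over` are all hypothetical, and ₹1 lakh is taken as 100,000.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table of extracted invoice rows (made-up sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE invoices (
               client_id  TEXT NOT NULL,
               doc_id     TEXT NOT NULL,
               page       INTEGER NOT NULL,
               amount_inr REAL NOT NULL
           )"""
    )
    conn.executemany(
        "INSERT INTO invoices VALUES (?, ?, ?, ?)",
        [
            ("client_a", "inv-001.pdf", 1, 250000.0),
            ("client_a", "inv-002.pdf", 1, 40000.0),
            ("client_b", "inv-900.pdf", 2, 900000.0),
        ],
    )
    return conn


def invoices_over(conn, client_id, min_amount_inr):
    """Return (doc_id, page, amount) rows for one client only, largest first."""
    rows = conn.execute(
        "SELECT doc_id, page, amount_inr FROM invoices "
        "WHERE client_id = ? AND amount_inr > ? "
        "ORDER BY amount_inr DESC",
        (client_id, min_amount_inr),  # bound parameters, never string-formatted
    )
    return rows.fetchall()


conn = build_demo_db()
result = invoices_over(conn, "client_a", 100_000)
```

Putting the `client_id` condition inside the query itself, rather than filtering afterwards, is one way to keep another client's rows from ever reaching the LLM prompt. The returned `doc_id` and `page` can double as the citation target.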

HIGHLIGHTING & RENDERING

Annotated PDF served to frontend
XLSX → colored cell export
Jump directly to highlighted page
Multi-document highlights in one response

ANSWER GENERATION

Local LLM only
Every claim cited with doc + page reference
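
A possible way to enforce the "every claim cited" rule is to post-check the model's answer before showing it. The sketch below assumes the LLM is prompted to emit citations in a made-up `[file p.N]` format and that sentences end with `.`, `!` or `?`. Both the format and the helper `check_citations` are assumptions for illustration, and the sentence splitting is deliberately naive.

```python
import re

# Hypothetical citation format: [fy24.pdf p.4]
CITATION = re.compile(r"\[([^\[\]]+?)\s+p\.(\d+)\]")


def check_citations(answer, retrieved):
    """Return sentences that have no citation or cite a (doc, page) not retrieved."""
    problems = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        if not sentence:
            continue
        cites = [(doc, int(page)) for doc, page in CITATION.findall(sentence)]
        if not cites or any(c not in retrieved for c in cites):
            problems.append(sentence)
    return problems


# (doc, page) pairs that retrieval actually handed to the model (made-up values)
retrieved = {("fy24.pdf", 4), ("fy23.pdf", 7)}
answer = (
    "Gross profit was higher in FY24 [fy24.pdf p.4]. "
    "FY23 margins were lower [fy23.pdf p.9]. "
    "Costs were flat."
)
problems = check_citations(answer, retrieved)
```

Sentences returned in `problems` could be dropped, flagged in the UI, or sent back to the model for a retry.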

MY QUESTIONS

Parsing: I’m considering LlamaIndex LiteParse.

→ Should I store document IDs + chunk IDs for PDFs to enable highlighting?
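
For illustration, this is roughly what a stored chunk record could look like if document IDs and chunk IDs are kept together with a location. It is a sketch under assumptions: the `Chunk` fields, the ID scheme, and the fixed-size splitter are hypothetical and independent of any parser. Character offsets stand in for the page coordinates a PDF viewer would ultimately need, which depend on what the parser exposes.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str    # stable ID derived from file content
    chunk_id: str  # unique within the corpus
    page: int      # 1-based page number, used for "jump to page"
    start: int     # character offset of the chunk within the page text
    end: int       # exclusive end offset
    text: str


def doc_id_for(content: bytes) -> str:
    """Derive a document ID from file bytes, so a changed file gets a new ID."""
    return hashlib.sha256(content).hexdigest()[:12]


def chunk_page(doc_id, page, page_text, size=200):
    """Split one page into fixed-size chunks that remember where they came from."""
    chunks = []
    for i, start in enumerate(range(0, len(page_text), size)):
        end = min(start + size, len(page_text))
        chunk_id = f"{doc_id}:p{page}:c{i}"
        chunks.append(Chunk(doc_id, chunk_id, page, start, end, page_text[start:end]))
    return chunks


page_text = "Sample sentence for chunking. " * 15  # placeholder page text
doc_id = doc_id_for(b"placeholder file bytes")
chunks = chunk_page(doc_id, page=3, page_text=page_text)
```

Because each chunk carries `page`, `start`, and `end`, a retrieved chunk can be mapped back to a region of its source page without re-parsing the file.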

Vector DB:

Do I need one (e.g., Qdrant)?
If yes, how do I store doc IDs + chunk IDs alongside embeddings for highlighting?
Would pgvector in Postgres be sufficient?
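
On storing IDs next to embeddings: Qdrant attaches a payload to each point, and pgvector keeps the vector as one column of an ordinary Postgres table, so in both cases the metadata can live beside the embedding. The store-agnostic sketch below only illustrates that shape with the standard library. The `Record` fields, the toy 3-dimensional vectors, and the brute-force `search` are assumptions, not either product's API.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    embedding: list   # the vector
    client_id: str    # metadata stored beside the vector
    doc_id: str
    chunk_id: str
    page: int


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query, client_id, k=2):
    """Filter by client first, then rank the remaining records by similarity."""
    scoped = [r for r in records if r.client_id == client_id]
    scoped.sort(key=lambda r: cosine(r.embedding, query), reverse=True)
    return scoped[:k]


records = [
    Record([1.0, 0.0, 0.0], "client_a", "fy24.pdf", "fy24.pdf:p4:c0", 4),
    Record([0.0, 1.0, 0.0], "client_a", "fy23.pdf", "fy23.pdf:p7:c1", 7),
    Record([1.0, 0.0, 0.0], "client_b", "other.pdf", "other.pdf:p1:c0", 1),
]
hits = search(records, [0.9, 0.1, 0.0], "client_a", k=1)
```

Each hit comes back with its `doc_id`, `chunk_id`, and `page`, which is the information the highlighting step needs. A real vector store would replace the linear scan with an index.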

GraphRAG:

How effective are systems like Neo4j or Microsoft GraphRAG?
Can they run locally/offline, or are they too computationally heavy?
Is this GraphRAG pipeline from LlamaIndex a good starting point?

Highlighting UX:

I want something like Turnitin/iThenticate reports → exact sentence highlighted + citation.
Any open-source projects that already do this?
I found Kotaemon and AnythingLLM, which are close but don’t highlight documents.
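
For the "exact sentence highlighted" part, one building block is locating a quoted sentence inside the extracted page text even when line breaks differ. This is a minimal sketch with a hypothetical helper `find_quote`. It assumes the LLM quotes the source verbatim apart from whitespace, and it stops at character offsets: turning those into rectangles on a rendered PDF page depends on the parser and viewer and is not shown.

```python
import re


def find_quote(page_text, quote):
    """Return (start, end) offsets of quote in page_text, or None if absent.

    Runs of whitespace in the quote match any run of whitespace in the page,
    so a sentence wrapped across lines in the PDF text is still found.
    """
    words = quote.split()
    if not words:
        return None
    pattern = r"\s+".join(re.escape(w) for w in words)
    match = re.search(pattern, page_text)
    return (match.start(), match.end()) if match else None


# Made-up page text with a line break inside the target sentence
page_text = "Revenue grew.\nGross profit rose\nsharply in FY24. Costs fell."
span = find_quote(page_text, "Gross profit rose sharply in FY24.")
highlighted = page_text[span[0]:span[1]] if span else None
```

If `find_quote` returns `None`, falling back to highlighting the whole retrieved chunk is one option.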

TL;DR

Trying to build a local RAG system with:

Storage + ingestion + tagging
Query + retrieval + highlighting
Local LLM answer generation with citations

Looking for advice on:

Vector DB vs pgvector
GraphRAG feasibility offline
Best way to implement document highlighting + citation preview

Would love to hear from anyone who’s built something similar or explored these tools.
