
SproutRAG: Attention-Guided Tree Search with Progressive Embeddings for Long-Document RAG

Researchers introduced SproutRAG, a hierarchical retrieval-augmented generation framework that uses learned inter-sentence attention to organize sentence-level chunks into a binary tree, enabling multi-granularity retrieval without additional LLM calls. Across four benchmarks in scientific, legal, and open-domain settings, the method improves information efficiency (IE) by 6.1% on average over the strongest baseline.

Published Jun 18, 2026 · 1 min read

arXiv:2606.18381v1

Abstract: Retrieval-augmented generation (RAG) systems must balance retrieval granularity with contextual coherence, a challenge that existing methods address through LLM-guided chunking, single-level context expansion, or hierarchical summarization. These approaches variously depend on costly LLM calls during indexing or retrieval, limit context aggregation to a single granularity level, or introduce information loss through summarization. We present SproutRAG, an attention-guided hierarchical RAG framework that addresses this trade-off by organizing sentence-level chunks into progressively larger but semantically coherent units, using learned inter-sentence attention to construct a binary chunking tree. Unlike prior approaches that rely on external LLMs, fixed context expansion, or lossy summarization, SproutRAG learns which attention heads and layers best capture semantic document structure, enabling multi-granularity retrieval without additional LLM calls or compressed summaries. At retrieval time, SproutRAG uses hierarchical beam search to retrieve candidates at multiple granularities, capturing multi-sentence relevance beyond flat retrieval. The framework is trained end-to-end with a joint objective that improves both embeddings and tree structure. Experiments across four benchmarks spanning scientific, legal, and open-domain settings demonstrate that SproutRAG improves information efficiency (IE) by 6.1% on average over the strongest baseline. Code is available on https://github.com/AmirAbaskohi/SproutRAG.
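
To make the two mechanisms in the abstract concrete, here is a minimal Python sketch of a binary chunking tree and a hierarchical beam search over it. It is an illustration under stated assumptions, not the authors' implementation: the `affinity` matrix stands in for the learned inter-sentence attention and is simply given as input, the greedy adjacent-merge rule in `build_tree` is an assumed heuristic, and span embeddings are approximated by mean-pooling sentence vectors. All names (`Node`, `link_strength`, `build_tree`, `span_score`, `beam_search`) are hypothetical, and the head/layer selection and joint training objective are not modeled.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vec = Sequence[float]

@dataclass
class Node:
    start: int                      # first sentence index covered (inclusive)
    end: int                        # last sentence index covered (exclusive)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def link_strength(a: Node, b: Node, affinity: Sequence[Vec]) -> float:
    """Mean affinity between every sentence in span a and every sentence in span b."""
    total = sum(affinity[i][j]
                for i in range(a.start, a.end)
                for j in range(b.start, b.end))
    return total / ((a.end - a.start) * (b.end - b.start))

def build_tree(affinity: Sequence[Vec]) -> Node:
    """Assumed heuristic: repeatedly merge the most strongly linked adjacent spans.

    Expects a square matrix covering at least one sentence.
    """
    nodes = [Node(i, i + 1) for i in range(len(affinity))]
    while len(nodes) > 1:
        k = max(range(len(nodes) - 1),
                key=lambda i: link_strength(nodes[i], nodes[i + 1], affinity))
        merged = Node(nodes[k].start, nodes[k + 1].end, nodes[k], nodes[k + 1])
        nodes[k:k + 2] = [merged]
    return nodes[0]

def cosine(u: Vec, v: Vec) -> float:
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0

def span_score(node: Node, sent_vecs: Sequence[Vec], query: Vec) -> float:
    """Stand-in span embedding: the mean of the sentence vectors the node covers."""
    span = sent_vecs[node.start:node.end]
    mean = [sum(col) / len(span) for col in zip(*span)]
    return cosine(query, mean)

def beam_search(root: Node, sent_vecs: Sequence[Vec], query: Vec,
                beam_width: int = 2, top_k: int = 3) -> List[Tuple[int, int, float]]:
    """Descend level by level, expanding only the best `beam_width` nodes per level.

    Every visited node is a candidate, so results can mix granularities.
    """
    scored = []
    frontier = [root]
    while frontier:
        scored.extend((span_score(n, sent_vecs, query), n) for n in frontier)
        children = [c for n in frontier for c in (n.left, n.right) if c is not None]
        children.sort(key=lambda n: span_score(n, sent_vecs, query), reverse=True)
        frontier = children[:beam_width]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(n.start, n.end, s) for s, n in scored[:top_k]]

# Toy data (invented for illustration): sentences 0-1 and 2-3 attend to each other.
toy_affinity = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
]
toy_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

tree = build_tree(toy_affinity)
for start, end, score in beam_search(tree, toy_vecs, [0.0, 1.0], beam_width=1, top_k=2):
    print(f"sentences [{start}, {end}) score={score:.3f}")
```

On the toy data the tree groups sentences 0-1 and 2-3 before joining them at the root, and the search returns both a single sentence and its two-sentence parent as candidates, which is the multi-granularity behavior the abstract contrasts with flat retrieval.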
