[ARTICLE · art-31128] src=dev.to ↗ pub= topic=large-language-models verified=true sentiment=· neutral

The knowledge-authority layer: what your agents can't get from the outside

Sid Probstein, creator of SWIRL and CEO of SWIRL AI, argues that copying enterprise data into a vector database for retrieval-augmented generation (RAG) introduces security and compliance risks. He advocates a hybrid retrieval approach that combines keyword and vector methods without a vector database, citing a benchmark by the XetHub team. SWIRL 5, launching July 15, provides an MCP server that lets agents retrieve organization-approved answers without copying data.

Published Jun 17, 2026 · 3 min read

Every enterprise AI conversation right now starts in the same place: "connect the model to our data." Then it stalls in the same place: which data, copied where, governed by whom.

I build retrieval for a living (I wrote the original open-source SWIRL), so let me make an argument that runs against the current default - and then show the architecture it implies.

The standard RAG recipe is: crawl your sources, chunk them, embed them, and load the vectors into a database. Now your model can retrieve. It also means you have a second copy of your content living in an index you have to secure, keep in sync, and explain to whoever owns compliance. You've recreated every permission boundary by hand, and you'll eventually get one wrong.

For a lot of teams that copy is simply not allowed. Regulated content, client-confidential material, anything privileged - copying it into a vendor store is exposure you don't get paid to take on.

Here's the part people don't want to hear. The XetHub team benchmarked three retrieval strategies: keyword-only (BM25), vector-only, and hybrid (keyword to pull candidates, then re-rank). Keyword-only came last. Vector-only did better.

Hybrid won - and their conclusion was blunt: "No vector database necessary."

That matches what we see in production. Vector similarity is a great high-precision filter, not a great first pass. Lead with exact matches and quoted terms, then let embeddings and a cross-encoder re-rank what's left.
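To make the "keyword first, re-rank second" order concrete, here is a minimal sketch in plain Python. It is not SWIRL's code: `bm25_scores`, `toy_embed`, and `hybrid_search` are hypothetical names, and the character-trigram "embedding" is only a stand-in for a real embedding model so the example runs without dependencies.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with plain BM25."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (len(docs) - df + 0.5) / (df + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def toy_embed(text):
    """Stand-in for a real embedding model (assumption): trigram counts."""
    s = " ".join(tokenize(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(query, docs, candidates=3):
    """Keyword first pass, then re-rank only the surviving candidates."""
    kw = bm25_scores(query, docs)
    matched = [i for i in range(len(docs)) if kw[i] > 0]
    shortlist = sorted(matched, key=lambda i: kw[i], reverse=True)[:candidates]
    q_vec = toy_embed(query)
    return sorted(shortlist, key=lambda i: cosine(q_vec, toy_embed(docs[i])), reverse=True)
```

The point of the shape is that similarity is only ever computed over the keyword shortlist, so nothing needs to be pre-embedded and stored in an index.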

It's not a slogan; it's a pipeline. In SWIRL, relevance is three passes, and both models run locally:

E5-large-v2, using title-aware chunking and hybrid keyword+vector fusion (RRF). No vector database to build or secure.

MS-MARCO cross-encoder reads the query and document

Feed that to your LLM - whatever model you've chosen, including an on-prem one - and the answer gets better, because the context got better. Same model, sharper input.
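For readers who haven't met RRF (reciprocal rank fusion), the idea fits in a few lines. This is a generic sketch, not SWIRL's implementation: the function name and document ids are made up, and `k=60` is a commonly used default rather than a value stated in this article.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank starts at 1), and the contributions are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical ids: one list from a keyword pass, one from a vector pass.
keyword_ranking = ["policy-v3", "faq", "policy-v2"]
vector_ranking = ["policy-v3", "policy-v2", "memo"]
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

Because RRF works on ranks rather than raw scores, the keyword and vector passes don't need comparable score scales, which is part of why it is a popular fusion step.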

The stack is settling: foundation models orchestrate, MCP is the retrieval interface, the chat UI is a commodity. The piece none of them provide from outside your walls is knowledge authority - which document is official, which clause your org actually uses, which answer carries approval.

So we made it a first-class layer. SWIRL 5 exposes an MCP server. Any agent - Claude, Copilot, ChatGPT, your own - calls SWIRL and gets ranked, permissioned, organization-approved answers. A team pins the canonical result for a query once; every agent gets it after that. And no copy of your data leaves your tenant.
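The pin-once behaviour can be pictured with a small in-memory model. Everything here is an assumption made for illustration (the `AuthorityLayer` class, its methods, the group-based permission check, the document ids); it is not SWIRL's API or its MCP interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Result:
    doc_id: str
    score: float
    allowed_groups: frozenset


@dataclass
class AuthorityLayer:
    """Hypothetical in-memory model of pinned, permissioned answers."""
    pins: dict = field(default_factory=dict)  # normalized query -> doc_id

    @staticmethod
    def normalize(query):
        """Collapse case and whitespace so equivalent queries share a pin."""
        return " ".join(query.lower().split())

    def pin(self, query, doc_id):
        """Record the canonical document for a query."""
        self.pins[self.normalize(query)] = doc_id

    def answer(self, query, ranked, user_groups):
        """Filter by permission first, then float the pinned doc to the top."""
        visible = [r for r in ranked if r.allowed_groups & user_groups]
        pinned_id = self.pins.get(self.normalize(query))
        # Stable sort: the pinned doc moves first, the rest keep their order.
        return sorted(visible, key=lambda r: r.doc_id != pinned_id)


layer = AuthorityLayer()
layer.pin("NDA  template", "nda-2026")
ranked = [
    Result("nda-draft", 0.9, frozenset({"legal", "sales"})),
    Result("nda-2026", 0.7, frozenset({"legal"})),
    Result("nda-old", 0.5, frozenset({"legal"})),
]
legal_view = [r.doc_id for r in layer.answer("nda template", ranked, {"legal"})]
sales_view = [r.doc_id for r in layer.answer("nda template", ranked, {"sales"})]
```

In this sketch the permission filter runs before the pin is applied, so a pinned document is never surfaced to a caller who couldn't otherwise see it.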

Three properties fall out of it, and they're the whole reason to build it this way:

If you're wiring agents into enterprise data and the "just copy everything into a vector store" step is making your security team twitch, there's another shape available. SWIRL 5 goes GA July 15; the preview is open if you want to point it at your own stack. Either way - I'd genuinely like to hear how you're handling the authority problem, because I don't think the industry has it figured out yet.

Sid Probstein is the creator of SWIRL and CEO of SWIRL AI.
