Why Twio Chose Vertex AI Search over pgvector for Production RAG


Source: dev.to · Published: 2026-06-22 · Topic: artificial-intelligence


Twio, an AI SaaS for loan brokers, migrated its production RAG system from pgvector to Vertex AI Search after scaling challenges revealed that vector storage alone was insufficient for handling messy broker documents, emails, and attachments. The company found that Vertex AI Search reduced engineering overhead by managing OCR, chunking, indexing, and retrieval, making it cheaper overall despite higher direct costs. Twio's experience shows that starting with a simple tool like pgvector to learn fast, then moving to a managed solution like Vertex AI Search for production reliability, is a pragmatic evolution.

4 min read · Published Jun 22, 2026

When we first built RAG at Twio, pgvector was the obvious pick. Our business data was already in PostgreSQL, and dropping embeddings into the same database was the fastest path to a working product.

For the first version, that was right. As we scaled, the problem stopped being "how do we store vectors?" and became "how do we reliably understand thousands of broker documents, emails, and attachments in production?" That changed the answer. Today, Vertex AI Search is our main retrieval layer.

Twio is an AI SaaS for loan brokers. A single client case is a mess of fragmented information spread across documents, emails, and attachments, and the AI needs to answer questions that draw on all of it.

If retrieval is weak, the answer is weak. If indexing lags, context is missing. If parsing is wrong, the model sees the wrong evidence. RAG isn't a feature on the side; it's the memory layer of the product.

Twio is a multi-tenant SaaS, so retrieval can't just return "similar content". It has to return similar content scoped to the right user, client, application, or file. pgvector made that trivial: embeddings sat next to the business records, joined cleanly, and filtered with plain SQL.
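A scoped similarity search of the kind described above might look roughly like the following. This is a minimal sketch, not Twio's actual schema: the table names (`document_chunks`, `documents`), the column names, and the helper functions are all hypothetical. The only pgvector-specific pieces are the `'[...]'` text literal cast to `vector` and the `<=>` cosine-distance operator.

```python
from typing import Sequence, Tuple


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format floats as a pgvector text literal such as '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_scoped_query(
    tenant_id: str,
    client_id: str,
    embedding: Sequence[float],
    k: int = 5,
) -> Tuple[str, tuple]:
    """Return (sql, params) for a tenant- and client-scoped similarity search.

    Table and column names are hypothetical. `<=>` is pgvector's
    cosine-distance operator, so a smaller distance means a closer match.
    """
    sql = (
        "SELECT c.id, c.content, c.embedding <=> %s::vector AS distance "
        "FROM document_chunks AS c "
        "JOIN documents AS d ON d.id = c.document_id "
        "WHERE d.tenant_id = %s AND d.client_id = %s "
        "ORDER BY distance "
        "LIMIT %s"
    )
    params = (to_vector_literal(embedding), tenant_id, client_id, k)
    return sql, params
```

The returned pair is meant to be handed to a driver that uses `%s` placeholders, for example `cursor.execute(sql, params)` with psycopg. Keeping the tenant and client filters in the same statement as the distance ordering is what makes the scoping "plain SQL".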

The early wins were real. pgvector let us build the first version quickly and learn from actual usage. That matters more than people give it credit for.

pgvector didn't fail. It did exactly what it's designed for. The issue was that vector storage is only one slice of the RAG pipeline, and pgvector left every other slice to us.

A clean PDF is easy. A scanned bank statement isn't. An email body is easy. An email with five attachments, lender forms, tables, and partial OCR isn't. A demo dataset is easy. A real broker workspace with years of historical emails isn't.

With pgvector, every weakness in that pipeline was ours to fix. When retrieval quality dropped, the suspect list ran all the way from OCR through chunking and embedding to vector distance, SQL filtering, ranking, and DB performance. The extension is simple. The production RAG system around it isn't.
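To make one of those self-owned slices concrete, here is a minimal sketch of a chunking step, assuming fixed-size character windows with overlap. It is an illustrative technique only; the article does not say how Twio chunked its documents, and the function name and default sizes are assumptions.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    Each window starts `max_chars - overlap` characters after the previous
    one, so neighbouring chunks share `overlap` characters of context.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in [0, max_chars)")

    step = max_chars - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunk = text[start:start + max_chars]
        if chunk.strip():  # skip windows that are only whitespace
            chunks.append(chunk)
        if start + max_chars >= len(text):  # this window reached the end
            break
    return chunks
```

Even a step this small carries tuning decisions (window size, overlap, what to do with tables or OCR noise), which is the kind of ownership the paragraph above describes.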

The cost shifted from cloud bill to engineering time — and engineering time was the constrained resource.

Scenario	pgvector	Vertex AI Search
Clean text PDF	We own extraction, chunking, embedding, storage, search	Vertex handles most of the indexing and retrieval workflow
Scanned document	We build or integrate OCR ourselves	Vertex absorbs much of the document-processing logic
Broker asks a document question	We own query design, ranking, filtering	Managed search with stronger out-of-the-box quality
Attachment bursts	Postgres carries more search and indexing load	Search workload lives outside the main database
Debugging	Excellent SQL visibility, but many custom layers to inspect	Less low-level control, but far less custom infra to debug
Cost	Lower direct service cost	Higher service cost, lower engineering and maintenance cost
Production readiness	Significant custom work required	Easier to operate as a managed layer

pgvector was cheaper as a database extension. Vertex is cheaper as a product decision. The cloud bill is one input; engineering time, reliability, and iteration speed are the bigger ones at our stage.
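The trade-off can be written down as a simple total-cost comparison: service cost plus engineering time priced at an hourly rate. The sketch below is only a way of framing the argument. The class, the function names, and every number in the usage lines are placeholders, not Twio's figures.

```python
from dataclasses import dataclass


@dataclass
class RetrievalOption:
    name: str
    monthly_service_cost: float    # cloud bill for the retrieval layer
    monthly_engineer_hours: float  # build, tuning, and maintenance time


def total_monthly_cost(option: RetrievalOption, hourly_rate: float) -> float:
    """Service cost plus engineering time priced at `hourly_rate`."""
    return option.monthly_service_cost + option.monthly_engineer_hours * hourly_rate


def cheaper_option(a: RetrievalOption, b: RetrievalOption,
                   hourly_rate: float) -> RetrievalOption:
    """Return whichever option has the lower total monthly cost."""
    return min((a, b), key=lambda o: total_monthly_cost(o, hourly_rate))


# Placeholder numbers for illustration only.
self_managed = RetrievalOption("self-managed", 100.0, 40.0)
managed = RetrievalOption("managed", 500.0, 5.0)
print(cheaper_option(self_managed, managed, hourly_rate=50.0).name)
```

With these placeholder inputs the managed option wins once engineering hours are priced in, and the self-managed option wins if engineering time is treated as free, which mirrors the distinction the paragraph draws.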

Twio's RAG problem is document-heavy. We aren't searching short snippets; we're dealing with messy broker PDFs, scans, forms, tables, and forwarded attachments. Vertex helps by taking on much of the OCR, chunking, indexing, and retrieval work.

Vertex isn't free, but the alternative isn't either. Building OCR, indexing, ranking, monitoring, and tuning ourselves has its own bill — paid in engineer-weeks.
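For comparison with the self-managed path, a query against a managed search layer reduces to a single request. The sketch below builds the URL and JSON body for the Discovery Engine `search` REST method that backs Vertex AI Search, without sending anything. Treat the details as assumptions: the project, location, and data store IDs are placeholders, `tenant_id` is a hypothetical metadata field, and the serving-config path and filter syntax should be checked against the current API reference before use.

```python
from typing import Optional, Tuple

API_ROOT = "https://discoveryengine.googleapis.com/v1"


def build_search_request(
    project: str,
    location: str,
    data_store: str,
    query: str,
    tenant_id: Optional[str] = None,
    page_size: int = 5,
) -> Tuple[str, dict]:
    """Return (url, json_body) for a Vertex AI Search query.

    Nothing is sent; the caller is expected to POST the body with
    an authenticated HTTP client.
    """
    serving_config = (
        f"projects/{project}/locations/{location}"
        f"/collections/default_collection/dataStores/{data_store}"
        f"/servingConfigs/default_search"
    )
    body = {"query": query, "pageSize": page_size}
    if tenant_id is not None:
        # Hypothetical metadata field used to scope results to one tenant.
        # Real code should validate or escape tenant_id before interpolating.
        body["filter"] = f'tenant_id: ANY("{tenant_id}")'
    return f"{API_ROOT}/{serving_config}:search", body
```

Parsing, chunking, and indexing do not appear in the request at all, which is the point of the managed layer: they happen on the service side when documents are ingested.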

pgvector is still a strong choice, especially when the data already lives in PostgreSQL and the priority is getting a working product in front of users quickly.

For us, it was the right first implementation, and it taught us what retrieval the product actually needed. It may stay in the stack for internal or fallback use cases.

The lesson from Twio's RAG evolution is simple:

Start with the tool that helps you learn fastest. Move to the tool that helps you operate best.

pgvector got us to a working RAG system quickly. As the product matured, the real challenge shifted to document processing, indexing quality, and operational reliability — and at that point, Vertex AI Search became the better fit. It costs more as a service and less as a system to maintain. For a SaaS at Twio's stage, that's the trade that matters.


Read original on dev.to → dev.to/twio_ai/why-twio-chose-vertex-ai-search-o…
