I Ditched Vector Search for My Coding Agent's Memory. FTS5 Won.

wpnews.pro


[ARTICLE · art-47642] src=dev.to ↗ pub=2026-07-04T05:31Z topic=developer-tools verified=true sentiment=↑ positive


A developer building a coding agent's memory system chose SQLite's FTS5 full-text search over vector search, arguing that for structured, keyword-dense content like stack traces, logs, and error codes, exact-match retrieval with BM25 ranking outperforms semantic similarity. The implementation uses a simple FTS5 virtual table to index and search tool output, git logs, and API responses, avoiding the overhead of embedding models and vector databases.

4 min read · published Jul 4, 2026

Every "give your agent memory" tutorial I've read reaches for the same stack: chunk your docs, embed them, throw the vectors in a database, do cosine similarity at query time. So when I needed my coding agent to search through indexed tool output, git logs, and fetched docs without dumping raw text into the model's context window, I assumed I'd be standing up a vector store too.

I didn't. I used SQLite's FTS5 full-text search instead, and for this specific job it's not a compromise — it's the better tool.

The tool I built (context-mode, for routing large command output and API responses out of the model's context) needs to answer queries like "HTTP 500 errors" or "failing tests" against arbitrary shell output, JSON responses, and fetched web pages: indexed once, searched however many times a session needs. The naive version just dumps everything into context and lets the model read it. That works until the output is 50KB of test logs and you've burned half your context window for a summary that needed three lines of it.
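To make the "index once, pull back three lines" idea concrete, here is a minimal sketch using SQLite's FTS5. It is not taken from context-mode: the table name chunks, the helper chunk_lines, the 20-line chunk size, and the synthetic log are all assumptions made for illustration.

```python
import sqlite3

def chunk_lines(text: str, lines_per_chunk: int = 20):
    """Hypothetical helper: yield (first_line_number, chunk_text) pairs."""
    lines = text.splitlines()
    for start in range(0, len(lines), lines_per_chunk):
        yield start + 1, "\n".join(lines[start:start + lines_per_chunk])

conn = sqlite3.connect(":memory:")
# first_line is stored but not tokenized (UNINDEXED); content is searchable.
conn.execute(
    "CREATE VIRTUAL TABLE chunks USING fts5(source, first_line UNINDEXED, content)"
)

# Synthetic test log (an assumption): 1000 passing lines, then one failure.
log = "\n".join(f"test_{i:04d} ... ok" for i in range(1, 1001))
log += "\ntest_1001 ... FAIL: AssertionError in checkout"

for first_line, body in chunk_lines(log):
    conn.execute(
        "INSERT INTO chunks (source, first_line, content) VALUES (?, ?, ?)",
        ("pytest.log", first_line, body),
    )

# Only the matching chunk comes back, as a short excerpt of column 2 (content).
hits = conn.execute(
    "SELECT first_line, snippet(chunks, 2, '[', ']', '...', 8) "
    "FROM chunks WHERE chunks MATCH ?",
    ("assertionerror",),
).fetchall()
```

Storing each chunk as its own row means a query returns a handful of tokens around the match plus a line number to go back to, rather than the whole log.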

Vector search is built to answer "what's semantically similar to this." That's the right tool when you're searching prose — support tickets, documentation, chat transcripts — where the same idea gets expressed in different words and you need "how do I reset my password" to match a doc titled "Account Recovery Steps."

Coding-agent queries mostly aren't that. "HTTP 500 errors" isn't a fuzzy semantic concept I want approximated; it's closer to a literal grep with better ranking. The content being searched is also structured and keyword-dense: stack traces, log lines, JSON keys, error codes. Embedding a stack trace and comparing cosine similarity throws away the thing that actually matters (the literal exception name, the literal line number) in favor of a vector representation that's better at "these two paragraphs are about similar topics" than "this line contains the string ECONNREFUSED."

FTS5 is built for exactly this: tokenized, indexed, ranked full-text search over exact and near-exact term matches, with BM25-style relevance scoring out of the box.

No embedding model, no vector database, no network round-trip to compute embeddings. It's stdlib:

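import sqlite3

# One on-disk SQLite file holds the whole index.
conn = sqlite3.connect("index.db")
conn.execute("""
    CREATE VIRTUAL TABLE IF NOT EXISTS docs
    USING fts5(source, content)
""")

def index(source: str, content: str):
    # Store one document; both columns are tokenized and searchable.
    conn.execute("INSERT INTO docs (source, content) VALUES (?, ?)", (source, content))
    conn.commit()

def search(query: str, limit: int = 5):
    # snippet() excerpts column 1 (content), wraps matches in [ ], and caps
    # the excerpt at 20 tokens. ORDER BY rank puts the best match first.
    rows = conn.execute("""
        SELECT source, snippet(docs, 1, '[', ']', '...', 20), rank
        FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?
    """, (query, limit)).fetchall()
    return rows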

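That's the whole engine. snippet() gives you highlighted context around the match for free. rank gives you BM25 ordering for free. Querying "HTTP 500 errors" against a batch of indexed test output returns the actual lines containing 500 and error, ranked by term frequency and rarity: not the semantically-nearest paragraph, the actually-relevant one. One caveat: FTS5's default tokenizer matches tokens exactly, so the query term errors only hits error if the table uses the porter tokenizer or the query uses a prefix such as error*.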
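One practical wrinkle with passing agent-written queries straight to MATCH: FTS5 has its own query syntax, so characters such as . and : in a raw string can raise an error instead of searching. The sketch below shows one common workaround, quoting each term as an FTS5 string. The helper name to_fts5_query and the sample rows are assumptions for illustration, not part of the original tool.

```python
import sqlite3

def to_fts5_query(text: str) -> str:
    """Hypothetical helper: wrap each whitespace-separated term in double
    quotes (doubling any embedded quotes) so FTS5 reads it as literal text."""
    return " ".join('"' + term.replace('"', '""') + '"' for term in text.split())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(source, content)")
conn.executemany(
    "INSERT INTO docs (source, content) VALUES (?, ?)",
    [
        ("test.log", "connect ECONNREFUSED 127.0.0.1:5432 while opening pool"),
        ("build.log", "compiled 42 modules in 3.1s"),
    ],
)

raw = "ECONNREFUSED 127.0.0.1:5432"

# The raw string is parsed as FTS5 query syntax and is rejected.
try:
    conn.execute("SELECT source FROM docs WHERE docs MATCH ?", (raw,)).fetchall()
    raw_failed = False
except sqlite3.OperationalError:
    raw_failed = True

# Quoted, each term becomes a phrase; terms are combined with an implicit AND.
safe = to_fts5_query(raw)
hits = conn.execute(
    "SELECT source FROM docs WHERE docs MATCH ? ORDER BY rank", (safe,)
).fetchall()
```

Quoting trades away operators like OR and prefix matching in exchange for never failing on punctuation, which may or may not be the right call depending on who writes the queries.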

FTS5 is a bad choice if your queries genuinely need semantic matching: "find the doc about resetting my password" needs to match "Account Recovery," and no amount of tokenization gets you there without embeddings. If I were building search over a knowledge base of prose documentation with inconsistent terminology, I'd reach for vectors, possibly hybrid (BM25 for recall, vectors for semantic re-ranking).

But an agent's own tool output, error logs, and fetched API responses are dense with the literal terms you're going to search for, because you (or the agent) wrote the query with those terms in mind. "Failing tests" as a query is going to co-occur with FAIL, AssertionError, test names: words that are actually in the log. The semantic gap that justifies embeddings mostly doesn't exist in this domain.
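How closely "failing" has to match FAILED depends on the tokenizer. The following sketch compares FTS5's default tokenizer with its built-in porter stemmer and with a prefix query. The table names plain and stemmed and the sample log lines are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same rows in two tables; only the tokenizer differs.
conn.execute("CREATE VIRTUAL TABLE plain USING fts5(content)")
conn.execute("CREATE VIRTUAL TABLE stemmed USING fts5(content, tokenize='porter')")

# Sample log lines (assumed, pytest-style).
log_lines = [
    "FAILED tests/test_auth.py::test_login - AssertionError: expected 200",
    "PASSED tests/test_auth.py::test_logout",
]
for table in ("plain", "stemmed"):
    conn.executemany(
        f"INSERT INTO {table} (content) VALUES (?)", [(line,) for line in log_lines]
    )

def hits(table: str, query: str) -> list[str]:
    """Return matching rows from the given table, best match first."""
    rows = conn.execute(
        f"SELECT content FROM {table} WHERE {table} MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
    return [row[0] for row in rows]

plain_hits = hits("plain", "failing tests")      # needs the literal token "failing"
stemmed_hits = hits("stemmed", "failing tests")  # "failing" and "FAILED" share a stem
prefix_hits = hits("plain", "fail* tests")       # prefix query, no stemming required
```

In this sketch the default tokenizer finds nothing for "failing tests", while both the porter table and the prefix query return the FAILED line. Stemming helps with natural-language queries but can blur identifiers, so it is a trade-off rather than a default.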

"Add semantic search" has become a reflex the same way "add a cache" or "add a queue" is — reached for because it's the default answer to "how do I search this," not because the problem demands it. Vector infra costs you an embedding model, a vector database or extension, and a slower indexing step, in exchange for a capability — semantic similarity — that keyword-dense, structured content usually doesn't need.

Before reaching for embeddings on your next "agent needs to search X" problem, ask what the query and the content actually look like. If both are keyword-dense and structurally similar (logs, code, JSON, stack traces), full-text search with BM25 ranking will outperform vectors on relevance and cost you a fraction of the infrastructure. Save the vector database for the day your content is actually prose with vocabulary mismatch — most agent tooling isn't there yet.


Read original on dev.to → dev.to/enjoy_kumawat/i-ditched-vector-search-for…
