Not Classic RAG: Building a Structured-Retrieval Discovery Agent with LangGraph


I just added a new feature to Kino, my educational movie-discovery project built with LangGraph: a prompt-driven discovery flow that finds grounded titles from a local catalog.

The easy label would be RAG.

More precisely, it is not classic RAG.

What I built is closer to a structured-retrieval agent: the model interprets the request, a narrow service returns structured facts, and deterministic code enforces the final result.

That distinction sounds academic at first.

In practice, it changed almost every implementation choice.

Kino is an educational project, but this feature forced a very real architecture decision.

I wanted a user to be able to type something like "Discover comedy movies from 2010 onward from Kino's catalog." and get back grounded titles from the project's own data.

The final flow is intentionally small: the model interprets the request, the search_titles tool turns that into a structured catalog query, and deterministic code enforces the final result.

That final payload is also deliberately narrow: titles plus notes.

I removed extra narrative fields because the system is not trying to be a movie critic. It is trying to be a grounded discovery interface.

A lot of AI features get called RAG by default.

That label is too broad here.

Classic RAG usually means retrieving unstructured text, such as document chunks, and injecting that material back into the model context so the model can answer from it.

The project is doing something different.

The tool is not a web search. It is not a vector search over article chunks. It is not a document retriever.

It is a domain-specific structured retrieval API.

The model is translating natural language into structured query arguments like:

genres

title_type

is_adult

min_year

max_year

Then the catalog service returns structured records.
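A minimal sketch of what such a tool contract could look like, in plain Python. The argument names come from the list above; the SearchArgs and TitleRecord classes, the search_titles signature, and the in-memory CATALOG with placeholder rows are assumptions for illustration, not Kino's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TitleRecord:
    title: str
    title_type: str
    genres: tuple[str, ...]
    year: int
    is_adult: bool


@dataclass(frozen=True)
class SearchArgs:
    # Every field is optional: the model fills in only what the prompt implies.
    genres: tuple[str, ...] = ()
    title_type: Optional[str] = None
    is_adult: Optional[bool] = None
    min_year: Optional[int] = None
    max_year: Optional[int] = None


# Placeholder rows standing in for the local catalog (hypothetical data).
CATALOG = [
    TitleRecord("Placeholder Comedy A", "movie", ("Comedy",), 2012, False),
    TitleRecord("Placeholder Comedy B", "movie", ("Comedy", "Romance"), 2004, False),
    TitleRecord("Placeholder Drama C", "movie", ("Drama",), 2015, False),
    TitleRecord("Placeholder Series D", "tvSeries", ("Comedy",), 2018, False),
]


def search_titles(args: SearchArgs, catalog=CATALOG) -> list[TitleRecord]:
    """Return catalog records matching every argument that was provided."""
    results = []
    for record in catalog:
        # All requested genres must be present on the record.
        if args.genres and not set(args.genres) <= set(record.genres):
            continue
        if args.title_type is not None and record.title_type != args.title_type:
            continue
        if args.is_adult is not None and record.is_adult != args.is_adult:
            continue
        if args.min_year is not None and record.year < args.min_year:
            continue
        if args.max_year is not None and record.year > args.max_year:
            continue
        results.append(record)
    return results


# "Discover comedy movies from 2010 onward" expressed as structured arguments.
args = SearchArgs(genres=("Comedy",), title_type="movie", is_adult=False, min_year=2010)
found = search_titles(args)
print([r.title for r in found])
```

Nothing in this sketch involves embeddings, chunks, or free text: the model's only job is to fill in SearchArgs, and the rest is ordinary filtering.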

That puts the feature much closer to a structured-retrieval agent than classic document RAG.

If you want the research lineage, the loop is also closer to ReAct than to a standard document-grounded RAG pipeline. The most useful taxonomy I found for explaining this is Anthropic's framing of workflows and agents, along with the corresponding LangGraph workflows and agents documentation.

Under that framing, this feature sits in the middle.

It is not a pure workflow, because the model is doing real interpretation work.

It is not a free-form autonomous agent either, because the tool surface is tiny and the final output is tightly controlled.

The best plain-English description is: a workflow-agent hybrid over a structured retrieval backend

Or, shorter:

a structured-retrieval discovery agent

That phrasing matters because it tells people what the model is actually allowed to do.

Once I stopped pretending this was generic RAG or open-ended recommendation, a few design choices became obvious.

The wording moved from recommend to discover.

That was not branding fluff.

Recommend suggests some kind of ranking intelligence or taste layer.

At the moment, the project does not have ratings, popularity signals, user history, or editorial scoring. So calling it a recommendation engine would be overclaiming.

Discover is much more honest.

It says: the system helps the user find grounded candidates from the catalog.

Keeping interpretation separate from enforcement was one of the most important decisions in the whole feature.

The LLM is good at understanding phrases like:

from 2010 onward

between 1990 and 2000

comedy movies

non-adult

So the model should do that linguistic work.

But the system should still enforce the structured constraints.

That is why I moved toward explicit query arguments like min_year and max_year, and why the backend contract now uses clearer minYear and maxYear semantics instead of leaking database-style operator names into the user-facing architecture.

In other words: the model interprets the language, and the system enforces the constraints.

That split ended up being much cleaner than trying to re-parse English later in the pipeline.
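A sketch of that split, assuming a backend that accepts minYear and maxYear parameters as described above. The function names, the record shape, and the sample rows are hypothetical. The point is that the bounds arrive as numbers from the model and are then checked by ordinary code, with no second pass over the English prompt.

```python
from typing import Optional


def to_backend_params(min_year: Optional[int], max_year: Optional[int]) -> dict:
    """Map the tool's snake_case arguments to the backend's camelCase contract."""
    if min_year is not None and max_year is not None and min_year > max_year:
        raise ValueError(f"min_year {min_year} is greater than max_year {max_year}")
    params = {}
    if min_year is not None:
        params["minYear"] = min_year
    if max_year is not None:
        params["maxYear"] = max_year
    return params


def enforce_year_bounds(records: list[dict], params: dict) -> list[dict]:
    """Drop any record outside the requested bounds, whatever the backend returned."""
    low = params.get("minYear")
    high = params.get("maxYear")
    kept = []
    for record in records:
        year = record["year"]
        if low is not None and year < low:
            continue
        if high is not None and year > high:
            continue
        kept.append(record)
    return kept


# "between 1990 and 2000" as the model might express it in tool arguments.
params = to_backend_params(min_year=1990, max_year=2000)

# Hypothetical backend response that wrongly includes an out-of-range title.
returned = [
    {"title": "Placeholder A", "year": 1994},
    {"title": "Placeholder B", "year": 1987},
    {"title": "Placeholder C", "year": 2000},
]
print(params)
print(enforce_year_bounds(returned, params))
```

Both bounds are treated as inclusive here, which is an assumption; the real contract may define them differently.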

At one point, the response included extra explanation fields like reason and tradeoff.

In theory, that sounds useful.

In practice, for a narrow structured-retrieval flow, those fields were mostly filler.

They pushed me toward brittle heuristics or shallow generated text without adding much product value.

So I simplified the contract.

Now the system returns only the grounded titles and any relevant notes.
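A minimal sketch of that narrower contract, assuming the payload is a plain object with two list fields. DiscoveryResponse and build_response are hypothetical names, and the empty-result note is an illustrative choice; only the titles and notes fields come from the description above.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class DiscoveryResponse:
    titles: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)


def build_response(records: list[dict], notes: list[str]) -> DiscoveryResponse:
    """Build the final payload from tool records only, never from model prose."""
    seen = set()
    titles = []
    for record in records:
        title = record["title"]
        if title not in seen:  # keep the first occurrence, preserve order
            seen.add(title)
            titles.append(title)
    if not titles:
        notes = [*notes, "No catalog titles matched the request."]
    return DiscoveryResponse(titles=titles, notes=list(notes))


# Hypothetical tool output containing a duplicate row.
records = [
    {"title": "Placeholder A", "year": 2012},
    {"title": "Placeholder A", "year": 2012},
    {"title": "Placeholder B", "year": 2016},
]
response = build_response(records, notes=[])
print(asdict(response))
```

With no reason or tradeoff fields, there is nothing in the payload that the model could have invented: every title traces back to a returned record.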

That made the feature easier to trust and easier to inspect.

One nice side effect of building this in LangGraph is that the graph is inspectable in LangSmith Studio.

That matters more than it sounds.

When you can literally see the flow as:

__start__

model

tools

model

after_agent

__end__

it becomes much easier to explain what the system is and is not doing.
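The same loop can be traced in plain Python. This is a stand-in for illustration, not LangGraph code: fake_model, fake_tools, and the state dictionary are hypothetical stubs, and only the node names are taken from the graph above.

```python
def fake_model(state: dict) -> dict:
    """Stub for the model node: request the tool once, then finish."""
    if state["tool_results"] is None:
        return {**state, "pending_call": {"genres": ["Comedy"], "min_year": 2010}}
    return {**state, "pending_call": None}


def fake_tools(state: dict) -> dict:
    """Stub for the tools node: return placeholder records for the pending call."""
    records = [{"title": "Placeholder A", "year": 2012}]
    return {**state, "tool_results": records, "pending_call": None}


def after_agent(state: dict) -> dict:
    """Deterministic final step: build the payload from tool results only."""
    titles = [r["title"] for r in state["tool_results"] or []]
    return {**state, "final": {"titles": titles, "notes": []}}


def run(prompt: str) -> tuple[list[str], dict]:
    state = {"prompt": prompt, "tool_results": None, "pending_call": None}
    trace = ["__start__"]
    while True:
        state = fake_model(state)
        trace.append("model")
        if state["pending_call"] is None:
            break  # the model has nothing more to ask of the tool
        state = fake_tools(state)
        trace.append("tools")
    state = after_agent(state)
    trace.extend(["after_agent", "__end__"])
    return trace, state["final"]


trace, final = run("Discover comedy movies from 2010 onward from Kino's catalog.")
print(" -> ".join(trace))
print(final)
```

The second visit to the model node is where the loop decides it is done, and after_agent is the only place the final payload gets built.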

It also made debugging much more concrete.

I could see when the model produced the right year bounds, when the tool call returned old titles, when a provider was failing upstream, and when a stale local server was being mistaken for the deployed agent.

For an educational project, that visibility is a real advantage. The architecture is not hidden behind marketing language.

This feature is useful, but it is still intentionally narrow.

It is not a generic AI search box, and it is not a recommendation engine.

And some user preferences still cannot be enforced unless the catalog actually has the right metadata.

For example, "general audience" sounds natural in a prompt, but it is only enforceable if the underlying data exposes a reliable signal for that concept.
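One way to handle that honestly is to check each requested constraint against the fields the catalog actually exposes, and to report anything unenforceable in notes rather than silently ignoring it. In the sketch below the field names come from the query arguments listed earlier, while SUPPORTED_FIELDS, split_constraints, and the audience preference are hypothetical.

```python
# Fields the catalog can filter on, taken from the query arguments listed earlier.
SUPPORTED_FIELDS = {"genres", "title_type", "is_adult", "min_year", "max_year"}


def split_constraints(requested: dict) -> tuple[dict, list[str]]:
    """Separate enforceable constraints from ones the catalog cannot check."""
    enforceable = {}
    notes = []
    for name, value in requested.items():
        if name in SUPPORTED_FIELDS:
            enforceable[name] = value
        else:
            notes.append(
                f"Could not enforce '{name}: {value}': the catalog has no field for it."
            )
    return enforceable, notes


# "audience" is a hypothetical preference with no matching catalog field.
requested = {"genres": ["Comedy"], "min_year": 2010, "audience": "general"}
enforceable, notes = split_constraints(requested)
print(enforceable)
print(notes)
```

The user still gets grounded titles for the parts of the request that can be checked, plus a plain statement of what could not be.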

There is also a more boring operational truth: provider reliability still matters.

Even when the architecture is clean, upstream model providers can still return transient failures, timeouts, or internal errors.

That is exactly why narrowing the tool surface and keeping the final response contract deterministic were worth it.

The practical lesson from this feature is simple:

If your domain data is structured, do not force a classic RAG story onto it.

Sometimes the better answer is a small agent that interprets the user's language, calls a narrow retrieval API, and hands the final truth back to deterministic code.

That is what I ended up building in this project.

Not a generic AI search box.

Not a recommendation engine.

A structured-retrieval discovery agent.

And once I named it correctly, the implementation got better.

Source: eido-askayo.blogspot.com (original article)
