Treat the Context Window as a Data Assembly Problem

A developer argues that assembling context for large language models is fundamentally a data assembly problem, not a prompt engineering one. The author introduces pydantic-resolve as a tool to structure context assembly declaratively, similar to how API response assembly is handled in FastAPI, to avoid procedural code that mixes database queries, vector retrieval, and LLM calls.

# Treat the Context Window as a Data Assembly Problem: Where pydantic-resolve Fits in AI Workflows

## A typical piece of AI code

Open any service in your project that calls an LLM. You will most likely see a function that looks something like this:

```python
# `db`, `embed`, `vector_store`, `llm`, and `format_similar` stand for the project's own helpers.
async def build_support_context(ticket_id: int) -> str:
    ticket = await db.get(Ticket, ticket_id)
    customer = await db.get(Customer, ticket.customer_id)
    recent_tickets = await (
        db.query(Ticket)
        .filter(Ticket.customer_id == customer.id)
        .order_by(Ticket.created_at.desc())
        .limit(5)
        .all()
    )

    # Retrieve similar past tickets
    embedding = await embed(ticket.description)
    similar = await vector_store.search(embedding, top_k=3)

    # Each similar ticket needs its resolution pulled in
    similar_with_resolution = []
    for s in similar:
        resolution = await (
            db.query(Resolution)
            .filter(Resolution.ticket_id == s.id)
            .first()
        )
        similar_with_resolution.append({
            "title": s.title,
            "resolution": resolution.text if resolution else "",
        })

    # Collect tags
    all_tags = []
    for t in recent_tickets:
        all_tags.extend(t.tags)

    # Finally, ask the LLM to summarize
    summary = await llm.summarize(
        customer=customer,
        recent_tickets=recent_tickets,
        similar=similar_with_resolution,
    )

    return f"""
Customer: {customer.name} (id={customer.id})
Recent tickets: {len(recent_tickets)}
Tags: {', '.join(set(all_tags))}
Similar past cases: {format_similar(similar_with_resolution)}
Summary: {summary}
"""
```

The function is not long, but the problem is already visible: this is a `build_context` function that is fundamentally doing **data assembly**, but its shape is entirely **procedural**. It is isomorphic to the FastAPI code that [Clean Architecture for Python](../architecture_entity_first/) criticizes; only "assembling an API response" has been swapped for "assembling a prompt context". The problems are unchanged:

- Data-fetching logic is scattered through the function body with no structure.
- Dependencies of derived fields (`all_tags`, `summary`) are held together by comments and line ordering.
- Vector retrieval, database queries, and LLM calls live in one function. Every new piece of context means editing this function.
- Concurrency optimization (fetching similar tickets in parallel) requires a rewrite.
- Reuse (say, exposing `recent_tickets` to the frontend too) is impossible.

This code is not badly written. It has **no home**.

## "The context window" is a data assembly problem

When people discuss LLM applications, attention usually lands first on prompt templates, model choice, and temperature. Those matter, but as applications grow, the real bottleneck shifts from prompt engineering to **context assembly**.

The reason: prompt templates are stable, model choice is stable, but "what data to feed the LLM" **differs on every call**. A support agent handling ticket A and ticket B can share the same prompt template, yet the underlying data-assembly path may diverge completely: A is a VIP customer requiring SLA context and similar-case retrieval; B is a regular customer needing only the basics.

This "same template, different data-assembly path" requirement is **exactly what API response assembly does**. Your FastAPI project already solves it: different endpoints assemble different response trees. An LLM context is just another endpoint, only the consumer is an LLM rather than an HTTP client.

Once that perspective lands, the problem becomes concrete.
The things pydantic-resolve solves well on the API side hold equally well on the LLM side:

| API response assembly | LLM context assembly |
|---|---|
| Multi-level nesting (Sprint → Task → Owner) | Multi-level nesting (Customer → Ticket → Similar Ticket) |
| Batch-load related data | Batch-recall related context |
| Derived fields (`task_count`, `contributors`) | Derived context (`summary`, aggregated tags) |
| N+1 database queries | N+1 vector retrievals + N+1 LLM calls |
| Cross-subtree aggregation (deduplicate all owners) | Cross-subtree aggregation (merge evidence across similar tickets) |

Every item in the right column already has a solution on the left. We only need to bring the same machinery over.

## Three classic assembly pain points

Breaking the `build_support_context` snippet apart reveals three symptom classes. They are not specific to support scenarios; they recur in nearly every LLM application.

### Pain point 1: N+1 LLM calls

```python
for s in similar:
    resolution = await db.query(Resolution).filter(...).first()
```

This is a classic N+1 on the ORM side. In the LLM world it gets worse, because you might be calling the LLM in the loop:

```python
for s in similar:
    s.summary = await llm.summarize(s.description)  # 5 similar tickets = 5 serial LLM calls
```

LLM calls are typically an order of magnitude more expensive than database queries, so a serial N+1 directly amplifies cost and latency. And code without a batching abstraction **always ends up like this**, because nobody manually maintains a batch queue inside procedural code.

Real-world evidence: open-webui, `backend/open_webui/utils/middleware.py:2635` (commit `02dc3e6`, 2026-06)

```python
for sid in all_skill_ids:
    if sid in accessible_skill_ids:
        s = await SkillsModel.get_skill_by_id(sid)  # serial N+1
```

The same file has at least three more instances (folder lookup, tool connection, access check), all `await`-inside-a-`for`. open-webui is a production-grade AI application, and it still falls into this trap, which suggests the trap is structural rather than a coding-quality issue.
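
To make the latency half of that claim concrete, here is a minimal sketch using only `asyncio`. `fake_summarize` is a hypothetical stand-in for an LLM call (the sleep simulates network latency), and the helper names are illustrative, not part of any library. It compares the serial loop with issuing the same calls together.

```python
import asyncio
import time


async def fake_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call; the sleep simulates latency."""
    await asyncio.sleep(0.05)
    return f"summary of {text}"


async def summarize_serial(texts: list[str]) -> list[str]:
    # The N+1 shape: each call waits for the previous one to finish.
    return [await fake_summarize(t) for t in texts]


async def summarize_concurrent(texts: list[str]) -> list[str]:
    # Same N calls, issued together; gather preserves input order.
    return list(await asyncio.gather(*(fake_summarize(t) for t in texts)))


async def timed(coro):
    # Await a coroutine and return (result, elapsed seconds).
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start


async def main() -> None:
    texts = [f"ticket {i}" for i in range(5)]
    serial, t_serial = await timed(summarize_serial(texts))
    concurrent, t_concurrent = await timed(summarize_concurrent(texts))
    assert serial == concurrent
    print(f"serial: {t_serial:.2f}s, concurrent: {t_concurrent:.2f}s")


if __name__ == "__main__":
    asyncio.run(main())
```

Note that `asyncio.gather` only overlaps the waiting; it still makes N separate calls, so it helps latency but not per-call cost. Reducing the number of calls requires a real batch endpoint or a loader-style abstraction.
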

### Pain point 2: Cross-subtree aggregation has no home

```python
all_tags = []
for t in recent_tickets:
    all_tags.extend(t.tags)
```

This "walk the subtree and collect things" logic, in procedural code, can only be written as global variables plus a `for` loop. As soon as aggregation needs grow (all similar-ticket resolutions, all products mentioned, all features touched) you get a pile of `all_xxx = []` lists scattered across the function, held together by convention.

What makes this worse is that these aggregations are **inherently "parent depends on children"**. In procedural code, they are separated from child-fetch logic. Fetching is above the `for` loop; aggregation is below. The parent→child dependency has been reduced to "line number ordering".

Real-world evidence: open-webui, `backend/open_webui/utils/middleware.py`, chat-completion orchestration (commit `02dc3e6`, 2026-06)

```python
sources = []
sources.extend(flags.get('sources', []))  # line 2882
sources.extend(flags.get('sources', []))  # line 2892
sources = [s for s in sources if ...]     # line 2909: mid-function reassignment
events.append({'sources': sources})       # line 2916: another accumulator
```

`sources` and `events` have no structured parent-child dependency declaration; they are stitched across handlers with `extend`. This is exactly the "aggregation has no home" pattern described above: not a one-off defect, but the inevitable shape of procedural code that has to coordinate context across multiple sources.

### Pain point 3: Prompt shape is welded to data fetching

```python
return f"""
Customer: {customer.name} (id={customer.id})
...
Summary: {summary}
"""
```

This final f-string welds three things together: **data fetching, derived computation, prompt format**. Touching the prompt template means touching the data code; touching data fetching means touching the prompt text; adding a field means editing from top to bottom.

This is the limit of procedural code: it has no structure, so **every change is invasive**.
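
For contrast, here is a rough sketch of what the unwelded version of that return statement could look like, using only the standard library. `SupportSnapshot`, `PROMPT`, and `render_prompt` are hypothetical names chosen for illustration; the point is only that the data shape and the prompt text can change independently of each other.

```python
from dataclasses import asdict, dataclass
from string import Template


@dataclass
class SupportSnapshot:
    """Hypothetical data shape: knows nothing about prompt wording."""
    customer_name: str
    customer_id: int
    summary: str


# The template knows nothing about how the data was fetched.
PROMPT = Template("Customer: $customer_name (id=$customer_id)\nSummary: $summary")


def render_prompt(data: dict) -> str:
    # Extra keys in `data` are ignored, so adding a field does not break the template.
    return PROMPT.substitute(data)


if __name__ == "__main__":
    snapshot = SupportSnapshot("Acme Corp", 7, "Safari-specific login issue")
    print(render_prompt(asdict(snapshot)))
```

Editing `PROMPT` touches no fetching code, and adding a field to `SupportSnapshot` leaves the template untouched until the template chooses to use it.
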

Real-world evidence: open-webui, `backend/open_webui/utils/middleware.py:931`, `get_source_context` (commit `02dc3e6`, 2026-06)

```python
# Abridged excerpt; `...` marks elided code.
def get_source_context(sources, ...) -> str:
    context_string = ''
    for source in sources:
        for doc, meta in zip(source.get('document', []), source.get('metadata', [])):
            context_string += f'<source id="{...}" name="{...}">{body}</source>\n'
    return context_string
```

Iteration, XML template string, and f-string formatting are all welded into one function, structurally identical to the hypothetical `build_support_context` at the top of this article. Not a coincidence; this is the typical shape of procedural LLM code.

## Redefinition: LLM context = response tree

With the three pain points diagnosed, the fix is clear: **assemble the LLM context as a response tree**.

On the API side, you already speak this language:

```python
class SprintView(BaseModel):
    id: int
    name: str
    tasks: list[TaskView] = []
    task_count: int = 0  # post

    def resolve_tasks(self, loader=Loader(task_loader)):
        return loader.load(self.id)

    def post_task_count(self):
        return len(self.tasks)
```

Bringing this language to the LLM case requires only a change of perspective: **the tree root is no longer Sprint but some conversation context; the leaves are no longer Task but some field an LLM will read**. When you `model_dump` the result, you either feed it to a prompt template or JSON-serialize it as a tool-call argument.

```mermaid
flowchart LR
    subgraph Tree["Context response tree"]
        Ctx["SupportContext<br/>conversation context"]
        Cust["CustomerView"]
        Tickets["list[TicketView]"]
        Similar["list[SimilarTicketView]"]
        Summary["summary (post)"]
        Ctx --> Cust
        Ctx --> Tickets
        Tickets --> Similar
        Ctx --> Summary
    end
    Tree -->|model_dump + prompt template| LLM["LLM"]
```

The tree shape is defined by your Pydantic model; data is fetched by `resolve_*` methods; derived fields are computed by `post_*` methods; cross-subtree aggregation is handled by `Collector`. Same machinery as an API response, same Resolver, same batch loaders.
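
The "batch loader" idea behind `loader.load(...)` can be illustrated without any library. The sketch below is a toy, not pydantic-resolve's actual implementation: `TinyLoader` and `fetch_titles` are hypothetical names. Every `load()` issued in the same event-loop turn is collected, then served by one call to the batch function.

```python
import asyncio


class TinyLoader:
    """Toy batch loader (illustrative only): coalesces load() calls into one batch."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn   # async fn: list of keys -> list of values, same order
        self._pending = {}          # key -> Future awaiting its value
        self._flush_task = None
        self.batch_calls = 0

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._flush_task is None:
            # The task body starts only after the current synchronous code yields,
            # so every load() issued before then joins the same batch.
            self._flush_task = loop.create_task(self._flush())
        return self._pending[key]

    async def _flush(self):
        pending, self._pending = self._pending, {}
        self._flush_task = None
        self.batch_calls += 1
        keys = list(pending)
        try:
            values = await self._batch_fn(keys)
        except Exception as exc:
            for future in pending.values():
                future.set_exception(exc)
            return
        for key, value in zip(keys, values):
            pending[key].set_result(value)


async def fetch_titles(ids):
    # Hypothetical batch query: one round trip for all ids.
    return [f"ticket-{i}" for i in ids]


async def demo():
    loader = TinyLoader(fetch_titles)
    titles = await asyncio.gather(*(loader.load(i) for i in [1, 2, 3]))
    return list(titles), loader.batch_calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Three `load()` calls result in a single call to `fetch_titles`. The per-field `resolve_*` methods can therefore stay simple (one key in, one value out) while the batching happens underneath.
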

## Mechanism mapping

Putting that perspective into code gives three one-to-one mappings:

| AI assembly need | pydantic-resolve primitive | Role in the LLM scenario |
|---|---|---|
| Pull external knowledge (DB, vector store, external APIs) | `resolve_*` + `Loader` | Recall related docs, similar tickets, user profile |
| Call the LLM for derivation after subtree is ready | `post_*` (supports async) | Summary, classification, risk assessment; `post_*` execution timing guarantees a complete subtree |
| Aggregate evidence / tags / fragments across subtrees | `Collector` + `SendTo` | Pool signals scattered across leaves back to the root, feed them to the LLM as grounding |

These three primitives cover the three pain points exactly. The next section walks through a concrete example.

## Walkthrough: a customer support agent context

Rewriting the opening `build_support_context` with pydantic-resolve. First, the model definitions:

```python
from typing import Annotated, Optional

from pydantic import BaseModel
from pydantic_resolve import (
    Collector,
    Loader,
    Resolver,
    SendTo,
    build_list,
    build_object,
)

# `db`, `embed_batch`, `vector_store`, `llm`, `CustomerView`, `resolution_loader`, and
# `ticket_by_id_loader` are assumed to be defined elsewhere in the project.

# ---------- Data access layer (loaders) ----------

async def customer_loader(customer_ids: list[int]) -> list[CustomerView]:
    rows = await db.query(Customer).filter(Customer.id.in_(customer_ids)).all()
    return build_object(rows, customer_ids, lambda c: c.id)


async def ticket_loader(customer_ids: list[int]) -> list[list]:
    # Used to pull a customer's most recent tickets by customer id.
    # Note: the limit caps the whole batch, not each customer individually.
    rows = await (
        db.query(Ticket)
        .filter(Ticket.customer_id.in_(customer_ids))
        .order_by(Ticket.created_at.desc())
        .limit(5 * len(customer_ids))
        .all()
    )
    return build_list(rows, customer_ids, lambda t: t.customer_id)


async def similar_ticket_loader(ticket_ids: list[int]) -> list[list[dict]]:
    # One batched vector recall: search all query embeddings at once
    queries = await db.query(Ticket).filter(Ticket.id.in_(ticket_ids)).all()
    embeddings = await embed_batch([t.description for t in queries])
    results = await vector_store.batch_search(embeddings, top_k=3)
    by_id = {t.id: [r.dict() for r in results[i]] for i, t in enumerate(queries)}
    # Return one entry per requested key, in key order
    return [by_id.get(ticket_id, []) for ticket_id in ticket_ids]


# ---------- Context model tree ----------

class SimilarTicketView(BaseModel):
    id: int
    title: str
    resolution: str = ""

    def resolve_resolution(self, loader=Loader(resolution_loader)):
        return loader.load(self.id)


class TicketView(BaseModel):
    id: int
    title: str
    description: str
    customer_id: int
    # The same TicketView feeds both recent_tickets and the tag collector:
    # SendTo ships each ticket's tags up to "tag_pool".
    tags: Annotated[list[str], SendTo("tag_pool")] = []
    similar: list[SimilarTicketView] = []
    resolution_summary: str = ""  # post, LLM-derived

    def resolve_similar(self, loader=Loader(similar_ticket_loader)):
        return loader.load(self.id)

    async def post_resolution_summary(self):
        # LLM called after subtree is ready: every similar.resolution has been resolved
        if not self.similar:
            return ""
        return await llm.summarize_resolutions(
            ticket_title=self.title,
            resolutions=[s.resolution for s in self.similar],
        )


class SupportContext(BaseModel):
    """Root context: maps directly to the information one LLM call needs."""
    ticket_id: int
    ticket: Optional[TicketView] = None
    customer: Optional[CustomerView] = None
    recent_tickets: list[TicketView] = []
    # Collector aggregates tags from all child tickets
    all_tags: list[str] = []
    # Root-level LLM summary: runs only after the whole subtree is ready
    grounded_summary: str = ""

    def resolve_ticket(self, loader=Loader(ticket_by_id_loader)):
        return loader.load(self.ticket_id)

    def resolve_customer(self, loader=Loader(customer_loader)):
        # Falls back to None when the ticket has not been loaded
        return loader.load(self.ticket.customer_id) if self.ticket else None

    def resolve_recent_tickets(self, loader=Loader(ticket_loader)):
        # Falls back to an empty list when the customer has not been loaded
        return loader.load(self.customer.id) if self.customer else []

    def post_all_tags(self, collector=Collector("tag_pool")):
        # Collector gathers tags upward from all child TicketViews
        return sorted(set(collector.values()))

    async def post_grounded_summary(self):
        # LLM call after the entire tree is ready
        return await llm.summarize_context(
            customer=self.customer,
            ticket=self.ticket,
            recent=self.recent_tickets,
            all_tags=self.all_tags,
        )
```

Invocation:

```python
ctx = SupportContext(ticket_id=42)
ctx = await Resolver().resolve(ctx)
prompt = render_prompt(ctx.model_dump())  # feed straight into a template
response = await llm.chat(prompt)
```

### Execution flow

```mermaid
flowchart TB
    A["Resolver().resolve(SupportContext(ticket_id=42))"] --> B["resolve_ticket<br/>fetch main ticket"]
    B --> C["resolve_customer<br/>fetch customer"]
    C --> D["resolve_recent_tickets<br/>batch fetch customer's 5 most recent tickets"]
    D --> E["each TicketView.resolve_similar<br/>batch vector recall"]
    E --> F["each SimilarTicketView.resolve_resolution<br/>batch fetch resolutions"]
    F --> G["each TicketView.post_resolution_summary<br/>batch LLM summary"]
    G --> H["SupportContext.post_all_tags<br/>Collector aggregates all tags"]
    H --> I["SupportContext.post_grounded_summary<br/>root-level LLM summary"]
    I --> J["ctx.model_dump()"]
```

Each pain point is addressed in turn:

- **Pain point 1 (N+1 LLM calls):** All `TicketView.post_resolution_summary` calls sit at the same depth, and pydantic-resolve dispatches them in a batch, so there is no need to manually gather inside a loop. If you want to push batching further, wrap the LLM call itself in a Loader (multiple same-template requests collapse into one batch API call).
- **Pain point 2 (cross-subtree aggregation):** `all_tags` flows through `Collector("tag_pool")`; `TicketView.tags` declares `SendTo("tag_pool")` to ship values upward. Aggregation has a fixed home, with no more `for` loops and global variables.
- **Pain point 3 (shape welded to fetching):** The prompt template and the model definition are separated: `render_prompt(ctx.model_dump())`. Editing the prompt text touches no model code; adding a field doesn't move the template; every fetch lives independently inside its `resolve_*` method.

### Output

```python
print(ctx.model_dump_json(indent=2))
```

```json
{
  "ticket_id": 42,
  "ticket": {
    "id": 42,
    "title": "Login button unresponsive on Safari",
    "description": "...",
    "tags": ["auth", "safari"],
    "similar": [
      { "id": 101, "title": "Safari click event issue", "resolution": "..." },
      { "id": 187, "title": "WebKit pointer-events bug", "resolution": "..." }
    ],
    "resolution_summary": "Likely a WebKit pointer-events issue; see ticket 187."
  },
  "customer": { "id": 7, "name": "Acme Corp", "tier": "enterprise" },
  "recent_tickets": [ /* ... */ ],
  "all_tags": ["auth", "billing", "safari", "webkit"],
  "grounded_summary": "Enterprise customer Acme Corp reported a Safari-specific login issue..."
}
```

This tree can be serialized and fed straight into an LLM, or sliced apart: return the `recent_tickets` field to a frontend dashboard with zero extra code.

## Comparison with other approaches

| Approach | Where assembly lives | N+1 protection | Cross-subtree aggregation | Reuse with API responses |
|---|---|---|---|---|
| Hand-written `build_context` | Inlined in function body | None | Globals / for loops | None |
| LangChain retrieval chain | Chained nodes | Implementation-dependent | Glued via chain composition | Fully separated from API |
| Naked RAG (embed → search → stuff) | A few inlined lines | Usually single-shot | None | Fully separated from API |
| pydantic-resolve context tree | Model field declarations | Built-in batching | `Collector` / `SendTo` | Same source as API responses |

Worth noting: this is not a replacement for LangChain. LangChain orchestrates the sequence of LLM calls; pydantic-resolve assembles the structured context each step consumes. In a complex agent pipeline the two stack cleanly: pydantic-resolve prepares structured context for every step; LangChain or any agent framework schedules the execution.

## One Entity graph, four consumer types

Push this further and a deeper payoff appears.
Once a project is in ERD mode, **REST, GraphQL, MCP, and LLM Context all derive from the same Entity graph**:

```mermaid
flowchart TB
    ERD["Entity + ER Diagram<br/>the single source of relationships"]
    ERD --> REST["REST Response<br/>traditional API consumer"]
    ERD --> GQL["GraphQL<br/>flexible-query consumer"]
    ERD --> MCP["MCP Service<br/>AI agent tool consumer"]
    ERD --> CTX["LLM Context Tree<br/>AI agent context consumer"]
    REST --> Resolver["same Resolver engine"]
    GQL --> Resolver
    CTX --> Resolver
    MCP --> GQL
```

Concretely:

- The `TicketView` you wrote for the support dashboard is the `TicketView` the LLM sees, and it is also the GraphQL node MCP exposes.
- The "Task has one owner" relationship is defined once and reused by all four consumers automatically.
- Change the relationship, and all four places update together. Add a consumer, and the relationship definition stays untouched.

This is where pydantic-resolve truly fits in AI workflows: not another LLM framework, but **a stable home for AI context assembly**. As AI agents become a standard consumer in your system, the dividend of "same source" compounds.

## When to use it, when not to

Use it when:

- LLM context needs 2+ levels of nesting (root + related data + related-of-related).
- The same domain model serves both an API and an LLM.
- You have a loop calling the LLM per item, and N+1 is burning money.
- Cross-subtree aggregation is needed to ground the LLM (evidence, tags, fragments).
- A multi-step agent pipeline needs its own context assembled at every step.

Skip it when:

- The context is a static text plus a few variables: f-string it.
- It's a one-shot script or prototype: procedural code is faster.
- There's a single LLM call with no related-data fetch: `resolve` is unnecessary abstraction.
- LangChain is already in place and the chain is stable: adding another layer adds cognitive load without benefit.

The heuristic is simple: **when you start writing the second `build_xxx_context` function and notice it overlaps with the first, it's time to migrate**.
This is the same adoption signal pydantic-resolve uses on the API side; only this time, the consumer is an LLM instead of a browser.

## Conclusion

The complexity of LLM applications ultimately lands on **context assembly**, not on prompt templates. Today's AI projects are full of hand-written `build_context` functions carrying the same scattered logic that Service/Route layers used to carry in FastAPI projects, and pydantic-resolve already solved that once on the API side.

Treating LLM context as a response tree, three primitives cover three pain points:

- `resolve` pulls external knowledge with built-in batching, killing N+1.
- `post` is the LLM hook, batch-dispatched after the subtree is ready, with prompt shape decoupled from data fetching.
- `Collector` / `SendTo` give cross-subtree aggregation a fixed home, replacing global variables.

The broader payoff: your Entity graph now has four standard consumers (REST, GraphQL, MCP, LLM Context) with the relationship defined once. **AI is not a special case that needs its own graph. It is just another reader of the same tree.**

Invest in your domain model, not in your prompt template. The longer the context window and the more complex the agent pipeline, the larger this dividend grows.