{"slug": "agent-framework-rag-for-agents-giving-your-agent-the-right-context", "title": "Agent Framework RAG for Agents: Giving Your Agent the Right Context", "summary": "A developer building on the Microsoft Agent Framework describes how to connect agents to private knowledge using RAG (Retrieval-Augmented Generation). The approach exposes retrieval as a controlled tool, SearchKnowledgeAsync, rather than giving the agent direct access to databases or all documents. The agent fetches relevant context only when needed, keeping the retrieval layer separate from the agent runtime.", "body_md": "This is Part 13 of my series on the Microsoft Agent Framework. You can read the original post over on [lukaswalter.dev].\n\nIn the [previous article](https://www.lukaswalter.dev/posts/agentframework_1_12/), we looked at workflows.\n\nWorkflows make sense when the process itself needs structure: state, checkpoints, events, human approvals, and resumable execution.\n\nThis post is the bridge from Agent Framework into RAG.\n\nI plan on doing a full RAG deep dive sometime later. The practical question for now is smaller:\n\nHow do I connect an Agent Framework agent to private application knowledge without stuffing every document into the prompt?\n\nFor agents, RAG is less about adding more text and more about giving the agent a controlled retrieval path.\n\nThe agent should fetch the right context at the point where it needs it.\n\nYour company documents, product catalog, tickets, rules, policies, runbooks, and internal knowledge base live outside the model.\n\nThe model has generic knowledge. 
Your application has private knowledge.\n\nTreat those as separate systems.\n\nYou can paste some private data into the prompt, and for a demo that may be enough.\n\nBut this falls apart quickly:\n\nThe last point is easy to underestimate.\n\nA larger context window lets you send more text.\n\nIt does not decide which text is correct, current, relevant, or permitted.\n\nDo not give the agent all knowledge.\n\nGive it the right context at the moment it needs it.\n\nRetrieval owns that job.\n\nThe basic RAG loop is small:\n\n``` text\nuser question\n-> retrieve relevant chunks\n-> pass chunks to the agent\n-> agent answers using that context\n```\n\nFor documents, the longer pipeline usually looks like this:\n\n``` text\ndocuments\n-> chunks\n-> embeddings\n-> vector store\n-> search\n-> retrieved context\n-> agent response\n```\n\nDocuments are split into smaller chunks.\n\nThose chunks are embedded into vectors.\n\nThe vectors and source metadata are stored.\n\nWhen a user asks a question, the question is embedded too.\n\nThe search layer finds nearby chunks and returns only those chunks to the agent.\n\nStop there for now.\n\nThere are some hard parts here: chunk boundaries, embedding model choice, hybrid search, reranking, freshness, access control, observability, and evals.\n\nThey are just not the point yet.\n\nFor now, keep the boundary clear:\n\nRAG is the retrieval layer around the agent.\n\nThe agent is not the retrieval layer.\n\nMicrosoft Agent Framework gives you the agent runtime.\n\nIt does not give you a finished ingestion pipeline, chunking strategy, embedding setup, vector store, ranking model, permission model, freshness process, or retrieval eval suite.\n\nAgent Framework helps you decide how the agent receives and uses context:\n\nThe retrieval system still belongs to your application architecture.\n\nIt might use Azure AI Search, [PostgreSQL with pgvector](https://www.lukaswalter.dev/posts/rag-efcore-pgvector/), SQL Server vector search, Cosmos 
DB, Qdrant, Redis, a normal search index, or an internal HTTP API.\n\nThe agent does not need to care.\n\nThe agent needs a focused capability.\n\nNot direct database access.\n\nFor many agent apps, I would start by exposing retrieval as a tool.\n\nThe tool is narrow:\n\n```\nSearchKnowledgeAsync(\n    string query,\n    string? category,\n    int limit)\n```\n\nThe agent can call it when the answer depends on private knowledge.\n\nYour application decides what the tool is allowed to search.\n\nThis matches the tool-design rule from earlier in the series:\n\nTools should expose controlled capabilities, not raw infrastructure.\n\nA small version looks like this:\n\n```\nusing System.ComponentModel;\nusing Microsoft.Agents.AI;\nusing Microsoft.Extensions.AI;\nusing Microsoft.Extensions.DependencyInjection;\n\npublic sealed record KnowledgeSearchResult(\n    string Title,\n    string Source,\n    string Snippet,\n    double Score);\n\npublic interface IKnowledgeSearch\n{\n    Task<IReadOnlyList<KnowledgeSearchResult>> SearchAsync(\n        string query,\n        string? category,\n        int limit,\n        CancellationToken cancellationToken);\n}\n\n[Description(\"Searches approved internal knowledge articles, policies, and runbooks.\")]\npublic static Task<IReadOnlyList<KnowledgeSearchResult>> SearchKnowledgeAsync(\n    [Description(\"Focused search query. Rewrite the user's message into search terms.\")]\n    string query,\n    [Description(\"Optional source category such as policy, runbook, product, support, or architecture.\")]\n    string? category,\n    [Description(\"Maximum number of results to return. 
Use 3 to 5 for normal questions.\")]\n    int limit,\n    IServiceProvider services,\n    CancellationToken cancellationToken)\n{\n    var search = services.GetRequiredService<IKnowledgeSearch>();\n\n    return search.SearchAsync(\n        query,\n        category,\n        Math.Clamp(limit, 1, 5),\n        cancellationToken);\n}\n```\n\nThe model supplies `query`, `category`, and `limit`.\n\nThe application supplies `IKnowledgeSearch`.\n\nKeep that split.\n\nThe model can ask for a search.\n\nIt does not get a connection string, a database client, or permission to browse every source.\n\nThen attach the tool to the agent:\n\n```\nAIAgent supportAgent = chatClient.AsAIAgent(\n    instructions: \"\"\"\n    You answer questions about the internal engineering platform.\n\n    Use SearchKnowledgeAsync when the answer depends on private company\n    documentation, runbooks, policies, known issues, or product rules.\n\n    If the search results do not contain enough evidence, say that the indexed\n    sources do not answer the question. 
Do not invent policy details, limits,\n    prices, permissions, or operational steps.\n    \"\"\",\n    tools: [AIFunctionFactory.Create(SearchKnowledgeAsync)],\n    services: app.Services);\n```\n\nThe agent-side RAG flow is:\n\nAt that point, retrieval is just another tool.\n\nThe pattern fits Agent Framework because tools already give you that controlled application boundary.\n\nUsers ask messy questions.\n\nFor example:\n\n```\nWhat were the most important changes in our cancellation policy last year?\n```\n\nA better retrieval query might be:\n\n```\ncancellation policy changes last year\n```\n\nOr, if you expose metadata filters:\n\n```\nawait SearchKnowledgeAsync(\n    query: \"cancellation policy changes last year\",\n    category: \"policy\",\n    limit: 5,\n    services,\n    cancellationToken);\n```\n\nThe agent can help here.\n\nIt can translate a conversational request into a smaller retrieval query.\n\nBut do not overcomplicate this too early.\n\nStart by logging the generated tool query and checking whether it actually finds better results than the raw user message.\n\nBad query rewriting is worse than no query rewriting.\n\nIt can remove the term that mattered.\n\nVector similarity finds related text.\n\nIt does not know whether that text belongs to the right tenant, product, language, version, source system, or user permission scope.\n\nYou often need filters.\n\nCommon filters include:\n\nSome filters can be model supplied.\n\n`category` is a reasonable example because the model can often infer whether a question is about a policy, runbook, product, or support article.\n\nSome filters should not be model supplied.\n\nTenant, user ID, role, entitlement, and document permissions should come from your authenticated application context.\n\nThe model should not be allowed to say:\n\n```\nSearch tenant = admin\n```\n\nand suddenly see admin-only documents.\n\nA better application boundary looks like this:\n\n```\npublic interface 
IKnowledgeSearch\n{\n    Task<IReadOnlyList<KnowledgeSearchResult>> SearchAsync(\n        string query,\n        string? category,\n        int limit,\n        UserKnowledgeScope scope,\n        CancellationToken cancellationToken);\n}\n```\n\nThe tool can accept the search query and category.\n\nYour application adds `UserKnowledgeScope` from the current user.\n\nSimilarity search finds related text.\n\nMetadata filters keep the search inside the right boundary.\n\nExposing retrieval as a tool is not the only option.\n\nFor a pure documentation assistant, you may not want the model to decide whether to search.\n\nYou may want retrieval on every request.\n\nPlain application code is enough:\n\n```\nIReadOnlyList<KnowledgeSearchResult> results =\n    await knowledgeSearch.SearchAsync(\n        query: userQuestion,\n        category: null,\n        limit: 5,\n        cancellationToken);\n\nstring context = string.Join(\n    \"\\n\\n\",\n    results.Select(result => $\"\"\"\n    Source: {result.Title}\n    {result.Snippet}\n    \"\"\"));\n\nAgentResponse response = await supportAgent.RunAsync($\"\"\"\n    Answer the user's question using the retrieved context.\n    If the context is not enough, say so.\n\n    Retrieved context:\n    {context}\n\n    User question:\n    {userQuestion}\n    \"\"\",\n    cancellationToken: cancellationToken);\n```\n\nYou can also use Agent Framework context providers, such as `TextSearchProvider`, when that fits your setup.\n\nThe tradeoff is the same either way:\n\nIf almost every request needs private knowledge, retrieve before the agent call.\n\nIf retrieval is one capability among several, expose it as a tool.\n\nRAG is for finding relevant context.\n\nCode is for exact operations.\n\nIf the user asks:\n\n```\nWhat are the top 5 products by revenue?\n```\n\nthat should probably be SQL or an analytics API, not vector search.\n\nThe same applies to:\n\nVector search is good at finding related text.\n\nIt is not a calculator, 
database constraint, authorization system, or reporting engine.\n\nIf the answer must be exact, use normal code behind a tool.\n\nFor example:\n\n```\n[Description(\"Returns the top products by revenue for an authorized reporting period.\")]\npublic static Task<IReadOnlyList<ProductRevenue>> GetTopProductsByRevenueAsync(\n    DateOnly from,\n    DateOnly to,\n    int limit,\n    IServiceProvider services,\n    CancellationToken cancellationToken)\n{\n    var reporting = services.GetRequiredService<IRevenueReporting>();\n\n    return reporting.GetTopProductsByRevenueAsync(\n        from,\n        to,\n        Math.Clamp(limit, 1, 20),\n        cancellationToken);\n}\n```\n\nThis still gives the agent a tool.\n\nIt is just not RAG.\n\nUse retrieval with an Agent Framework agent when:\n\nStart with a narrow search tool.\n\nLog the query the agent sends.\n\nLog the sources returned.\n\nCheck whether the answer actually used those sources.\n\nThat gives you enough signal to see where the retrieval design is weak.\n\nDo not use RAG when the task needs deterministic data access or computation.\n\nUse normal code for current state, totals, rankings, exact IDs, prices, permissions, and business rules.\n\nDo not use RAG as a way to bypass application boundaries.\n\nIf a user cannot access a document in the product, the retrieval tool should not return it to the agent.\n\nAlso avoid building the full ingestion and retrieval platform before you have a real use case.\n\nStart with one domain, a small corpus, and a handful of questions you can verify.\n\nAgent Framework gives you a clean place to put retrieval into the agent loop.\n\nIt does not make RAG automatic.\n\nThe design I would carry forward is simple:\n\nAs I said before, I will do a deep dive into RAG later on. 
So in the next Agent Framework post we will move to multimodal agents: images, PDFs, and provider differences.\n\nThe agent boundary gets messy there in a different way.\n\nSome providers can work with images or document inputs natively, some need different message formats, and some scenarios are still better handled by manual preprocessing before the agent sees anything.", "url": "https://wpnews.pro/news/agent-framework-rag-for-agents-giving-your-agent-the-right-context", "canonical_source": "https://dev.to/lukaswalter/agent-framework-rag-for-agents-giving-your-agent-the-right-context-1n15", "published_at": "2026-06-18 15:30:00+00:00", "updated_at": "2026-06-18 15:51:20.038386+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "developer-tools", "natural-language-processing", "machine-learning"], "entities": ["Microsoft Agent Framework", "Azure AI Search", "PostgreSQL", "pgvector", "SQL Server", "Cosmos DB", "Qdrant", "Redis"], "alternates": {"html": "https://wpnews.pro/news/agent-framework-rag-for-agents-giving-your-agent-the-right-context", "markdown": "https://wpnews.pro/news/agent-framework-rag-for-agents-giving-your-agent-the-right-context.md", "text": "https://wpnews.pro/news/agent-framework-rag-for-agents-giving-your-agent-the-right-context.txt", "jsonld": "https://wpnews.pro/news/agent-framework-rag-for-agents-giving-your-agent-the-right-context.jsonld"}}