{"slug": "agentic-search-for-context-engineering", "title": "Agentic Search for Context Engineering", "summary": "Leonie Monigatti presented a workshop titled \"Agentic Search for Context Engineering\" at AI Engineer Europe 2026 in London on April 8, 2026, arguing that context engineering is 80% agentic search. The workshop traced the evolution from fixed retrieval pipelines in Retrieval-Augmented Generation (RAG) to agent-driven search tools, and now to context engineering, which manages multiple context sources like local files, databases, and web search. Monigatti made the case that the process of curating what goes into an agent's context window is critical for generating meaningful responses.", "body_md": "This post is an edited long-form version of the workshop titled “Agentic Search for Context Engineering” I gave at [AI Engineer Europe 2026](https://www.ai.engineer/europe/2026) on April 8, 2026 in London. The slides, code, and diagrams are available in the [workshop repository](https://github.com/iamleonie/workshop-agentic-search). If you prefer video, the full recording is available on YouTube.\n\nIf you’ve built an agent before, you know that context engineering is a crucial part of making an agent return meaningful responses.\n\n**Context engineering** is the process of deciding what, out of all possible context sources, actually goes into the agent’s context window so the LLM can generate the best response. This is also referred to as the process of “context curation”, which is depicted as the arrow from the possible context sources to the context window in the diagram below.\n\nI think we don’t give this context curation arrow enough credit because it does almost all of the heavy lifting. What hides behind it is a set of search tools the agent can decide to use. 
That’s why my personal hot take is that:\n\nContext engineering is about 80% agentic search.\n\n## History\n\nLet’s take a step back and look at how search in the AI stack has evolved over the last three years, from [Retrieval-Augmented Generation (RAG)](../blog/retrieval-augmented-generation-langchain.html), to agentic search/agentic RAG, and now to context engineering. (If you’re interested, I’ve also written about the [evolution from RAG to agent memory](../blog/from-rag-to-agent-memory.html).)\n\n### Retrieval-Augmented Generation\n\nWhen we first started building with LLMs in 2024, we implemented RAG systems with fixed retrieval pipelines:\n\n- The user message is used, more or less verbatim, as a search query (often vector search).\n- Chunks are pulled from a database once.\n- Retrieved chunks are combined with the user message in the prompt and fed to the LLM.\n\nThis design is straightforward and still works for narrow Q&A. But it also breaks in predictable ways because you retrieve exactly *once*:\n\n- You retrieve once even when the model does not need external context, and irrelevant chunks can confuse it.\n- You retrieve once and have no option to correct the query. What if the returned results don’t contain the relevant information and you’d need to run a second search?\n- You retrieve once even when the question needs multi-hop retrieval. For example, the first batch of chunks might only tell you what to search for next, but the pipeline never runs a second pass.\n\n### Agentic RAG\n\nTo overcome these limitations, we replaced the fixed pipeline with a search tool and called it “agentic RAG”, “agentic retrieval”, or “agentic search”. 
In this scenario, the agent decides whether to call a search tool, whether results are relevant, whether to retrieve more, and whether to rewrite the search query.\n\nIn many setups this is straightforward because you often have only one, or at least a limited number of, context sources and one retrieval tool.\n\n### Context engineering\n\nIn context engineering we now have many different context sources:\n\n- Local files (e.g., your repo, scratch pads with `plan.md` or `todo.md` files, or Agent Skills) on disk,\n- Databases (e.g., storing large-scale enterprise data),\n- Web,\n- Long-term memory (I left this ambiguous on purpose because [whether memory lives in files or a database is still debated](filesystem-vs-database-for-agent-memory.qmd))\n\nAnd depending on the context source, we often have a native search tool:\n\n- file search for local files,\n- skill loading for Agent Skills,\n- dedicated database tools (semantic search or query execution) for databases,\n- web search for the web, and\n- memory tools for long-term memory.\n\nIf that’s not overwhelming enough, we now also have a tool that lets the agent run terminal commands. This tool has many different names: [LangChain calls it the “shell tool”](https://reference.langchain.com/python/langchain-community/tools/shell/tool/ShellTool), [Anthropic the “bash tool”](https://platform.claude.com/docs/en/agents-and-tools/tool-use/bash-tool), and [OpenClaw the “exec tool”](https://docs.openclaw.ai/tools/exec). In the following, we will refer to it as the **“shell tool”**.\n\nThe shell tool is a versatile tool because it can interact with most context sources: It can run against local files (`ls`, `grep`), databases (CLIs, scripts, `curl` to HTTPS APIs), and the web (`curl`).\n\nThat’s why there is currently a lot of discussion about whether a shell tool is all an agent needs. 
If you’re interested in a deep dive into this discussion, I’ve written about it in this blog titled [“The shell tool is not a silver bullet for context engineering”](https://www.elastic.co/search-labs/blog/search-tools-context-engineering). The TLDR of that blog and the core topic of this workshop is that the practical question is **which search tools belong in your stack**, not shell tool versus all others.\n\nIf you take away one thing from this workshop, let it be this:\n\nDoing good search is difficult. That is why we have so many different search techniques and why you curate a stack for your latency and quality requirements.\n\nIn this workshop, we take a look at a small selection of search tools to give you an intuition on their strengths and weaknesses.\n\n## Fundamentals of building search tools\n\nBefore we dive into some concrete search tools, let’s review some fundamentals of building effective search tools.\n\n### Challenges of agentic search\n\nIn theory agentic search looks simple:\n\n- The user makes a request\n- The agent calls the right tool with the right parameters\n- The tool returns search results\n- The agent responds with the correct answer.\n\nBut in reality, this can break in many different ways. At Elastic we help teams build agents on Elasticsearch data and these are the three most common failure modes we observe:\n\n- The agent doesn’t call any tool at all and answers from parametric knowledge.\n- The agent calls the wrong tool (e.g., web search instead of the company index).\n- The agent calls the right tool with wrong parameters.\n\nLet’s discuss some general best practices to overcome these challenges, so that when I demonstrate them to you in the next section, you will have an intuition on what to do. 
If you’re interested in more best practices, I’ve also written a guide on [what we’ve learned about building effective database retrieval tools at Elastic](https://www.elastic.co/search-labs/blog/database-retrieval-tools-context-engineering).\n\n### Tool descriptions\n\nI don’t like this slide because I feel like everyone knows that the **tool description is the most important part of any tool**. However, I’m going to quickly give you a refresher on it because almost any time I see a tool description, it’s a least-effort one-liner, and then you’re wondering why your agent is failing to use the tool.\n\nAdmittedly, this template is quite long. I’m not saying that you need to follow this template at all times. Instead, start with a clear core purpose and trigger conditions. And when you notice your agent is struggling to use your tool, start adding more components of this template, such as actions, relationships, limitations, and examples.\n\nIf that is still not enough, repeat the same rules in the system prompt.\n\n### Parameter complexity\n\nAnother aspect to be aware of is parameter complexity. Once the right tool is selected, the agent still has to fill in the arguments, and some are easier to get right than others.\n\nFor example, the parameter for a tool `get_customer_by_id` is easy when the ID is in the message. A semantic search tool that takes in any string as a search query is usually fine, too. But when you start adding more parameters, such as filters or `top_k`, it can become more challenging for the LLM to generate valid parameters. And what if you have a general-purpose search tool that lets your agent write full SQL or [ES|QL](https://www.elastic.co/docs/reference/query-languages/esql) queries from scratch as the parameter? Most models are decent at generating valid SQL queries, but it increases the risk of failures.\n\n## Code walkthrough\n\nLet’s look at a small selection of search tool implementations in practice. 
(I’d love to show you all of them but time is unfortunately limited.) You can find the full code in the [workshop repository](https://github.com/iamleonie/workshop-agentic-search).\n\nThis code walkthrough uses [LangChain](https://docs.langchain.com/) for orchestration because it abstracts a lot of complexity away, so that we can focus on the high-level concepts, and also because it provides a shell tool and other useful things out of the box.\n\nFor the demo, we search over the [AI Engineer Europe 2026 schedule](https://www.ai.engineer/europe/schedule) dataset.\n\nThe first two demos use a local Elasticsearch cluster, which you can spin up using the [start-local](https://www.elastic.co/guide/en/elasticsearch/reference/current/run-elasticsearch-locally.html) script. The third reads the same data from disk. To focus on the core mechanics of the search tools, we will skip over the exact data preparation and ingestion steps, which are available in the notebook [00_prepare_data.ipynb](https://github.com/iamleonie/workshop-agentic-search/blob/main/notebooks/00_prepare_data.ipynb).\n\n### Vanilla agentic search\n\nLet’s first refresh our memory on the minimal vanilla agentic search setup implemented in the first notebook [01_vanilla_agentic_search.ipynb](https://github.com/iamleonie/workshop-agentic-search/blob/main/notebooks/01_vanilla_agentic_search.ipynb). It consists of one semantic search tool over the index `conference_schedule`, which contains one document per session.\n\nFirst, we set up an LLM, which is the core of any agent. 
Here we use `gpt-5.4-nano` through LiteLLM:\n\n``` python\nfrom langchain_openai import ChatOpenAI\n\nllm = ChatOpenAI(\n    openai_api_base=LITELLM_API_BASE,\n    api_key=LITELLM_API_KEY,\n    model=\"llm-gateway/gpt-5.4-nano\",\n    temperature=0.5,\n)\n```\n\nThen we define a minimal system prompt, which tells the agent that it is a search agent, when to retrieve, and how the index is shaped:\n\n```\nYou are a search agent tasked with answering questions about the AI Engineer Europe 2026 Conference.\n\nYou have access to different context retrieval tools to help you answer user queries.\n\nBefore answering a question decide whether or not you need to retrieve additional context to answer the question correctly.\nIf the retrieved context does not contain relevant information to answer the query, say that you don't know. \n\n## Elasticsearch (`conference_schedule`)\n\nThe conference sessions are indexed in Elasticsearch. One document per session.\n\n| Field | Description |\n|--------------|------------|\n| `text` | The string that was embedded: each session’s title plus description (blank line between them). It does not include day, time, room, or speakers. |\n| `vector` | Dense embedding of `text`, used for similarity search. |\n| `metadata` | Structured fields (see metadata description below) |\n\n| Field | Description |\n|--------------|------------|\n| `metadata.title` | Title of the session|\n| `metadata.day` | Date of the session (Example format: April 10) |\n| `metadata.time` | Time slot of the session (Example format: 12:40-1:00pm) |\n| `metadata.room` | Room where the session takes place |\n| `metadata.type` | One of 'keynote', 'workshop', 'talk', 'track_keynote', 'lightning', 'expo_session' |\n| `metadata.track` | Track |\n| `metadata.speakers` | Name(s) of the speaker(s) as a single comma-separated string |\n```\n\nNext, we prepare the Elasticsearch client. 
For this, we set up `jina-embeddings-v5-text-small` as the embedding model, which embeds queries at search time, and connect to our `ElasticsearchStore`:\n\n``` python\nfrom langchain_community.embeddings import JinaEmbeddings\nfrom langchain_elasticsearch import ElasticsearchStore\n\nembeddings = JinaEmbeddings(\n    jina_api_key=JINA_API_KEY,\n    model_name=\"jina-embeddings-v5-text-small\",\n)\n\nvector_store = ElasticsearchStore(\n    es_url=ES_URL,\n    es_user=ES_USERNAME,\n    es_password=ES_PASSWORD,\n    index_name=\"conference_schedule\",\n    embedding=embeddings,\n)\n```\n\nHere comes the interesting part: Let’s define the search tool, which takes in a search query, runs a semantic search, and returns the top 3 results.\n\nLangChain’s `@tool` decorator lets us convert any Python function into an agent tool and uses the function name as the tool name and the docstring as the description by default.\n\nAs you can see below, I already broke [my own recommendation on tool descriptions from above](#tool-descriptions) because I’m using only a short docstring here. 
In this case, however, that’s enough because we only have one tool and a vanilla demo use case.\n\n``` python\nfrom langchain.tools import tool\n\n@tool()\ndef semantic_search_conference_sessions(query: str) -> str:\n    \"\"\"Runs a semantic search query to find conference sessions by concept or topic.\n\n    Args:\n        query: The topic or concept used as search query to search for.\n\n    Returns:\n        A string containing the information about the sessions.\n    \"\"\"\n    docs = vector_store.similarity_search(query, k=3)\n    return \"\\n\\n\".join(\n        f\"**{doc.metadata['type']} by {doc.metadata['speakers']}**: \"\n        f\"{doc.metadata['title']}\\nDescription:{doc.page_content}\"\n        for doc in docs\n    )\n```\n\nNow, we have all the components that make up an AI agent (usually, you’d also have “memory”, but it is omitted here on purpose to focus on the search tools) and all we have to do is to plug them together.\n\n``` python\nfrom langchain.agents import create_agent\n\nagent = create_agent(\n    model=llm,\n    system_prompt=SYSTEM_PROMPT,\n    tools=[semantic_search_conference_sessions],\n)\n```\n\nFor an example semantic search query *“Which sessions discuss regulatory constraints in AI systems?”* the agent now calls the tool and correctly finds [Bilge Yücel’s talk on engineering AI under sovereignty constraints](https://www.youtube.com/watch?v=x2bH0RKPgdc).\n\n```\n================================ Human Message =================================\n\nWhich sessions discuss regulatory constraints in AI systems?\n================================== Ai Message ==================================\nTool Calls:\n  semantic_search_conference_sessions (call_FE570VnExWZOmupAtjduUcIB)\n Call ID: call_FE570VnExWZOmupAtjduUcIB\n  Args:\n    query: regulatory constraints in AI systems law regulation compliance governance EU AI Act policy constraints\n================================= Tool Message =================================\nName: 
semantic_search_conference_sessions\n\n**talk by Bilge Yücel**: Engineering AI Systems Under Sovereignty Constraints\nDescription: Regulatory and jurisdictional constraints are no longer an edge case in AI system design; [...]\n\n**talk by Dan James**: Building the Justice AI Unit: Shipping Production AI Inside Government\nDescription: [...]\n\n**workshop by Liam McGarrigle**: Building Your Own Secure AI Workflows: Human-in-the-Loop Automation with n8n\nDescription: [...]\n================================== Ai Message ==================================\n\nThe conference sessions that discuss regulatory constraints / legal or jurisdictional limitations on AI systems are:\n\n- **Bilge Yücel — “Engineering AI Systems Under Sovereignty Constraints”** (talk)  \n  Focus: how legal/jurisdictional limits affect AI architecture—e.g., **data localization, auditability, model transparency, and jurisdictional control**.\n\n- **Dan James — “Building the Justice AI Unit: Shipping Production AI Inside Government”** (talk)  \n  Focus: building and deploying production AI **inside government**, where regulatory/operational constraints apply.\n\nIf you want, tell me whether you mean **EU AI Act–specific compliance** or **broader regulation/governance**, and I can narrow it further.\n```\n\nThis is where most agentic search demos stop.\n\nThis is also where it starts to get interesting.\n\nTo break this demo that’s built on a semantic search tool, all you have to do is to ask it something where a keyword-based search would be more suitable than a semantic search. 
For example, if I ask *“Which sessions should I visit to learn more about GEPA?”*, the semantic search returns unrelated sessions, but not the correct [one by Samuel Colvin](https://www.youtube.com/watch?v=A48uhxfxbsM).\n\n```\n================================ Human Message =================================\n\nWhich sessions should I visit to learn more about GEPA?\n================================== Ai Message ==================================\nTool Calls:\n  semantic_search_conference_sessions (call_fV9IHwnZVgqM8RPdfyT9pNr8)\n Call ID: call_fV9IHwnZVgqM8RPdfyT9pNr8\n  Args:\n    query: GEPA\n================================= Tool Message =================================\nName: semantic_search_conference_sessions\n\nkeynote by Omar Sanseviero: Gemma, DeepMind's Family of Open Models\nDescription:Gemma, DeepMind's Family of Open Models\n\nGoogle DeepMind’s Gemma family is expanding. Join us for a deep dive into the latest models of the Gemma ecosystem. From vibe fine-tuning to Sovereign AI, you'll learn about the latest model capabilities, how to build high-performance applications, and how to get started with open models.\n\ntrack_keynote by Ryan Lopopolo: Harness Engineering AMA\nDescription:Harness Engineering AMA\n\ntalk by Dan James: Building the Justice AI Unit: Shipping Production AI Inside Government\nDescription:Building the Justice AI Unit: Shipping Production AI Inside Government\n\nBuilding the Justice AI Unit (https://ai.justice.gov.uk/) — forward-deployed engineers, entrepreneurial model, shipping production AI every day, but inside government.\n================================== Ai Message ==================================\n\nI searched the conference schedule for **“GEPA”**, but the only matching items I found were about **DeepMind’s “Gemma”** (an open models family), not GEPA itself. 
So I **don’t know which sessions are specifically about GEPA** based on the available schedule data.\n\nIf you meant **Gemma** (DeepMind), the relevant session is:\n- **Keynote — “Gemma, DeepMind's Family of Open Models”** (Omar Sanseviero)\n\nCan you confirm what **GEPA** stands for (or share a speaker/session name)? Then I can re-search and give you the right sessions.\n```\n\n### Agentic search with general-purpose database search tool\n\nWhat if the agent could fully autonomously write entire search queries from scratch? Let’s replace the narrowly scoped semantic search tool with a more general-purpose tool (`execute_esql_query`) for this, as implemented in the notebook [02_agentic_search_with_db_query_tool.ipynb](https://github.com/iamleonie/workshop-agentic-search/blob/main/notebooks/02_agentic_search_with_db_query_tool.ipynb).\n\nFirst, we switch from `gpt-5.4-nano` to `gpt-5.4-mini` because query generation is harder than passing a topic.\n\nThen, we define a general-purpose database query tool that lets the agent run full ES|QL queries against Elasticsearch.\n\n``` python\n@tool()\ndef execute_esql_query(esql_query: str) -> str:\n    \"\"\"Execute an ES|QL query against the conference_schedule index in Elasticsearch.\n\n    Args:\n        esql_query: The ES|QL query to execute\n\n    Returns:\n        A string containing the information about the sessions.\n    \"\"\"\n    try:\n        # `es_client` is assumed to be an Elasticsearch Python client created earlier in the notebook.\n        response = es_client.esql.query(query=esql_query, format=\"csv\")\n        return response.body\n    except Exception as e:\n        # Return the error text so the agent can self-correct its query.\n        return f\"Error executing ES|QL query: {e}\"\n```\n\nOne thing to note is that we also add a try/except block here that returns some meaningful information if the ES|QL query throws an error, so the agent can self-correct.\n\nNow, when we run the same question *“Which sessions should I visit to learn more about GEPA?”* as earlier, we can see that the agent generates something that looks like valid ES|QL at first glance:\n\n```\nFROM 
conference_schedule\n| WHERE text LIKE '%GEPA%'\n| KEEP metadata.speakers, metadata.title, text\n| LIMIT 3\n```\n\nHowever, this tool call returns zero search results because the pattern `%GEPA%` uses `%` wildcards, which is a SQL habit. ES|QL uses `*` for wildcards instead.\n\n*Note that you should also think about how to handle zero search results. Decide in your product whether zero rows means “nothing found” or “retry with different parameters.”*\n\nThere are many options to overcome this, such as improving the instructions or adding ES|QL-specific agent skills. For this tutorial, we will define a minimal skill. For production, I recommend checking out [Elastic Agent Skills](https://github.com/elastic/agent-skills).\n\n``` python\nfrom typing import TypedDict\n\nclass Skill(TypedDict):\n    name: str\n    description: str\n    content: str\n\nSKILLS: list[Skill] = [\n    {\n        \"name\": \"elasticsearch-esql\",\n        \"description\": \"\"\"Execute ES|QL (Elasticsearch Query Language) queries, use when the user wants to\n  query Elasticsearch data, analyze logs, aggregate metrics, explore data, or create\n  charts and dashboards from ES|QL results.\"\"\",\n        \"content\": \"\"\"\n# Elasticsearch ES|QL\n...\n### Pattern Matching\n- `LIKE` Wildcard pattern (`*` zero or more chars, `?` single char)\n...\nES|QL uses **double quotes** for string literals, never single quotes.\n\"\"\",\n    },\n]\n```\n\nTo use the Agent Skills, we will copy the [Basic implementation for Agent Skills from the LangChain documentation](https://docs.langchain.com/oss/python/langchain/multi-agent/skills#basic-implementation) for a skill loading tool `load_skill` and a `SkillMiddleware`, which injects skill descriptions and registers `load_skill`.\n\n``` python\n@tool\ndef load_skill(skill_name: str) -> str:\n    \"\"\"Load the full content of a skill into the agent's context.\"\"\"\n    for skill in SKILLS:\n        if skill[\"name\"] == skill_name:\n            
return f\"Loaded skill: {skill_name}\\n\\n{skill['content']}\"\n    available = \", \".join(s[\"name\"] for s in SKILLS)\n    return f\"Skill '{skill_name}' not found. Available skills: {available}\"\n```\n\nNow we also have to tell the agent to load the ES|QL skill before calling the `execute_esql_query` tool. For this, we adjust the tool description by adding this relationship:\n\n``` python\n@tool()\ndef execute_esql_query(esql_query: str) -> str:\n    \"\"\"Execute an ES|QL query against an index in Elasticsearch.\n    Always use the Elasticsearch ES|QL skill to generate the ES|QL query\n    before using this tool to execute the query.\n    ...\n    \"\"\"\n```\n\nAnd we reinforce this in the system prompt:\n\n``` python\nSYSTEM_PROMPT_WITH_SKILL = SYSTEM_PROMPT + (\n    \"If you need to execute an ES|QL query, use the Elasticsearch ES|QL skill \"\n    \"to generate the query before using this tool to execute the query. \"\n    \"If an ESQL query returns an error use the Elasticsearch ES|QL skill \"\n    \"to generate a new query.\"\n)\n```\n\nNow we plug in all components, including the `SkillMiddleware`:\n\n``` python\nagent = create_agent(\n    model=llm,\n    system_prompt=SYSTEM_PROMPT_WITH_SKILL,\n    middleware=[SkillMiddleware()],\n    tools=[execute_esql_query],\n)\n```\n\nThe traces now show that the agent first loads the ES|QL skill and uses it to write valid ES|QL queries when calling the `execute_esql_query` tool. 
The results show that this way the agent is able to find the correct session.\n\n```\n================================ Human Message =================================\n\nWhich sessions should I visit to learn more about GEPA?\n================================== Ai Message ==================================\nTool Calls:\n  load_skill (call_pUblqD2S7iLb6hjEaI7xZZJr)\n Call ID: call_pUblqD2S7iLb6hjEaI7xZZJr\n  Args:\n    skill_name: elasticsearch-esql\n================================= Tool Message =================================\nName: load_skill\n\nLoaded skill: elasticsearch-esql\n\n# Elasticsearch ES|QL\n\nExecute ES|QL queries against Elasticsearch.\n\n## What is ES|QL?\n\nES|QL (Elasticsearch Query Language) is a piped query language for Elasticsearch. It is **NOT** the same as SQL.\n\n[...]\n\n================================== Ai Message ==================================\nTool Calls:\n  execute_esql_query (call_rYOHyKLoB9vsAF77NKGJD9vr)\n Call ID: call_rYOHyKLoB9vsAF77NKGJD9vr\n  Args:\n    esql_query: FROM conference_schedule\n| WHERE text LIKE \"*GEPA*\" OR metadata.title LIKE \"*GEPA*\"\n| KEEP metadata.title, metadata.day, metadata.time, metadata.room, metadata.type, metadata.track, metadata.speakers\n| SORT metadata.day, metadata.time\n================================= Tool Message =================================\nName: execute_esql_query\n\nmetadata.title,metadata.day,metadata.time,metadata.room,metadata.type,metadata.track,metadata.speakers\nPlayground in Prod - Optimising Agents in Production Environments,April 8,10:40am-12:00pm,Westminster,workshop,,Samuel Colvin\n================================== Ai Message ==================================\n\nThe session I found related to GEPA is:\n\n- **Playground in Prod - Optimising Agents in Production Environments**  \n  - **When:** April 8, 10:40am–12:00pm  \n  - **Where:** Westminster  \n  - **Type:** Workshop  \n  - **Speaker:** Samuel Colvin\n\nIf you want, I can also look for sessions about 
**agents**, **optimization**, or **production environments** that may be relevant to GEPA even if they don’t mention it by name.\n```\n\nSince this tool is general-purpose, the agent can also use it for **analytical questions**, such as *“How many sessions are on April 8?”* by using `STATS`. This is especially useful because otherwise, the agent would have to filter for all sessions on April 8th, and then count them. And we all know how notoriously bad LLMs are at counting.\n\n```\n================================ Human Message =================================\n\nHow many sessions are on April 8th?\n================================== Ai Message ==================================\nTool Calls:\n  load_skill (call_5Ki25LYVnkx245nEDSM3xwwz)\n Call ID: call_5Ki25LYVnkx245nEDSM3xwwz\n  Args:\n    skill_name: elasticsearch-esql\n================================= Tool Message =================================\nName: load_skill\n\nLoaded skill: elasticsearch-esql\n\n# Elasticsearch ES|QL\n\nExecute ES|QL queries against Elasticsearch.\n\n## What is ES|QL?\n\nES|QL (Elasticsearch Query Language) is a piped query language for Elasticsearch. It is **NOT** the same as SQL.\n\n[...]\n\n================================== Ai Message ==================================\nTool Calls:\n  execute_esql_query (call_0TSZEVO7uRGKGj2sfOE2VcEV)\n Call ID: call_0TSZEVO7uRGKGj2sfOE2VcEV\n  Args:\n    esql_query: FROM conference_schedule | WHERE metadata.day == \"April 8\" | STATS count = COUNT()\n================================= Tool Message =================================\nName: execute_esql_query\n\ncount\n27\n================================== Ai Message ==================================\n\nThere are **27 sessions** on **April 8th**.\n```\n\nAs you can see, the general-purpose `execute_esql_query` is more powerful than the narrowly scoped `semantic_search_conference_sessions` tool because it can handle more ambiguous or complex queries. 
At the same time, adding a skill loading tool adds cost and latency, and it requires a more powerful LLM.\n\n### Agentic search with shell tool\n\nFinally, let’s talk about the hottest topic in agentic search at the moment: “Bash + Filesystem is all you need”.\n\nYou can find the related implementation in [03_agentic_search_with_shell_tool.ipynb](https://github.com/iamleonie/workshop-agentic-search/blob/main/notebooks/03_agentic_search_with_shell_tool.ipynb). For this, we move the same data from the Elasticsearch cluster to local files under `../data/session_data/` (one `.txt` per session) and let the agent search over them with LangChain’s `ShellTool`.\n\nFirst, we instantiate the shell tool:\n\n``` python\nfrom langchain_community.tools import ShellTool\n\nshell_tool = ShellTool()  # no safeguards by default; sandbox in production\n```\n\n*Note that the shell tool doesn’t have any safeguards by default, so it is recommended to use it only in a sandboxed environment.*\n\nThe system prompt for this example is similar to the ones described previously, except that we now describe the data in the filesystem instead of the Elasticsearch cluster:\n\n````\nYou are a search agent tasked with ...\n\n## Local filesystem (`session_data`)\n\nThe conference sessions are available under `../data/session_data/`. One file per session.\n\nFile structure:\n```\n../\ndata/\n └── session_data/\n  ├── workshop/\n  │ ├── <title>.txt\n  │ └── ... 
\n  ├── ...\n  └── expo_session/\n```\n\nFile Sample:\n```\n# <Title>\n\n- **Day:** <Date of the session>\n- **Time:** <Time slot of the session>\n- **Room:** <Room where the session takes place>\n- **Type:** <One of 'keynote', 'workshop', 'talk', 'track_keynote', 'lightning', 'expo_session'>\n- **Speakers:** <Name(s) of the speaker(s)>\n\n<Description>\n```\n````\n\nFor *“Are there any sessions about GEPA?”* the agent now first explores the filesystem and then simply `grep`s for “GEPA” and finds the right session.\n\n```\n================================ Human Message =================================\n\nAre there any sessions about GEPA?\n================================== Ai Message ==================================\nTool Calls:\n  terminal (call_w0Yi8vNRxFy8EgbJsvBOqSJI)\n Call ID: call_w0Yi8vNRxFy8EgbJsvBOqSJI\n  Args:\n    commands: ['ls -R ../data/session_data | head', 'grep -Ril \"GEPA\" ../data/session_data | head -n 50']\nExecuting command:\n ['ls -R ../data/session_data | head', 'grep -Ril \"GEPA\" ../data/session_data | head -n 50']\n================================= Tool Message =================================\nName: terminal\n\nexpo_session\nkeynote\nlightning\ntalk\ntrack_keynote\nworkshop\n\n../data/session_data/expo_session:\nComprehend First, Code Later The AI Skill I Rely On Daily.txt\nExpo Session – AMA AIE Europe with swyx.txt\n../data/session_data/workshop/Playground in Prod - Optimising Agents in Production Environments.txt\n================================== Ai Message ==================================\nTool Calls:\n  terminal (call_s1QbL16Xg8p1RRxNhciTys9F)\n Call ID: call_s1QbL16Xg8p1RRxNhciTys9F\n  Args:\n    commands: ['grep -Ril \"GEPA\" ../data/session_data']\nExecuting command:\n ['grep -Ril \"GEPA\" ../data/session_data']\n================================= Tool Message =================================\nName: terminal\n\n../data/session_data/workshop/Playground in Prod - Optimising Agents in Production 
Environments.txt\n================================== Ai Message ==================================\nTool Calls:\n  terminal (call_WpzxO7kadUuXmZLpfUvRt66E)\n Call ID: call_WpzxO7kadUuXmZLpfUvRt66E\n  Args:\n    commands: ['cat \"../data/session_data/workshop/Playground in Prod - Optimising Agents in Production Environments.txt\" | sed -n \\'1,200p\\'']\nExecuting command:\n ['cat \"../data/session_data/workshop/Playground in Prod - Optimising Agents in Production Environments.txt\" | sed -n \\'1,200p\\'']\n================================= Tool Message =================================\nName: terminal\n\n# Playground in Prod - Optimising Agents in Production Environments\n\n[...]\n\n**Description:** Deploying an agent is just the beginning. [...]\n================================== Ai Message ==================================\n\nYes—there is at least one session that mentions **GEPA**:\n\n- **Playground in Prod - Optimising Agents in Production Environments** (workshop)  \n  - **Day/Time:** April 8, 10:40am–12:00pm  \n  - **Room:** Westminster  \n  - **Speaker:** Samuel Colvin  \n  - **Where GEPA appears:** The session includes **“Autonomous optimisation with GEPA”** and explains how GEPA can be wired into Logfire’s managed variables to create an optimisation loop that improves agent prompts based on production traces.\n```\n\nBut now let’s try it with the semantic search query *“Which sessions discuss handling regulatory constraints?”* from earlier to see **how agents can cheat at semantic search with grep**. You can see that the agent starts by looking for “regulat”, which I guess is fair because that way it can find terms like “regulatory” and “regulation”. 
But then it starts chaining together related synonyms (`compliance`, `constraints`, `GDPR`, `governance`) until something hits.\n\n```\n================================ Human Message =================================\n\nWhich sessions discuss handling regulatory constraints?\n================================== Ai Message ==================================\nTool Calls:\n  terminal (call_b57oFP6PhsfwyUEepsuXQvCV)\n Call ID: call_b57oFP6PhsfwyUEepsuXQvCV\n  Args:\n    commands: ['ls -R ../data/session_data | head', 'find ../data/session_data -type f | wc -l', 'grep -Ril \"regulat\" ../data/session_data | head -n 20', 'grep -Ril \"compliance\\\\|constraint\\\\|constraints\\\\|GDPR\\\\|governance\" ../data/session_data | head -n 20']\nExecuting command:\n ['ls -R ../data/session_data | head', 'find ../data/session_data -type f | wc -l', 'grep -Ril \"regulat\" ../data/session_data | head -n 20', 'grep -Ril \"compliance\\\\|constraint\\\\|constraints\\\\|GDPR\\\\|governance\" ../data/session_data | head -n 20']\n================================= Tool Message =================================\nName: terminal\n\nexpo_session\nkeynote\nlightning\ntalk\ntrack_keynote\nworkshop\n\n../data/session_data/expo_session:\nComprehend First, Code Later The AI Skill I Rely On Daily.txt\nExpo Session – AMA AIE Europe with swyx.txt\n     167\n../data/session_data/talk/Engineering AI Systems Under Sovereignty Constraints.txt\n../data/session_data/talk/Connecting the Dots with Context Graphs.txt\n../data/session_data/talk/Lessons from Scaling GitHub's Remote MCP Server.txt\n../data/session_data/talk/Scaling the Next Paradigm of Heterogeneous Intelligence.txt\n../data/session_data/talk/Engineering AI Systems Under Sovereignty Constraints.txt\n================================== Ai Message ==================================\nTool Calls:\n  terminal (call_SqWPOrRycbG0wwkv2ryWeIYZ)\n Call ID: call_SqWPOrRycbG0wwkv2ryWeIYZ\n  Args:\n    commands: ['sed -n \\'1,200p\\' 
\"../data/session_data/talk/Engineering AI Systems Under Sovereignty Constraints.txt\"', 'sed -n \\'1,160p\\' \"../data/session_data/keynote/Gemma, DeepMind\\'s Family of Open Models.txt\"', 'sed -n \\'1,160p\\' \"../data/session_data/talk/Connecting the Dots with Context Graphs.txt\"']\nExecuting command:\n ['sed -n \\'1,200p\\' \"../data/session_data/talk/Engineering AI Systems Under Sovereignty Constraints.txt\"', 'sed -n \\'1,160p\\' \"../data/session_data/keynote/Gemma, DeepMind\\'s Family of Open Models.txt\"', 'sed -n \\'1,160p\\' \"../data/session_data/talk/Connecting the Dots with Context Graphs.txt\"']\n================================= Tool Message =================================\nName: terminal\n\n# Engineering AI Systems Under Sovereignty Constraints\n[...]\n**Description:** Regulatory and jurisdictional constraints are no longer an edge case in AI system design; [...]\n\n# Gemma, DeepMind's Family of Open Models\n[...]\n**Description:** Google DeepMind’s Gemma family is expanding. [...]\n\n# Connecting the Dots with Context Graphs\n[...]\n**Description:** AI systems need more than intelligence; they need context that persists. [...]\n================================== Ai Message ==================================\n\nThe sessions that discuss handling **regulatory constraints** (including sovereignty/jurisdiction/legal boundaries) are:\n\n- **“Engineering AI Systems Under Sovereignty Constraints”** (Talk) — April 9, 12:40–1:00pm, Room St. James  \n  *Focus:* how regulatory/jurisdictional constraints affect AI architecture, including data localization, auditability, transparency, and governance by design.\n\nIf you want, I can also list sessions that mention related topics like *compliance*, *GDPR*, or *governance* specifically (the dataset only clearly surfaced the one above under regulatory/constraint language).\n```\n\nThis works almost unreasonably well, but is the most effective? 
For example, if I asked it to find “movies with animal superheroes,” the agent would have to chain together every animal it can think of.\n\nThat’s why we’re currently seeing many semantic search alternatives to `grep`: LlamaIndex’s [`semtools`](https://github.com/run-llama/semtools), LightOn’s [`colgrep`](https://github.com/lightonai/next-plaid/tree/main/colgrep), and Jina AI’s [`jina-grep-cli`](https://github.com/jina-ai/jina-grep-cli).\n\nLet’s give our agent access to `jina-grep-cli` and see how that works. First, you have to install the CLI according to the instructions in the [repository](https://github.com/jina-ai/jina-grep-cli).\n\nThen, all you have to do is tell your agent about it, and how to use it, in the system prompt:\n\n```\nYou are a search agent tasked with ...\n\n## Local filesystem (`session_data`)\n\nThe conference sessions are available under ...\n\n## Context retrieval tools\n\nYou have access to the following tools to find additional information:\n\n- **Terminal (bash / shell):** run shell commands on the host.\n\n### jina-grep: standalone semantic search\n\n[`jina-grep`](https://github.com/jina-ai/jina-grep-cli) is a CLI for semantic search over files. \n\nForm: \n````\njina-grep [OPTIONS] \"<query>\" [FILES]...\n````\n\n**Grep-compatible (standalone):**\n\n- `-r`, `-R`: recursive search when the path is a directory  \n- `-l`: print only filenames with matches  \n... \n\n**Semantic:**\n\n- `--threshold`: cosine similarity threshold (default: `0.5`)  \n- `--top-k`: max results (default: `10`)  \n...\n\n**Examples** \n\n```` bash\njina-grep -r --top-k 5 \"memory leak\" /data/session_data\njina-grep -r --threshold 0.35 \"observability and evals\" /data/session_data/talk\njina-grep -r \"context engineering for agents\" /data/session_data/\n````\n\n### grep vs jina-grep\n\n- **Exact substring, known filename, or simple listing?** Use **`grep` / `find` / `cat`**.  
\n- **Natural-language or fuzzy “what talks mention X?” over many `.txt` files locally?** Use **`jina-grep`** as above. Only run one jina-grep command at a time. Do NOT chain multiple jina-grep-cli commands together when calling the shell tool.\n```\n\nFor *“Which sessions discuss handling regulatory constraints?”* with this prompt loaded, the agent is now able to find the right session more efficiently:\n\n```\n================================ Human Message =================================\n\nWhich sessions discuss handling regulatory constraints?\n================================== Ai Message ==================================\nTool Calls:\n  terminal (call_6OrqroK96CcV0dzW9I8RqVX6)\n Call ID: call_6OrqroK96CcV0dzW9I8RqVX6\n  Args:\n    commands: ['ls -R ../data/session_data | head', 'jina-grep -r --top-k 10 \"regulatory constraints\" ../data/session_data']\nExecuting command:\n ['ls -R ../data/session_data | head', 'jina-grep -r --top-k 10 \"regulatory constraints\" ../data/session_data']\n================================= Tool Message =================================\nName: terminal\n\nexpo_session\nkeynote\nlightning\ntalk\ntrack_keynote\nworkshop\n\n../data/session_data/expo_session:\nComprehend First, Code Later The AI Skill I Rely On Daily.txt\nExpo Session – AMA AIE Europe with swyx.txt\nLoading model... done\n../data/session_data/talk/Engineering AI Systems Under Sovereignty Constraints.txt:1:# Engineering AI Systems Under Sovereignty Constraints ... [0.374]\n../data/session_data/talk/How to build a personal computer use agent and where the limits are.txt:1:# How to build a personal computer use agent and where the limits are ... [0.227]\n...\n../data/session_data/talk/Bringing MCPs to the Enterprise.txt:1:# Bringing MCPs to the Enterprise... 
[0.185]\n\n================================== Ai Message ==================================\n\nThe conference sessions that discuss handling **regulatory / jurisdictional constraints** include:\n\n- **“Engineering AI Systems Under Sovereignty Constraints”** (Talk) — *Apr 9, 12:40–1:00pm, St. James* — Speaker: **Bilge Yücel**  \n  *Focus:* how legal/jurisdictional requirements like data localization, auditability, and jurisdictional control translate into AI system architecture decisions.\n\nIf you meant a broader interpretation (e.g., “compliance”, “privacy”, “GDPR”, “risk”) tell me and I can search for those terms too.\n```\n\n## Practical recommendations\n\nIn this workshop, we looked at the broad landscape of search tools for context engineering and then explored a small selection of them to understand their strengths and weaknesses. The goal of this workshop was not to advocate for one best search tool. Instead, I want to highlight the importance of curating a small set of tools that matches how your agent actually behaves.\n\nA useful framing is “low floor, high ceiling”:\n\n- Specialized tools (e.g., semantic search with a topic string, lookup-by-ID) have simple parameters, fail less often, and cost fewer tokens.\n- General-purpose tools (shell, raw ES|QL) handle edge cases but may need more iterations and a stronger model.\n\nIdeally, you want both: easy wins for common queries and an escape hatch for the long tail.\n\nIf you already know your agent’s query patterns and use cases, you can scope out specialized tools directly.\n\nIf you do not know the query patterns yet, I recommend starting with a general-purpose tool. Then log tool calls and errors, and add specialized tools when you see repetition. Four or five tool calls for a simple question often mean the current tool is too hard for the model. 
That is how I tend to evolve stacks: ship, read traces, narrow with purpose-built tools instead of guessing APIs on day one.", "url": "https://wpnews.pro/news/agentic-search-for-context-engineering", "canonical_source": "https://leoniemonigatti.com/blog/agentic-search-for-context-engineering.html", "published_at": "2026-06-06 05:45:08+00:00", "updated_at": "2026-06-06 06:16:50.196247+00:00", "lang": "en", "topics": ["ai-agents", "large-language-models", "natural-language-processing", "ai-research", "ai-tools"], "entities": ["AI Engineer Europe", "Leonie", "LangChain", "YouTube", "GitHub"], "alternates": {"html": "https://wpnews.pro/news/agentic-search-for-context-engineering", "markdown": "https://wpnews.pro/news/agentic-search-for-context-engineering.md", "text": "https://wpnews.pro/news/agentic-search-for-context-engineering.txt", "jsonld": "https://wpnews.pro/news/agentic-search-for-context-engineering.jsonld"}}