{"slug": "reliable-knowledge-extraction-for-ai-systems", "title": "Reliable Knowledge Extraction for AI Systems", "summary": "Knowledge graphs are emerging as a solution to retrieval failures in AI systems, where models cannot answer questions despite relevant documents existing. By structuring knowledge as entities and relationships, GraphRAG architectures improve fact-based retrieval over traditional vector search, addressing a gap that larger language models alone cannot close.", "body_md": "Have you run into a case where the model cannot answer a question while the answer is plainly somewhere in the documents, even with a retrieval pipeline connected? Or where an agent returns passages that read as relevant yet never mention the one term the customer actually asked about?\n\nBoth are symptoms of the same gap: retrieval that returns text which *looks* like the question without returning the facts that answer it. A larger or newer language model does not close that gap on its own. The limitation sits earlier in the pipeline, in how knowledge is stored and searched, well before the model gets a chance to summarize whatever it was handed.\n\n**Knowledge graphs** have been gaining traction as one way to address this. They can serve as a construction and retrieval layer for a RAG system and, increasingly, as the persistent memory an agent reasons over between steps. This post looks at how they fit together with large language models, and why it is worth understanding the underlying design rather than any single library that implements it.\n\nThe article is organized as follows. It opens with the case for learning patterns rather than chasing frameworks, a shift in perspective that matters given how quickly AI tooling changes. It then defines what a knowledge graph is. In later sections, it moves from ordinary vector RAG to GraphRAG, using a small worked example to show where the difference bites. 
The main part walks through a general GraphRAG architecture and five parts that show up across many systems. It closes with the trade-offs the architecture carries and a note on a library built to put these patterns into practice.\n\nThis text is the first in a longer series that goes into more depth on each component. The code for the library and the examples mentioned here are available in the [GraWiki repository on GitHub](https://github.com/maddataanalyst/grawiki) and in the [documentation](https://grawiki.readthedocs.io/).\n\nIf you have tried to keep up with the AI tooling space lately, you already know the problem. Every few weeks brings a new framework, a new orchestration layer, a new “agentic” wrapper around something that existed last quarter under a different name. By the time you have read the documentation, half of it is deprecated.\n\nI have stopped trying to memorize all the tools. What survives the churn is the set of underlying ideas: how retrieval works, how context gets assembled, how a system decides what to feed a model. Most of the new frameworks are recombinations of a number of recurring patterns. Once you can name those patterns, a new library stops being a thing to learn from scratch and becomes a variation on something you already understand, with some minor syntax quirks.\n\nThis is the approach Chip Huyen takes in *AI Engineering* (Huyen 2024), where the focus is on the general shape of AI systems rather than on any single vendor’s stack. I want to do the same for one specific corner of the field that I know best: knowledge graphs and how they combine with large language models.\n\nThis post stays at the architectural level, answering three questions:\n\nLater posts will go into the parts that deserve their own treatment: entity extraction, deduplication, retrieval, and reasoning over the graph.\n\nThe term is older than the current wave of interest, and it has never had one tight definition. 
Surveys and textbooks each phrase it slightly differently (Hogan et al. 2021; Negro et al. 2025; Barrasa and Webber 2023; Bratanic and Hane 2025), but the common ground is clear enough. A knowledge graph is a graph built to represent real-world objects and the relationships between them, in a form that both people and machines can interpret and reason over. Vertices stand for entities — people, places, products, diseases — and edges stand for the relations that connect them.\n\nTwo things are worth pulling out of those definitions. First, **a knowledge graph is not tied to any particular technology, query language, or storage engine**. Second, what makes it a *knowledge* graph rather than a plain data graph is that it stays semantically and structurally coherent, so that the relationships actually support inference.\n\nFollowing Hogan et al. (2021), three elements show up in most knowledge graphs, each with many possible variants:\n\nThere is no single way to create such a structure. At one end sit formal ontologies expressed in RDF and OWL, with high expressiveness and a matching cost in effort. At the other end sit labeled property graphs (LPG), which trade some formal rigor for flexibility and a lower barrier to entry. That is one of the reasons LPGs are popular in many practical applications, especially commercial ones. Which one fits depends on the domain and on how much formalism you can afford. I will leave that comparison for a later post, since it deserves more than a paragraph.\n\nA generated infographic below summarizes the concept.\n\nRetrieval-augmented generation gives a language model access to knowledge it was not trained on. In its most common form, the corpus is split into chunks, each chunk is embedded as a vector, and retrieval returns the chunks whose embeddings are closest to the query. This works, and for many question-answering tasks it works well.\n\nIt runs into trouble on a specific class of questions. 
When an answer has to combine facts that live in different parts of the corpus, or when the question is about the corpus as a whole rather than any single passage, similarity between independent chunks is not enough (Gao et al. 2024; Peng et al. 2024; Han et al. 2024). Embedding similarity finds passages that *look* like the query. It does not follow the connections between the things those passages describe.\n\nThe GraphRAG paradigm changes what gets retrieved. Instead of independent text fragments, the system retrieves connected elements of a knowledge graph — entities, triples, reasoning paths, whole subgraphs — that carry relational structure a similarity score cannot. Sanmartin captures the shift with the phrase “search things, not strings” (Sanmartin 2024): you are looking for the objects that matter because of how they connect, not only the text that mentions them. Using structured relational knowledge this way is one of the main directions in the literature on combining knowledge graphs with language models (Pan et al. 2023; Pan et al. 2024; Linders and Tomczak 2025).\n\nA small example makes the difference concrete. Suppose a handful of documents about the history of AI have been turned into the graph in Figure 1: entities such as\n\njoined by relations like *proposed*, *used to evaluate*, and *coined the term*. Now ask: *“Who proposed the test used to evaluate artificial intelligence?”* The phrase “Alan Turing” need not appear anywhere near the words “evaluate artificial intelligence,” so a chunk ranked by embedding similarity may never surface him. The graph answers the question by following two edges: from *Artificial Intelligence* back along *used to evaluate* to the *Turing Test*, and from there back along *proposed* to *Alan Turing*. 
The answer is a node you reach by following edges, which embedding similarity alone would likely have skipped past.\n\nThe two approaches are not rivals so much as different points on a spectrum, and most real systems borrow from both. The table below summarizes where they differ.\n\nIn the usual three-stage account of RAG — naive, advanced, modular (Gao et al. 2024) — GraphRAG belongs to the modular family, with swappable retrieval components and room for iterative or adaptive search.\n\nImplementations vary, but survey work finds the same set of parts across many GraphRAG systems (Peng et al. 2024; Han et al. 2024). That makes it possible to describe one general architecture and treat specific products as instances of it.\n\nA GraphRAG system uses a knowledge graph as the external knowledge source. A compact way to write it is [1]:\n\nHere, *V* is the set of vertices, *E* is the set of edges, Φ: V → Tv maps each vertex in *V* to a vertex type in Tv, such as document, text fragment, entity, claim, or community, and ψ: E → Tr maps each edge in *E* to a relation type in Tr, such as *mentions*, *supports*, *belongs to*, or *connects to*. The schema defines the allowed vertex and edge types, together with their properties.\n\nThe system operates in two phases, as in ordinary RAG:\n\nEach of these phases can be further divided into components shown in Figure 2.\n\nSome frameworks (such as LlamaIndex) include modules that map directly onto these conceptual phases — the Retriever, the ResponseSynthesizer, and so on.\n\nThe first component builds the graph that everything else draws on. In practice there are four families of methods (Peng et al. 2024; Han et al. 
2024), shown in Figure 3: manual construction by domain experts, which gives high quality at a high cost; rule-based extraction with parsers and language rules, which is predictable but limited to the patterns the rules cover; machine-learning methods that classify entities and relations, more flexible but dependent on training data; and LLM-based extraction, where the model reads a text fragment and returns its entities, their descriptions, and the relations between them.\n\nExtraction usually runs per chunk, on the same fragments produced by the document splitting that RAG already relies on. From each chunk the system pulls entities, relations, and sometimes **claims** — statements that pin an entity to a piece of context. Claims come from Microsoft’s GraphRAG design, where they are a distinct element used downstream (Edge et al. 2025).\n\nThe partial graphs from individual chunks then have to be stitched into one. The seams are shared entities: when a chunk from one document and a chunk from another both mention the same entity, that entity joins their partial graphs into a single structure (Figure 4). Keeping the links between a document, its chunks, and the extracted elements is what lets the system trace any fact back to its source later.\n\nOnce the graph exists, it has to be organized for fast retrieval. Three connected concerns can be found here.\n\nThe first is **entity resolution** (also called deduplication or entity matching). Chunks extracted separately often produce several nodes for the same real object, written differently each time. Resolution merges them into one. Two broad families of technique apply (Barlaug and Gulla 2021; Christen 2012):\n\nHybrid and LLM-assisted variants exist too, and I will come back to them in a dedicated post.\n\nThe second is **hierarchical indexing**. The graph can be partitioned into **communities** — groups of densely connected nodes that correspond to coherent topics (Peng et al. 2024; Edge et al. 2025). 
The Leiden algorithm is a common choice, building on Louvain with better-connected groups (Hairol Anuar et al. 2021). Because these algorithms are hierarchical, you get communities at several levels of abstraction. A language model then writes a **summary** of each community, stored as its own part of the graph. Those summaries are what let the system answer broad questions about whole topic areas without first reassembling scattered facts.\n\nThe third is **persistence** — how the graph and its supporting structures are stored (Peng et al. 2024; Han et al. 2024). Three index forms tend to coexist: graph indexes for the topology of nodes and edges, usually in a graph database; text indexes for full-text search over entity descriptions, relations, and community summaries; and vector indexes for semantic search over embeddings of nodes, edges, and chunks. Combining them lets retrieval use both graph structure and semantic similarity. Figure 5 shows how the pieces typically relate. Most graph databases — Neo4j, FalkorDB, Memgraph — can hold all three at once.\n\nOn the query side, the first component bridges a natural-language question and a structured graph. Its job is to turn the question into something the graph can be searched with. Depending on the system, that can involve several operations (Peng et al. 2024; Han et al. 2024), laid out in Figure 6:\n\nNot all of these happen every time. A simple system might only recognize entities and pick seed nodes; a more elaborate one runs the full decomposition and structuring. How far you go depends on the questions you expect and the retrieval method you pair it with.\n\nThe fourth component, sometimes called G-Retrieval, pulls the elements of the graph most relevant to the query. It selects a subgraph or path that maximizes a relevance function against the query. 
Formally it can be represented as in [2]:\n\nHow that relevance function is defined, and how the space of subgraphs is searched, is where methods diverge (Peng et al. 2024; Han et al. 2024). They vary along a few axes: the search paradigm (one-shot, iterative, or multi-stage), the search method (graph heuristics like breadth- and depth-first traversal, or learned scoring with graph neural networks), and the granularity of the result (single entities, triples, full relational paths, or subgraphs). This component decides how much of the graph’s knowledge actually reaches the model, which is why it carries so much weight. It is also large enough to warrant its own post, so I will only flag it here.\n\nThe last component, G-Generation, turns retrieved graph elements into something a language model can use, then generates the answer. Preparing the retrieved elements mainly involves two steps (Peng et al. 2024; Han et al. 2024): pruning, which drops nodes and relations that add noise so the context stays within bounds, and verbalization, which converts graph structures into natural-language text, for instance by turning triples into sentences with templates or a model.\n\nFor corpus-wide questions there is an extra pattern: map-reduce global sensemaking (Edge et al. 2025), introduced by Microsoft GraphRAG. In the map step, partial answers are generated from individual community summaries; in the reduce step, they are merged into one. As with retrieval, the details belong in the post on retrieval and reasoning.\n\nThe five components (graph construction, graph organization, query processing, G-Retrieval, and G-Generation) give us a useful way to reason about GraphRAG systems. Each has several viable implementations, and a given system mixes and matches them to fit its data and constraints. That modularity is the reason it is worth learning the architecture instead of a single tool built on it.\n\nIt does not come for free, though. 
Building the graph costs more than embedding chunks, entity resolution is genuinely hard and rarely perfect, and the extra moving parts add latency and operational weight. GraphRAG earns its keep on multi-hop and corpus-wide questions; for straightforward lookup, plain vector RAG is often the saner choice. The point of the architecture is to know which parts you actually need.\n\nThe next posts in the series take the components one at a time, starting with extraction and deduplication, then moving to retrieval and reasoning over the graph.\n\nThe infographic below summarizes the architecture described in this post.\n\nI have a soft spot for Feynman’s line, “What I cannot create, I do not understand.” After spending a long time reading about knowledge graphs and trying the established tools, I kept running into the same thing: each well-known framework covers most of the architecture above but leaves out one or two of its parts. For example, LlamaIndex has no real deduplication step — entities that should merge stay split. The gaps are rarely in the same place twice, which is exactly what makes it hard to learn the whole pattern from any one of them.\n\nSo I started building my own, mostly to force myself to understand every component by writing it, and then to fill in the parts I kept missing elsewhere: [GraWiki](https://grawiki.readthedocs.io/) ([source on GitHub](https://github.com/maddataanalyst/grawiki)). It is early-stage and very much a work in progress, so treat this as a pointer rather than a recommendation: somewhere to see the components above wired together in code rather than described in prose. It will also serve as illustrative material for my posts, articles, and parts of an upcoming book on knowledge graphs and LLMs. The library is designed to be backend-agnostic, so you can swap out the graph database or vector store for whatever fits your needs.\n\nThe design follows the architecture in this post fairly directly. 
Document processing handles loading and chunking; an extraction layer uses an LLM to pull entities and relations; storage runs through a Cypher engine over FalkorDB or Memgraph; retrieval combines text and vector search with graph expansion around matched entities; and a similarity module does the deduplication that sent me down this road in the first place. The same graph also serves as persistent memory for an LLM agent, which is the other use case the project is exploring. If any of the components here are something you would rather take apart and rebuild yourself, it may be a useful starting point.\n\nI hope you found this article interesting. If so, please leave a clap ❤. Stay tuned for the next parts!\n\nBest regards,\n\n*Filip W.*\n\n**References**\n\nBarlaug, Nils, and Jon Atle Gulla. 2021. “Neural Networks for Entity Matching: A Survey.” *ACM Transactions on Knowledge Discovery from Data (TKDD)* 15 (3): 1–37.\n\nBarrasa, J., and J. Webber. 2023. *Building Knowledge Graphs*. O’Reilly Media. [https://books.google.pl/books?id=6sTGEAAAQBAJ](https://books.google.pl/books?id=6sTGEAAAQBAJ).\n\nBratanic, T., and O. Hane. 2025. *Essential GraphRAG: Knowledge Graph-Enhanced RAG*. Manning. [https://books.google.pl/books?id=pTtyEQAAQBAJ](https://books.google.pl/books?id=pTtyEQAAQBAJ).\n\nChristen, Peter. 2012. “The Data Matching Process.” In *Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection*. Springer.\n\nEdge, Darren, Ha Trinh, Newman Cheng, et al. 2025. *From Local to Global: A Graph RAG Approach to Query-Focused Summarization*. [https://arxiv.org/abs/2404.16130](https://arxiv.org/abs/2404.16130).\n\nGao, Yunfan, Yun Xiong, Xinyu Gao, et al. 2024. *Retrieval-Augmented Generation for Large Language Models: A Survey*. [https://arxiv.org/abs/2312.10997](https://arxiv.org/abs/2312.10997).\n\nHairol Anuar, Siti Haryanti, Zuraida Abal Abas, Norhazwani Mohd Yunos, et al. 2021. 
“Comparison Between Louvain and Leiden Algorithm for Network Structure: A Review.” *Journal of Physics: Conference Series* 2129: 012028.\n\nHan, Haoyu, Yu Wang, Harry Shomer, et al. 2024. “Retrieval-Augmented Generation with Graphs (GraphRAG).” *ArXiv* abs/2501.00309. [https://doi.org/10.48550/arxiv.2501.00309](https://doi.org/10.48550/arxiv.2501.00309).\n\nHogan, Aidan, Eva Blomqvist, Michael Cochez, et al. 2021. *Knowledge Graphs*. Synthesis Lectures on Data, Semantics, and Knowledge 22. Springer. [https://doi.org/10.2200/S01125ED1V01Y202109DSK022](https://doi.org/10.2200/S01125ED1V01Y202109DSK022).\n\nHuyen, C. 2024. *AI Engineering*. O’Reilly Media.\n\nLinders, Jasper, and Jakub M Tomczak. 2025. “Knowledge Graph-Extended Retrieval Augmented Generation for Question Answering.” *Applied Intelligence* 55 (17): 1102.\n\nNegro, Alessandro, Vlastimil Kus, Giuseppe Futia, and Fabio Montagna. 2025. *Knowledge Graphs and LLMs in Action*. Simon and Schuster.\n\nPan, Jeff Z., Simon Razniewski, Jan-Christoph Kalo, et al. 2023. *Large Language Models and Knowledge Graphs: Opportunities and Challenges*. [https://arxiv.org/abs/2308.06374](https://arxiv.org/abs/2308.06374).\n\nPan, Shirui, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, and Xindong Wu. 2024. “Unifying Large Language Models and Knowledge Graphs: A Roadmap.” *IEEE Transactions on Knowledge and Data Engineering* 36 (7): 3580–99. [https://doi.org/10.1109/TKDE.2024.3352100](https://doi.org/10.1109/TKDE.2024.3352100).\n\nPeng, Boci, Yun Zhu, Yongchao Liu, et al. 2024. “Graph Retrieval-Augmented Generation: A Survey.” *ACM Transactions on Information Systems* 44: 1–52. [https://doi.org/10.1145/3777378](https://doi.org/10.1145/3777378).\n\nSanmartin, Diego. 2024. 
“KG-RAG: Bridging the Gap Between Knowledge and Creativity.” *arXiv Preprint arXiv:2405.12035*.\n\n[Reliable Knowledge Extraction for AI Systems](https://pub.towardsai.net/reliable-knowledge-extraction-for-ai-systems-57abcba16a3a) was originally published in [Towards AI](https://pub.towardsai.net) on Medium, where people are continuing the conversation by highlighting and responding to this story.", "url": "https://wpnews.pro/news/reliable-knowledge-extraction-for-ai-systems", "canonical_source": "https://pub.towardsai.net/reliable-knowledge-extraction-for-ai-systems-57abcba16a3a?source=rss----98111c9905da---4", "published_at": "2026-06-21 14:01:03+00:00", "updated_at": "2026-06-21 14:38:59.959643+00:00", "lang": "en", "topics": ["large-language-models", "ai-research", "ai-infrastructure"], "entities": ["Chip Huyen", "GraWiki", "GitHub", "Read the Docs", "Hogan et al.", "Negro et al.", "Barrasa and Webber", "Bratanic and Hane"], "alternates": {"html": "https://wpnews.pro/news/reliable-knowledge-extraction-for-ai-systems", "markdown": "https://wpnews.pro/news/reliable-knowledge-extraction-for-ai-systems.md", "text": "https://wpnews.pro/news/reliable-knowledge-extraction-for-ai-systems.txt", "jsonld": "https://wpnews.pro/news/reliable-knowledge-extraction-for-ai-systems.jsonld"}}