{"slug": "the-model-is-swappable-the-ontology-compounds", "title": "The model is swappable the ontology compounds", "summary": "Databricks' Genie Ontology shifts enterprise AI focus from models to governed context layers, with human-defined semantics and machine-learned graphs compounding over time. The ontology, called the 'secret sauce' by CEO Ali Ghodsi, creates durable value by making models swappable while the context layer persists and improves.", "body_md": "```\nOpinion\nThe model is swappable, the ontology compounds\nKostas Pardalis, Co-Founder\nJune 19, 2026\nThe model is becoming the swappable part.\nThe ontology is what compounds.\nThat, to me, is the important idea behind Databricks’ Genie Ontology. The persistent state that gets smarter with every query, every certified metric, every dashboard, every lineage edge, every business definition, and every correction from a human is the actual moat.\nThis is why Ali Ghodsi calling the Genie Ontology the “secret sauce” matters.\nIt signals that the industry is moving into a new phase. For the last couple of years, a lot of the conversation around AI products has been about models: which model is better, which model is cheaper, which model has the larger context window, which model is better at reasoning, coding, retrieval, or tool use.\nThat still matters, but I think the center of gravity is shifting.\nFor enterprise AI, the durable value will not live only in the model. 
It will live in the governed context layer around the model.\n\nThe model can be replaced but the context compounds.\n\nAnd that is why ontologies, which for years were treated as important but somewhat niche, are about to become much more central to how companies think about AI systems.\nSo, what is the Genie Ontology, and why is it interesting?\nBased on what Databricks has shared so far, and what I managed to learn during the Databricks conference, here is my current mental model.\nWhat is Databricks Genie Ontology?\nThe simplest way to think about Genie Ontology is this:\nHumans define the canonical business meaning.\nMachines learn how that meaning shows up across the company.\nHumans correct the machine when it drifts.\nThe system gets better over time.\nMy current mental model is that there are two interacting layers.\nThe first layer is the governed semantic foundation.\nThe second layer is the machine-learned ontology that continuously builds enterprise context on top of that foundation.\nLayer 1: The governed semantic foundation\nThe first layer is human-defined and curated.\nThis is the part that lives in and around Unity Catalog. It includes things like business glossaries, domains, and metrics.\nThe business glossary defines authoritative concepts, terms, and taxonomies. In theory, if a company already has some of this work done elsewhere, those definitions can be imported or migrated into the governed layer.\nDomains group assets in a way that maps to the business. They create ownership and stewardship boundaries. They also help communicate which assets are trusted, certified, or relevant to a specific part of the organization.\nMetrics define reusable business measures like revenue, churn, active users, gross margin, pipeline, or retention. The important part is that these are not just random SQL snippets copied across dashboards. 
They are governed objects that can be reused consistently by BI tools, agents, notebooks, dashboards, and downstream applications.\nThis layer matters because AI cannot answer business questions reliably if the business itself has not defined what the words mean.\nIf “revenue” means one thing to finance, another thing to sales, and a third thing to the product analytics team, the model is not going to magically fix that. At best, it will pick one definition. At worst, it will confidently mix them together.\nSo the governed layer provides the canonical starting point.\nBut that is not enough.\nBecause the official definition of the business is only one part of the story. The other part is how the business actually operates.\nThat is where the machine-learned layer comes in.\nLayer 2: The machine-learned enterprise context graph\nThe Genie Ontology appears to use the governed semantic foundation as input into a broader, continuously learned enterprise context layer.\nThis is the part that tries to understand how the company actually works by looking across tables, queries, dashboards, pipelines, documents, connected applications, and workplace tools.\nIn other words, it is not just looking at schema; it is also looking at usage, and that distinction is important.\nA schema can tell you that a table has a column called customer_id.\nIt cannot tell you whether that column is trusted, whether the table is deprecated, whether analysts actually use it, whether executives rely on a dashboard built from it, whether another team has a better version, or whether the metric calculated from it is the one the business actually considers authoritative.\nTo know that, you need more than metadata; you need context.\nThat context comes from many places: lineage, query history, dashboards, pipelines, documents, certifications, permissions, domains, connected apps, and the way people actually interact with the data.\nThis is where Databricks is making a very interesting bet.\nThe 
system is not just retrieving data at query time. It is building a persistent representation of the enterprise in the background.\nA living graph of business meaning.\nThe hard problem is authority\nThe hard problem in enterprise analytics is usually not that there is no answer.\nThe hard problem is that there are too many possible answers.\nThere are multiple revenue definitions. Multiple customer tables. Multiple churn calculations. Multiple dashboards that appear to answer the same question. Multiple teams using similar words in slightly different ways.\nSo the core problem becomes:\nWhich definition deserves to win?\nThis is where OntoRank becomes the most interesting part of the system.\nDatabricks describes OntoRank as similar to PageRank. Instead of ranking web pages, it ranks business definitions and relationships.\nWhen multiple definitions of the same concept exist, OntoRank appears to weigh signals like:\n\nWhere the definition came from.\nWho authored it.\nHow widely it is used.\nHow closely it connects to certified and widely used assets.\nHow recently it was updated.\n\nA simple way to think about it is:\nOntoRank is PageRank over business meaning, where certified, popular, fresh, well-connected, and authored-by-the-right-person definitions win.\nThat is a very elegant idea.\nIf it works, Databricks has a scalable way to resolve enterprise semantic ambiguity without requiring humans to manually curate every possible definition, relationship, and edge case.\nIf it does not work, Genie Ontology risks becoming another noisy metadata graph with a much better interface.\nThat is why I think OntoRank is the highest-leverage technical bet inside Genie Ontology.\nThe success of the system will not depend only on OntoRank, of course. 
It will also depend on governance adoption, connector coverage, lineage quality, human review workflows, metric-layer quality, permissions, and whether enough of the company’s activity is visible to the system.\nBut OntoRank feels like the center of gravity.\nBecause the real question is not whether Databricks can collect a lot of metadata.\nThe question is whether it can rank meaning.\nQuery history is behavior, not truth\nOne part of this that I find especially interesting is the role of usage signals.\nFrom what Databricks has shared, query history is an important signal. Column popularity, for example, can be derived from how often historical queries read from a column.\nThat means the system goes a step further and asks: what do people actually use?\nThis makes sense. In a large company, usage is a valuable signal. If many important dashboards, workflows, and queries depend on a specific table or column, that probably tells you something.\nBut usage is also dangerous.\nQuery history is behavior, not truth.\nA table can be popular because it is correct. It can also be popular because it is old, convenient, badly documented, or accidentally copied into a hundred dashboards years ago.\nA metric can be widely used and still be wrong.\nA query can be common and still encode a misunderstanding.\nThis is where the OntoRank bet becomes subtle.\nThe bet is not simply that query history is useful.\nThe bet is that usage history, when combined with governance, lineage, freshness, authorship, certification, and graph structure, can become a trustworthy authority signal.\nThat is very different from just retrieving old SQL queries and giving them to an agent.\nAnd this is where the comparison with Anthropic is interesting.\nThe Anthropic tension\nA few weeks ago, Anthropic shared how they built their internal analytics context layer. One thing that stood out was that raw query history did not help them very much when used directly. 
Giving the agent access to a large set of historical queries barely improved accuracy.\nTheir conclusion seemed to be that query history is too noisy to be treated as a direct source of truth.\nBut it can still be valuable as raw material.\nYou can mine it. You can distill it. You can turn recurring patterns into curated reference docs, reusable analysis workflows, and better semantic context.\nThat makes the contrast with Databricks interesting.\nAnthropic seems to be saying:\nRaw query history is noisy. Distill it before using it as context.\nDatabricks seems to be betting:\nUsage history can become an authority signal if it is embedded inside a governed graph and ranked properly.\nThose are not necessarily contradictory positions.\nIn fact, they may be two versions of the same idea.\nThe question is not whether query history is useful.\nThe question is whether you can transform query history into a reliable semantic signal.\nThat is what I want to watch closely with Genie Ontology.\nIf OntoRank can turn usage exhaust into trustworthy business context, that is a big deal.\nIf not, the human-governed layer will have to do much more of the work than the product positioning might imply.\nWhy this is different from RAG\nThis is also why I think Genie Ontology is best understood as part of a broader shift from RAG to ontology.\nRAG systems usually retrieve chunks of text at query time and ask the model to reason from them.\nThat can work well in many situations. 
But in enterprise analytics, it often breaks down because the hard problem is not simply finding relevant text.\nThe hard problem is knowing which definition is authoritative, which data source is trusted, which metric should be used, which permissions apply, and which business context matters.\nIf a model retrieves three different explanations of “revenue,” it still needs to decide which one to trust.\nIf it retrieves an outdated dashboard, it may answer confidently with the wrong business logic.\nIf it retrieves SQL from a historical query, it may reproduce a pattern that was popular but incorrect.\nThe ontology approach front-loads more of the work.\nInstead of assembling context only at query time, the system continuously builds a governed graph in the background. It ranks definitions. It connects metrics to tables. It uses lineage. It respects permissions. It routes the agent toward certified data and governed computation instead of letting it free-associate over scattered fragments.\nThe agent does not just ask the model to reason over random retrieved text.\nThe agent can use the ontology to resolve the right business concept, find the right metric, identify the right data assets, and generate SQL against governed data.\nThat is the shift.\nRAG retrieves context while ontology maintains context.\nRAG asks, “What relevant chunks can I find right now?” while ontology asks, “What does this business mean, and which meaning should be trusted?”\nThat is a much more powerful foundation for enterprise AI.\nWhy I think this matters\nI am excited about Genie Ontology because it signals that the industry is entering the real building phase of AI.\nWe have spent a lot of time proving that LLMs can answer questions, write code, summarize documents, and use tools.\nNow the question is:\nWhat infrastructure do we need to turn models into reliable products?\nI think governed context is one of the most important pillars.\nA model without business context is a general-purpose 
reasoning engine. Useful, but not enough.\nA model with access to governed, current, permission-aware, company-specific context becomes something much more valuable.\nIt can answer questions the way the business actually thinks. It can use the right definitions. It can respect the right boundaries. It can connect analytics to operations. It can become part of workflows instead of just a chat interface on top of documents.\nThat is why ontology matters.\nAnd that is why I think Databricks putting this much emphasis on Genie Ontology is important.\nIt makes the ontology conversation mainstream.\nThe manual part still matters\nThat said, there are still important open questions.\nThe first one is the human part.\nDatabricks is clearly trying to automate as much of the ontology creation and maintenance as possible. That is the right direction. Purely manual ontology-building has failed many times before because the work is too slow, too political, and too hard to keep fresh.\nBut there is still a governed foundation that humans need to define, curate, approve, or import.\nThat work does not disappear.\nFor companies with mature data governance practices, this may be manageable. 
If they already have business glossaries, certified metrics, domains, ownership models, and stewardship workflows, then Genie Ontology may be able to build on top of what already exists.\nBut many companies are not there.\nFor them, the question is:\nHow much work is required before the automated layer becomes useful?\nThis is not a small question.\nIf the governed foundation is weak, the machine-learned layer may learn from noise.\nIf the official definitions are incomplete, the system has to infer more.\nIf ownership is unclear, human-in-the-loop curation becomes harder.\nIf teams disagree about metrics, the ontology may surface the disagreement, but it cannot magically resolve the politics.\nSo I am very interested to see what the actual bootstrapping experience looks like.\nHow much can be imported? How much can be inferred? How much needs to be manually approved? How quickly does the system become useful? How much ongoing governance work is required to keep it useful?\nThat will matter a lot.\nThe governance angle is a big deal\nOne thing I do really like about the Databricks approach is that governance is not treated as an afterthought.\nIn enterprise AI, governance cannot be bolted on at the end.\nIf an agent is answering questions from company data, it needs to respect permissions. It needs to know which data a user can see. It needs to know which tools it can call. 
It needs to avoid leaking information through summaries, generated SQL, or retrieved context.\nThe promise of building this on top of Unity Catalog is that permissions, lineage, certification, ownership, and access control are part of the foundation.\nFor many enterprises, this may be the biggest reason to take the Databricks approach seriously.\nNot because the ontology idea is interesting in the abstract, but because a governed ontology inside the data platform may be much easier to trust than a separate AI layer that has to reconstruct permissions from the outside.\nBut this also leads to the biggest strategic question.\nIf context is the moat, whose moat is it?\nDatabricks is right that the model is not the moat. The context is.\nBut we need to be very clear about whose moat we are talking about.\nFor the business, the context layer is incredibly valuable because it is a digital representation of how the company works.\nIt contains the business definitions, the operating logic, the trusted metrics, the relationships between systems, the way teams think about customers, products, revenue, risk, operations, and growth.\nIn a very real sense, it encodes what makes the company unique.\nYou absolutely want to build this context layer.\nYou want it because it allows AI systems to operate on the actual logic of your business, not just generic knowledge from the internet.\nYou want it because it can help scale decision-making.\nYou want it because it can make analytics, operations, support, finance, product, and go-to-market workflows smarter.\nBut Databricks also wants to be the platform where this context lives. And that is understandable.\nWhat bigger lock-in factor is there than becoming the brain and nervous system of the enterprise?\nTo be fair, this is not simply a story of closed vendor lock-in. Databricks is making moves around open semantics, federation, connectors, APIs, and interoperability. 
That matters, and it should not be ignored.\nBut even if the definitions are portable, the continuously learned context graph may not be, and that is the key distinction.\nThe static semantic definitions may be exportable. The compounding context may still accrue inside the platform.\nThe lineage graph, usage telemetry, governance workflows, agent behavior, query patterns, materializations, permissions, integrations, and feedback loops may become more valuable the more deeply you live inside the Databricks ecosystem.\nThat creates a powerful flywheel.\nThe more of your business you connect to Databricks, the better the ontology becomes.\nThe better the ontology becomes, the more valuable Databricks becomes.\nThe more valuable Databricks becomes, the more incentive you have to move additional systems, workflows, and teams into Databricks.\nThat is great for Databricks.\nThe question is whether it is always great for you as the customer.\nThe context layer should be neutral infrastructure\nMy take is that companies should be very careful here.\nThe enterprise context layer is too important to be treated as just a feature inside one platform.\nIt should be sound, it should be complete, and it should be interoperable.\nBy sound, I mean governed, versioned, permission-aware, auditable, and tied to real ownership. The system needs to know which definitions are certified, who owns them, how they changed, and where they are used.\nBy complete, I mean it needs to represent the business across systems, not just inside the analytics stack. Your business context does not live only in the warehouse. 
It also lives in product systems, logistics systems, CRM, support tools, documents, Slack threads, workflows, planning tools, and operational applications.\nBy interoperable, I mean every system that needs to reason about the business should be able to consume that context: Databricks, Snowflake, BI tools, agents, internal apps, workflow engines, customer-facing products, and whatever comes next.\nThis is where vendor incentives become tricky.\nEvery major platform wants to be the home for your context layer. Databricks, Snowflake, Salesforce, Microsoft, and even the customer support and ticketing system vendors you are using will try to claim it.\nEvery system of record and every AI platform will have an incentive to make its own environment the place where your business meaning compounds.\nBut your business does not fit cleanly inside one vendor’s walls.\nEven mature organizations that standardize heavily on one platform usually still use many others. They do that because different teams need different tools, because some systems are better for specific workloads, because migrations take years, and because optionality matters.\nYou should migrate when a platform is clearly the right place for a workload.\nYou should not be forced to migrate just because that is the only way your AI context layer can remain complete.\nThat is the real concern. Not that Databricks is doing something wrong by building Genie Ontology. 
I think they are doing something very important.\nThe concern is that the most valuable representation of your business should not become trapped inside one vendor’s runtime.\nThe real AI moat is sound and complete context\nAli was right that context is the moat.\nBut for context to be useful, it needs to be both sound and complete.\nIf it is sound but incomplete, agents will reason correctly over only part of the business.\nIf it is complete but not sound, agents will have access to everything but will not know what to trust.\nYou need both.\nThat is hard because every vendor is naturally incentivized to make context complete inside its own platform. But companies need context that is complete across the business.\nThis distinction is going to matter more as AI moves from answering analytics questions to executing workflows.\nAgentic analytics is only the beginning.\nThe bigger opportunity is connecting analytics, operations, product, finance, go-to-market, logistics, and customer workflows into systems that can reason and act with business context.\nThe moment you need to bridge your analytics infrastructure with your product infrastructure, or your logistics systems, or your customer-facing workflows, a platform-specific context layer can become a wall.\nThe only way to unlock the full value of AI is to curate a sound and complete representation of the business and make it consumable by every system that needs it.\nThis is what we are building at Typedef\nThis is exactly the problem we are working on at Typedef.\nWe believe companies need a neutral context layer for AI.\nOne that lets them define, govern, and maintain the meaning of their business in a way that is portable across platforms.\nOne that can be consumed by agents, applications, workflows, data platforms, and operational systems.\nOne that does not require companies to choose between using the best tools for each job and having a complete representation of their business.\nDatabricks’ Genie Ontology 
is exciting because it validates the direction.\nIt shows that the industry is waking up to the fact that models are not enough.\nThe next major layer of enterprise AI is the context layer.\nThe model is swappable and the ontology compounds.\nThe only question is whether that compounding context becomes a vendor moat or your business moat.\nIf you are thinking about how to build that business moat in a way that stays governed, complete, and interoperable, I would love to show you what we are building at Typedef.\n```\n\n", "url": "https://wpnews.pro/news/the-model-is-swappable-the-ontology-compounds", "canonical_source": "https://www.typedef.ai/blog/the-model-is-swappable-the-ontology-compounds", "published_at": "2026-06-20 01:49:48+00:00", "updated_at": "2026-06-20 02:07:13.013620+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-products", "ai-infrastructure", "ai-ethics", "ai-research"], "entities": ["Databricks", "Genie Ontology", "Ali Ghodsi", "Unity Catalog"], "alternates": {"html": "https://wpnews.pro/news/the-model-is-swappable-the-ontology-compounds", "markdown": "https://wpnews.pro/news/the-model-is-swappable-the-ontology-compounds.md", "text": "https://wpnews.pro/news/the-model-is-swappable-the-ontology-compounds.txt", "jsonld": "https://wpnews.pro/news/the-model-is-swappable-the-ontology-compounds.jsonld"}}