Building a Trusted Semantic Layer with Snowflake Horizon Context

Snowflake Horizon Context enables a trusted semantic layer for enterprise AI by using domain-oriented Semantic Views, metric certification workflows, and governance at the meaning layer. Without such a layer, AI agents make decisions based on unvalidated definitions, leading to governance failures. The approach recommends starting with high-impact metrics like revenue, ARR, churn, and active users.

In Part 1, we established why enterprise AI struggles: context fragmentation means the same question yields different answers depending on which system resolves it. Snowflake Horizon Context addresses this architecturally — but architecture without implementation is just a diagram on a whiteboard. This article is where we build. We’ll design domain-oriented Semantic Views, implement metric certification workflows, establish governance at the meaning layer, define organizational responsibilities, and walk through the production patterns that separate a proof-of-concept from an enterprise deployment. The central claim of this article: the semantic layer is the trust boundary for AI . If your metrics aren’t governed, versioned, and certified, then your AI agents are making financial decisions based on unvalidated definitions. That’s not an AI problem. That’s a governance failure. A Semantic View is more than a database view with a description attached. It’s a governed business interface that declares: When Cortex Context the AI orchestration engine needs to answer “What was Q3 revenue by region?”, it doesn’t scan raw tables and infer meaning. It reads the Semantic View, finds the certified total revenue metric, understands that Q3 maps to the fiscal calendar starting in February, knows the join path to the customer dimension for regional breakdown, and generates a governed query. Without Semantic Views, AI guesses. With them, AI knows. The first design decision: how to partition your semantic layer. We strongly recommend domain orientation over monolithic models: Why domain orientation: Architect Note:Start with revenue, ARR, churn, and active users. These four metrics typically generate the highest semantic confusion and the fastest ROI. Don’t try to model your entire business on day one. Here’s a production-grade Semantic View for the Finance revenue domain. Every decision is annotated. name: finance revenuedescription: Certified revenue metrics for the Finance domain. All revenue figures follow ASC 606 recognition rules. Fiscal year starts February 1. All amounts in USD after currency conversion at daily spot rates.tables: - name: fct revenue base table: analytics.marts.fct revenue description: Fact table containing recognized revenue transactions. One row per revenue recognition event. Grain: transaction-level. columns: - name: revenue id description: Unique identifier for each revenue recognition event data type: VARCHAR - name: recognized date description: Date revenue was recognized per ASC 606 rules. This is NOT the booking date or invoice date. data type: DATE - name: amount usd description: Revenue amount in USD. Converted from local currency at daily spot rate on recognition date. data type: NUMBER 18,2 - name: customer id description: Foreign key to dim customer data type: VARCHAR - name: product line description: Product categorization. Values: Platform, Services, Add-ons. Platform = core subscription. Services = professional services. Add-ons = optional product extensions. data type: VARCHAR - name: is recurring description: TRUE if revenue is from a subscription contract. FALSE for one-time services, implementation fees, overages. data type: BOOLEAN - name: contract id description: Source contract identifier from billing system data type: VARCHAR - name: dim customer base table: analytics.marts.dim customer description: Customer dimension. One row per customer account. Segmentation based on current ARR tier. columns: - name: customer id data type: VARCHAR - name: customer name data type: VARCHAR - name: segment description: Enterprise: ARR $100K. Mid-Market: ARR $25K-$100K. SMB: ARR < $25K. Segments reassessed quarterly. data type: VARCHAR - name: region description: Geographic region based on billing address. Values: AMER, EMEA, APAC, LATAM. data type: VARCHAR - name: industry description: NAICS-based industry classification data type: VARCHARrelationships: - name: revenue to customer left table: fct revenue right table: dim customer join type: many to one on: - left column: customer id right column: customer id description: Every revenue event belongs to exactly one customer. Orphan revenue records customer id IS NULL are excluded from all metrics and investigated by data quality team.metrics: - name: total revenue description: Total recognized revenue in USD across all product lines. Includes recurring and non-recurring. Excludes deferred revenue, credits, and refunds handled in separate metrics . expression: SUM fct revenue.amount usd certified: true owner: finance-data-team@company.com last certified: "2025-12-15" - name: recurring revenue description: Revenue from subscription contracts only is recurring = TRUE . This is the numerator for MRR and ARR calculations. Does NOT include overage charges even on subscription contracts. expression: SUM CASE WHEN fct revenue.is recurring = TRUE THEN fct revenue.amount usd ELSE 0 END certified: true owner: finance-data-team@company.com last certified: "2025-12-15" - name: arr description: Annual Recurring Revenue. Current month's recurring revenue multiplied by 12. Point-in-time metric - always computed from the most recent COMPLETE month. Do not use partial month data. This is the company's primary growth metric reported to the board. expression: SUM CASE WHEN fct revenue.is recurring = TRUE THEN fct revenue.amount usd ELSE 0 END 12 default filters: - column: recognized date operator: between value: "FIRST DAY OF PREVIOUS COMPLETE MONTH to LAST DAY OF PREVIOUS COMPLETE MONTH" certified: true owner: cfo-office@company.com last certified: "2025-12-01" - name: mrr description: Monthly Recurring Revenue. Same as recurring revenue when filtered to a single month. Reported monthly to investors. expression: SUM CASE WHEN fct revenue.is recurring = TRUE THEN fct revenue.amount usd ELSE 0 END default filters: - column: recognized date operator: within value: "MOST RECENT COMPLETE MONTH" certified: true owner: finance-data-team@company.com - name: gross margin description: Total Revenue - COGS / Total Revenue. COGS sourced from finance costs semantic view. Cross-domain metric - requires join to costs domain. expression: SUM fct revenue.amount usd - SUM costs.cogs amount / NULLIF SUM fct revenue.amount usd , 0 certified: true owner: finance-data-team@company.com dependencies: - semantic view: finance costs metric: total cogs - name: net revenue retention description: NRR measures revenue retained and expanded from the existing customer base. Trailing 12-month calculation. Denominator: recurring revenue from customers active 12 months ago. Numerator: current recurring revenue from those same customers. Target: 110% for Enterprise, 100% for SMB. expression: SUM current period recurring / NULLIF SUM base period recurring , 0 certified: true owner: finance-data-team@company.comtime dimensions: - name: recognized date table: fct revenue column: recognized date time grains: day, week, month, quarter, year fiscal year start month: 2 description: Primary time dimension. Fiscal year starts February 1. Q1 = Feb-Apr, Q2 = May-Jul, Q3 = Aug-Oct, Q4 = Nov-Jan.filters: - name: product line table: fct revenue column: product line description: Filter by product line Platform, Services, Add-ons - name: customer segment table: dim customer column: segment description: Filter by customer segment Enterprise, Mid-Market, SMB - name: region table: dim customer column: region description: Filter by geographic region AMER, EMEA, APAC, LATAM - name: industry table: dim customer column: industry description: Filter by customer industry Why so much description text? Because Cortex Context reads these descriptions when resolving ambiguity. When a user asks “What’s our subscription revenue?”, Cortex Context needs enough context in the metric description to determine that recurring revenue is the right metric — not total revenue. Sparse descriptions force AI to guess. Why explicit fiscal calendar rules? Because “Q3” means different things in different fiscal calendars. With February fiscal year start, Q3 is August through October. Without this declaration, Cortex Context would default to calendar Q3 July-September and return wrong numbers to the CEO. Why declare join relationships explicitly? Because AI agents are notorious for generating incorrect multi-table joins. By declaring that fct revenue.customer id joins to dim customer.customer id as a many-to-one relationship, we prevent Cortex Context from ever generating a query that produces row duplication through a bad join. Churn is the metric most likely to have competing definitions in your organization. Here’s how to handle it: name: product churndescription: Customer churn and retention metrics based on PRODUCT ACTIVITY. A customer is churned if they have zero platform activity no logins, no API calls, no data processed for 90 consecutive days. This is the PRODUCT definition of churn.IMPORTANT: This differs from the SALES definition of churn contract non-renewal . For contract-based churn, see the sales churn semantic view owned by revenue-ops@company.com. Both definitions are valid for different analytical purposes. When in doubt, ask the user which definition they need.metrics: - name: churned customers description: Count of customers with zero platform activity for 90+ consecutive days. Excludes: 1 customers in first 30 days of onboarding, 2 accounts flagged as seasonal by Customer Success, 3 accounts in planned migration/downtime. expression: COUNT DISTINCT CASE WHEN days since last activity 90 AND days since signup 30 AND NOT is seasonal account AND NOT is in planned downtime THEN customer id END certified: true owner: product-analytics@company.com last certified: "2025-11-20" - name: churn rate description: Monthly gross churn rate. Denominator: customers active at start of month. Numerator: customers who crossed the 90-day inactivity threshold during that month. Expressed as a percentage. Industry benchmark for B2B SaaS: <2% monthly. expression: churned customers in period / NULLIF active customers at period start, 0 certified: true owner: product-analytics@company.com Notice the explicit callout: “This differs from the SALES definition.” This is intentional. When Cortex Context encounters a question about churn, it needs to understand that multiple valid definitions exist and which one to use — or when to ask the user for clarification. A metric without certification is just an opinion with SQL. Certification is what transforms a metric from “something an analyst wrote” into “the organization’s agreed-upon truth.” Before a metric can be certified: CREATE OR REPLACE TABLE governance.semantic.metric certifications metric name VARCHAR NOT NULL, semantic view VARCHAR NOT NULL, domain VARCHAR NOT NULL, business definition TEXT NOT NULL, technical expression TEXT NOT NULL, certified by VARCHAR NOT NULL, certification date TIMESTAMP NTZ NOT NULL, next review date TIMESTAMP NTZ NOT NULL, status VARCHAR NOT NULL DEFAULT 'draft', deprecation reason TEXT, replacement metric VARCHAR, CONSTRAINT valid status CHECK status IN 'draft', 'in review', 'certified', 'deprecated' ;CREATE OR REPLACE TABLE governance.semantic.certification audit audit id VARCHAR DEFAULT UUID STRING , metric name VARCHAR NOT NULL, semantic view VARCHAR NOT NULL, action VARCHAR NOT NULL, performed by VARCHAR NOT NULL, action timestamp TIMESTAMP NTZ DEFAULT CURRENT TIMESTAMP , previous definition TEXT, new definition TEXT, reason TEXT ; Production Reality:Certification isn’t a one-time event. It’s a cadence. We recommend quarterly reviews for all certified metrics. Market conditions change, business rules evolve, and what was true last quarter may not be true today. Build the review cadence into your governance council’s calendar from day one. Traditional governance focuses on access: who can see which tables. Semantic governance extends this to meaning: who can define what data means, and how those definitions flow to consumers. Level Question Traditional Approach Horizon Context Approach 1. Access Who can see the data? RBAC, row policies RBAC + semantic view permissions 2. Quality Is the data correct? DMFs, tests DMFs + semantic validation tests 3. Semantics What does it mean? Wiki pages, glossaries Certified Semantic Views 4. AI How is meaning consumed by AI? nothing Agent-metric binding, confidence thresholds Most organizations operate at Levels 1–2. Horizon Context enables Levels 3–4 — which is where AI trust lives. It’s not sufficient to control table access. You must control semantic definition access: -- Governance role hierarchy for semantic assetsCREATE ROLE IF NOT EXISTS semantic author; -- Can create/modify views in devCREATE ROLE IF NOT EXISTS semantic certifier; -- Can approve for productionCREATE ROLE IF NOT EXISTS semantic consumer; -- Can query certified viewsCREATE ROLE IF NOT EXISTS semantic admin; -- Full controlGRANT ROLE semantic author TO ROLE analytics engineering;GRANT ROLE semantic certifier TO ROLE data governance council;GRANT ROLE semantic consumer TO ROLE cortex agents;GRANT ROLE semantic admin TO ROLE data platform admin; Access policies apply at the semantic layer — meaning AI agents automatically respect them without needing policy-aware prompts: CREATE OR REPLACE ROW ACCESS POLICY governance.policies.region accessAS region VARCHAR RETURNS BOOLEAN - IS ROLE IN SESSION 'FINANCE ADMIN' OR region = CURRENT SESSION CONTEXT 'REGION' ;ALTER TABLE analytics.marts.fct revenue ADD ROW ACCESS POLICY governance.policies.region access ON region ; When the EMEA Sales Director asks the Finance Agent “What was Q3 revenue?”, they see $41.1M EMEA only . When the CFO asks the same question, they see $142.3M global . Same agent, same Semantic View, same metric — different access context. No prompt engineering required. Clear ownership prevents the “everybody’s job is nobody’s job” failure mode: | Activity | Data Governance | Data Engineering | Analytics Engineering | BI Teams | AI Engineering | Security || -------------------- | :-------------: | :--------------: | :-------------------: | :------: | :------------: | :------: || Define Metrics | C | I | R | C | I | I || Certify Metrics | C | I | I | R | I | I || Build Semantic Views | I | C | R | C | I | I || Configure Connectors | I | R | C | I | I | I || Manage Lineage | C | R | C | I | I | I || Build AI Agents | I | C | I | I | R | I || Set Access Policies | C | I | I | I | I | R || Monitor Usage | R | C | C | C | C | I || Deprecate Metrics | A | I | R | C | I | I | Snowflake Horizon isn’t owned by a single team — it’s a team sport. What happens: Team creates a single Semantic View with 200 metrics, 50 tables, and 30 relationships. It becomes unmaintainable. Nobody can certify it because nobody understands all of it. Fix: Domain-oriented models. One Semantic View per business domain. Max 15–20 metrics per view. Cross-domain metrics reference other views explicitly. What happens: Metric description says “SUM amount usd ” — which is the expression, not the definition. AI can’t distinguish this metric from any other sum. Fix: Every description must answer: What’s included? What’s excluded? What are the boundary conditions? Who should use this vs. an alternative metric? What happens: Semantic View doesn’t declare relationships. AI infers joins from column names customer id matches customer id . Sometimes this produces Cartesian products on tables with composite keys. Fix: Declare every relationship explicitly with join type one-to-one, many-to-one, many-to-many . AI should never infer join paths. What happens: Metric works correctly for “last month” but fails for “Q3” because fiscal calendar isn’t declared. AI defaults to calendar quarters. Fix: Always declare fiscal year start month, supported time grains, and any custom period logic e.g., 4–4–5 retail calendar . What happens: Team certifies metrics before BI teams have validated them against existing dashboards. Inconsistencies emerge post-certification, eroding trust. Fix: Certification requires sign-off from at least one consuming team. Run parallel: compute the metric from both the Semantic View and the existing BI tool. They must match within a defined tolerance before certification. 1. Version your semantic models in Git. Semantic Views are code. They need branches, pull requests, reviews, and rollback capability. Never edit production semantic views directly. 2. Test against closed books. For financial metrics, validate against the Finance team’s system of record after each month-end close. Automated CI tests should catch drift before it reaches production. 3. Deprecate gracefully. When a metric definition changes, don’t delete the old one. Mark it deprecated with a 30-day sunset window. Update the description to point to the replacement. Notify consuming teams. 4. Start narrow, go deep. It’s better to have 5 perfectly certified metrics than 50 draft metrics. Quality compounds trust. Quantity without governance compounds confusion. 5. Make AI agents ask, not guess. Configure confidence thresholds. When Cortex Context’s semantic resolution confidence is below 80% e.g., ambiguous between two metrics , the agent should ask the user for clarification rather than choosing one silently. Lesson Learned:The hardest conversation in semantic modeling isn’t about SQL. It’s getting the VP of Sales and the CFO to agree on one number. Plan for this. Budget calendar time for alignment meetings. The Semantic View will reflect whatever political resolution emerges — and that’s fine. The point is to make the resolution explicit and discoverable, not hidden in a spreadsheet somewhere. Production Reality:After deploying semantic views for revenue metrics, one enterprise discovered 23 different calculated fields across Tableau workbooks that each defined “revenue” slightly differently. Seventeen of them were producing numbers that appeared in executive reports. The metadata connector surfaced these discrepancies in week one — before any AI agent was deployed. The governance value of connectors alone justified the project. Architect Note:Don’t wait for perfect definitions to start building. Ship a “v1” Semantic View with your best understanding. Let consuming teams validate it. Iterate. Certification is the end of a process, not a prerequisite for starting. Draft views are useful — they just shouldn’t power production AI agents. Next in Part 3: We activate the semantic layer for AI — building Cortex Agents, configuring Cortex Context resolution flows, integrating MCP for external AI systems, implementing DevOps for semantic assets, and deploying the full observability stack for production monitoring. If this was useful — give it a clap 👏 , share it with someone building on Snowflake, and subscribe to Snowflake Chronicles so you don’t miss the next one. Every piece is practitioner-first and production-focused. Let’s connect on LinkedIn . Disclaimer: All views expressed are my own and do not reflect those of any current or former employer. Follow Snowflake Chronicles for production-ready implementation guides, architecture deep dives, and practical lessons learned from enterprise data platforms. Building a Trusted Semantic Layer with Snowflake Horizon Context https://pub.towardsai.net/building-a-trusted-semantic-layer-with-snowflake-horizon-context-20588c3a304b was originally published in Towards AI https://pub.towardsai.net on Medium, where people are continuing the conversation by highlighting and responding to this story.