In Part 1, we established why enterprise AI struggles: context fragmentation means the same question yields different answers depending on which system resolves it. Snowflake Horizon Context addresses this architecturally — but architecture without implementation is just a diagram on a whiteboard.
This article is where we build. We’ll design domain-oriented Semantic Views, implement metric certification workflows, establish governance at the meaning layer, define organizational responsibilities, and walk through the production patterns that separate a proof-of-concept from an enterprise deployment.
The central claim of this article: the semantic layer is the trust boundary for AI. If your metrics aren’t governed, versioned, and certified, then your AI agents are making financial decisions based on unvalidated definitions. That’s not an AI problem. That’s a governance failure.
A Semantic View is more than a database view with a description attached. It’s a governed business interface that declares:
When Cortex Context (the AI orchestration engine) needs to answer “What was Q3 revenue by region?”, it doesn’t scan raw tables and infer meaning. It reads the Semantic View, finds the certified total_revenue metric, understands that Q3 maps to the fiscal calendar starting in February, knows the join path to the customer dimension for regional breakdown, and generates a governed query.
Without Semantic Views, AI guesses. With them, AI knows.
The first design decision: how to partition your semantic layer.
We strongly recommend domain orientation over monolithic models:
Why domain orientation:
Architect Note:Start with revenue, ARR, churn, and active users. These four metrics typically generate the highest semantic confusion and the fastest ROI. Don’t try to model your entire business on day one.
Here’s a production-grade Semantic View for the Finance revenue domain. Every decision is annotated.
name: finance_revenuedescription: > Certified revenue metrics for the Finance domain. All revenue figures follow ASC 606 recognition rules. Fiscal year starts February 1. All amounts in USD after currency conversion at daily spot rates.tables: - name: fct_revenue base_table: analytics.marts.fct_revenue description: > Fact table containing recognized revenue transactions. One row per revenue recognition event. Grain: transaction-level. columns: - name: revenue_id description: Unique identifier for each revenue recognition event data_type: VARCHAR - name: recognized_date description: > Date revenue was recognized per ASC 606 rules. This is NOT the booking date or invoice date. data_type: DATE - name: amount_usd description: > Revenue amount in USD. Converted from local currency at daily spot rate on recognition date. data_type: NUMBER(18,2) - name: customer_id description: Foreign key to dim_customer data_type: VARCHAR - name: product_line description: > Product categorization. Values: Platform, Services, Add-ons. Platform = core subscription. Services = professional services. Add-ons = optional product extensions. data_type: VARCHAR - name: is_recurring description: > TRUE if revenue is from a subscription contract. FALSE for one-time services, implementation fees, overages. data_type: BOOLEAN - name: contract_id description: Source contract identifier from billing system data_type: VARCHAR - name: dim_customer base_table: analytics.marts.dim_customer description: > Customer dimension. One row per customer account. Segmentation based on current ARR tier. columns: - name: customer_id data_type: VARCHAR - name: customer_name data_type: VARCHAR - name: segment description: > Enterprise: ARR > $100K. Mid-Market: ARR $25K-$100K. SMB: ARR < $25K. Segments reassessed quarterly. data_type: VARCHAR - name: region description: > Geographic region based on billing address. Values: AMER, EMEA, APAC, LATAM. data_type: VARCHAR - name: industry description: NAICS-based industry classification data_type: VARCHARrelationships: - name: revenue_to_customer left_table: fct_revenue right_table: dim_customer join_type: many_to_one on: - left_column: customer_id right_column: customer_id description: > Every revenue event belongs to exactly one customer. Orphan revenue records (customer_id IS NULL) are excluded from all metrics and investigated by data quality team.metrics: - name: total_revenue description: > Total recognized revenue in USD across all product lines. Includes recurring and non-recurring. Excludes deferred revenue, credits, and refunds (handled in separate metrics). expression: SUM(fct_revenue.amount_usd) certified: true owner: finance-data-team@company.com last_certified: "2025-12-15" - name: recurring_revenue description: > Revenue from subscription contracts only (is_recurring = TRUE). This is the numerator for MRR and ARR calculations. Does NOT include overage charges even on subscription contracts. expression: > SUM(CASE WHEN fct_revenue.is_recurring = TRUE THEN fct_revenue.amount_usd ELSE 0 END) certified: true owner: finance-data-team@company.com last_certified: "2025-12-15" - name: arr description: > Annual Recurring Revenue. Current month's recurring revenue multiplied by 12. Point-in-time metric - always computed from the most recent COMPLETE month. Do not use partial month data. This is the company's primary growth metric reported to the board. expression: > SUM(CASE WHEN fct_revenue.is_recurring = TRUE THEN fct_revenue.amount_usd ELSE 0 END) * 12 default_filters: - column: recognized_date operator: between value: "FIRST DAY OF PREVIOUS COMPLETE MONTH to LAST DAY OF PREVIOUS COMPLETE MONTH" certified: true owner: cfo-office@company.com last_certified: "2025-12-01" - name: mrr description: > Monthly Recurring Revenue. Same as recurring_revenue when filtered to a single month. Reported monthly to investors. expression: > SUM(CASE WHEN fct_revenue.is_recurring = TRUE THEN fct_revenue.amount_usd ELSE 0 END) default_filters: - column: recognized_date operator: within value: "MOST RECENT COMPLETE MONTH" certified: true owner: finance-data-team@company.com - name: gross_margin description: > (Total Revenue - COGS) / Total Revenue. COGS sourced from finance_costs semantic view. Cross-domain metric - requires join to costs domain. expression: > (SUM(fct_revenue.amount_usd) - SUM(costs.cogs_amount)) / NULLIF(SUM(fct_revenue.amount_usd), 0) certified: true owner: finance-data-team@company.com dependencies: - semantic_view: finance_costs metric: total_cogs - name: net_revenue_retention description: > NRR measures revenue retained and expanded from the existing customer base. Trailing 12-month calculation. Denominator: recurring revenue from customers active 12 months ago. Numerator: current recurring revenue from those same customers. Target: >110% for Enterprise, >100% for SMB. expression: > SUM(current_period_recurring) / NULLIF(SUM(base_period_recurring), 0) certified: true owner: finance-data-team@company.comtime_dimensions: - name: recognized_date table: fct_revenue column: recognized_date time_grains: [day, week, month, quarter, year] fiscal_year_start_month: 2 description: > Primary time dimension. Fiscal year starts February 1. Q1 = Feb-Apr, Q2 = May-Jul, Q3 = Aug-Oct, Q4 = Nov-Jan.filters: - name: product_line table: fct_revenue column: product_line description: Filter by product line (Platform, Services, Add-ons) - name: customer_segment table: dim_customer column: segment description: Filter by customer segment (Enterprise, Mid-Market, SMB) - name: region table: dim_customer column: region description: Filter by geographic region (AMER, EMEA, APAC, LATAM) - name: industry table: dim_customer column: industry description: Filter by customer industry
Why so much description text? Because Cortex Context reads these descriptions when resolving ambiguity. When a user asks “What’s our subscription revenue?”, Cortex Context needs enough context in the metric description to determine that recurring_revenue is the right metric — not total_revenue. Sparse descriptions force AI to guess.
Why explicit fiscal calendar rules? Because “Q3” means different things in different fiscal calendars. With February fiscal year start, Q3 is August through October. Without this declaration, Cortex Context would default to calendar Q3 (July-September) and return wrong numbers to the CEO.
Why declare join relationships explicitly? Because AI agents are notorious for generating incorrect multi-table joins. By declaring that fct_revenue.customer_id joins to dim_customer.customer_id as a many-to-one relationship, we prevent Cortex Context from ever generating a query that produces row duplication through a bad join.
Churn is the metric most likely to have competing definitions in your organization. Here’s how to handle it:
name: product_churndescription: > Customer churn and retention metrics based on PRODUCT ACTIVITY. A customer is churned if they have zero platform activity (no logins, no API calls, no data processed) for 90 consecutive days. This is the PRODUCT definition of churn.IMPORTANT: This differs from the SALES definition of churn (contract non-renewal). For contract-based churn, see the sales_churn semantic view owned by revenue-ops@company.com. Both definitions are valid for different analytical purposes. When in doubt, ask the user which definition they need.metrics: - name: churned_customers description: > Count of customers with zero platform activity for 90+ consecutive days. Excludes: (1) customers in first 30 days of onboarding, (2) accounts flagged as seasonal by Customer Success, (3) accounts in planned migration/downtime. expression: > COUNT(DISTINCT CASE WHEN days_since_last_activity > 90 AND days_since_signup > 30 AND NOT is_seasonal_account AND NOT is_in_planned_downtime THEN customer_id END) certified: true owner: product-analytics@company.com last_certified: "2025-11-20" - name: churn_rate description: > Monthly gross churn rate. Denominator: customers active at start of month. Numerator: customers who crossed the 90-day inactivity threshold during that month. Expressed as a percentage. Industry benchmark for B2B SaaS: <2% monthly. expression: > churned_customers_in_period / NULLIF(active_customers_at_period_start, 0) certified: true owner: product-analytics@company.com
Notice the explicit callout: “This differs from the SALES definition.” This is intentional. When Cortex Context encounters a question about churn, it needs to understand that multiple valid definitions exist and which one to use — or when to ask the user for clarification.
A metric without certification is just an opinion with SQL. Certification is what transforms a metric from “something an analyst wrote” into “the organization’s agreed-upon truth.”
Before a metric can be certified:
CREATE OR REPLACE TABLE governance.semantic.metric_certifications ( metric_name VARCHAR NOT NULL, semantic_view VARCHAR NOT NULL, domain VARCHAR NOT NULL, business_definition TEXT NOT NULL, technical_expression TEXT NOT NULL, certified_by VARCHAR NOT NULL, certification_date TIMESTAMP_NTZ NOT NULL, next_review_date TIMESTAMP_NTZ NOT NULL, status VARCHAR NOT NULL DEFAULT 'draft', deprecation_reason TEXT, replacement_metric VARCHAR, CONSTRAINT valid_status CHECK (status IN ('draft', 'in_review', 'certified', 'deprecated')));CREATE OR REPLACE TABLE governance.semantic.certification_audit ( audit_id VARCHAR DEFAULT UUID_STRING(), metric_name VARCHAR NOT NULL, semantic_view VARCHAR NOT NULL, action VARCHAR NOT NULL, performed_by VARCHAR NOT NULL, action_timestamp TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(), previous_definition TEXT, new_definition TEXT, reason TEXT);
Production Reality:Certification isn’t a one-time event. It’s a cadence. We recommend quarterly reviews for all certified metrics. Market conditions change, business rules evolve, and what was true last quarter may not be true today. Build the review cadence into your governance council’s calendar from day one.
Traditional governance focuses on access: who can see which tables. Semantic governance extends this to meaning: who can define what data means, and how those definitions flow to consumers.
Level Question Traditional Approach Horizon Context Approach 1. Access Who can see the data? RBAC, row policies RBAC + semantic view permissions 2. Quality Is the data correct? DMFs, tests DMFs + semantic validation tests 3. Semantics What does it mean? Wiki pages, glossaries Certified Semantic Views 4. AI How is meaning consumed by AI? (nothing) Agent-metric binding, confidence thresholds
Most organizations operate at Levels 1–2. Horizon Context enables Levels 3–4 — which is where AI trust lives.
It’s not sufficient to control table access. You must control semantic definition access:
-- Governance role hierarchy for semantic assetsCREATE ROLE IF NOT EXISTS semantic_author; -- Can create/modify views in devCREATE ROLE IF NOT EXISTS semantic_certifier; -- Can approve for productionCREATE ROLE IF NOT EXISTS semantic_consumer; -- Can query certified viewsCREATE ROLE IF NOT EXISTS semantic_admin; -- Full controlGRANT ROLE semantic_author TO ROLE analytics_engineering;GRANT ROLE semantic_certifier TO ROLE data_governance_council;GRANT ROLE semantic_consumer TO ROLE cortex_agents;GRANT ROLE semantic_admin TO ROLE data_platform_admin;
Access policies apply at the semantic layer — meaning AI agents automatically respect them without needing policy-aware prompts:
CREATE OR REPLACE ROW ACCESS POLICY governance.policies.region_accessAS (region VARCHAR) RETURNS BOOLEAN -> IS_ROLE_IN_SESSION('FINANCE_ADMIN') OR region = CURRENT_SESSION_CONTEXT('REGION');ALTER TABLE analytics.marts.fct_revenue ADD ROW ACCESS POLICY governance.policies.region_access ON (region);
When the EMEA Sales Director asks the Finance Agent “What was Q3 revenue?”, they see $41.1M (EMEA only). When the CFO asks the same question, they see $142.3M (global). Same agent, same Semantic View, same metric — different access context. No prompt engineering required.
Clear ownership prevents the “everybody’s job is nobody’s job” failure mode:
| Activity | Data Governance | Data Engineering | Analytics Engineering | BI Teams | AI Engineering | Security || -------------------- | :-------------: | :--------------: | :-------------------: | :------: | :------------: | :------: || Define Metrics | C | I | R | C | I | I || Certify Metrics | C | I | I | R | I | I || Build Semantic Views | I | C | R | C | I | I || Configure Connectors | I | R | C | I | I | I || Manage Lineage | C | R | C | I | I | I || Build AI Agents | I | C | I | I | R | I || Set Access Policies | C | I | I | I | I | R || Monitor Usage | R | C | C | C | C | I || Deprecate Metrics | A | I | R | C | I | I |
Snowflake Horizon isn’t owned by a single team — it’s a team sport.
What happens: Team creates a single Semantic View with 200 metrics, 50 tables, and 30 relationships. It becomes unmaintainable. Nobody can certify it because nobody understands all of it.
Fix: Domain-oriented models. One Semantic View per business domain. Max 15–20 metrics per view. Cross-domain metrics reference other views explicitly.
What happens: Metric description says “SUM(amount_usd)” — which is the expression, not the definition. AI can’t distinguish this metric from any other sum.
Fix: Every description must answer: What’s included? What’s excluded? What are the boundary conditions? Who should use this vs. an alternative metric?
What happens: Semantic View doesn’t declare relationships. AI infers joins from column names (customer_id matches customer_id). Sometimes this produces Cartesian products on tables with composite keys.
Fix: Declare every relationship explicitly with join type (one-to-one, many-to-one, many-to-many). AI should never infer join paths.
What happens: Metric works correctly for “last month” but fails for “Q3” because fiscal calendar isn’t declared. AI defaults to calendar quarters.
Fix: Always declare fiscal year start month, supported time grains, and any custom period logic (e.g., 4–4–5 retail calendar).
What happens: Team certifies metrics before BI teams have validated them against existing dashboards. Inconsistencies emerge post-certification, eroding trust.
Fix: Certification requires sign-off from at least one consuming team. Run parallel: compute the metric from both the Semantic View and the existing BI tool. They must match within a defined tolerance before certification.
1. Version your semantic models in Git. Semantic Views are code. They need branches, pull requests, reviews, and rollback capability. Never edit production semantic views directly.
2. Test against closed books. For financial metrics, validate against the Finance team’s system of record after each month-end close. Automated CI tests should catch drift before it reaches production.
3. Deprecate gracefully. When a metric definition changes, don’t delete the old one. Mark it deprecated with a 30-day sunset window. Update the description to point to the replacement. Notify consuming teams.
4. Start narrow, go deep. It’s better to have 5 perfectly certified metrics than 50 draft metrics. Quality compounds trust. Quantity without governance compounds confusion.
5. Make AI agents ask, not guess. Configure confidence thresholds. When Cortex Context’s semantic resolution confidence is below 80% (e.g., ambiguous between two metrics), the agent should ask the user for clarification rather than choosing one silently.
Lesson Learned:The hardest conversation in semantic modeling isn’t about SQL. It’s getting the VP of Sales and the CFO to agree on one number. Plan for this. Budget calendar time for alignment meetings. The Semantic View will reflect whatever political resolution emerges — and that’s fine. The point is to make the resolution explicit and discoverable, not hidden in a spreadsheet somewhere.
Production Reality:After deploying semantic views for revenue metrics, one enterprise discovered 23 different calculated fields across Tableau workbooks that each defined “revenue” slightly differently. Seventeen of them were producing numbers that appeared in executive reports. The metadata connector surfaced these discrepancies in week one — before any AI agent was deployed. The governance value of connectors alone justified the project.
Architect Note:Don’t wait for perfect definitions to start building. Ship a “v1” Semantic View with your best understanding. Let consuming teams validate it. Iterate. Certification is the end of a process, not a prerequisite for starting. Draft views are useful — they just shouldn’t power production AI agents.
Next in Part 3: We activate the semantic layer for AI — building Cortex Agents, configuring Cortex Context resolution flows, integrating MCP for external AI systems, implementing DevOps for semantic assets, and deploying the full observability stack for production monitoring.
If this was useful — give it a clap 👏, share it with someone building on Snowflake, and subscribe to Snowflake Chronicles so you don’t miss the next one. Every piece is practitioner-first and production-focused. Let’s connect on LinkedIn.
Disclaimer: All views expressed are my own and do not reflect those of any current or former employer.
Follow Snowflake Chronicles for production-ready implementation guides, architecture deep dives, and practical lessons learned from enterprise data platforms.
Building a Trusted Semantic Layer with Snowflake Horizon Context was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.