# The Anatomy of a Machine's Mind - Decoding AEO, GEO

> Source: <https://dev.to/palash_bagchi_cbdebd259d4/the-anatomy-of-a-machines-mind-decoding-aeo-geo-2kdm>
> Published: 2026-06-06 07:10:53+00:00

We are moving away from traditional "10 blue links" (where Google ranks a document) to a semantic synthesis model (where Google extracts the factual payload and generates the answer directly via AI Overviews or Gemini).

To architect a dashboard for AEO and GEO, we must stop looking at keyword density and start looking at **Entity Salience** and **RAG (Retrieval-Augmented Generation) compatibility**.

Here is the architectural breakdown of the Google APIs required to track, test, and optimize for the Generative Search era.

Answer Engine Optimization relies heavily on Google's Knowledge Graph. If Google doesn't recognize your brand, product, or author as a definitive "Entity," you will not appear in Knowledge Panels, nor will an LLM trust your brand as a source of truth.

The Knowledge Graph Search API lets you query Google's semantic database to see how it maps entities (people, places, organizations).

- `resultScore`: The algorithmic confidence Google has in the entity match.
- `@id` (Machine-Readable Entity ID or MREID): The unique identifier (e.g., `/m/0k8z`) Google assigns to a recognized entity.
- `description` / `detailedDescription`: The exact factual payload Google associates with that entity.

**The Enrichment Play (Brand Authority):** You can programmatically query your brand name or executive team names monthly. If your `resultScore` is increasing, your AEO efforts (digital PR, schema markup, Wikipedia/Wikidata editing) are working. If your brand returns no MREID, you are invisible to the Answer Engine.
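A minimal sketch of that monthly check, assuming the public `entities:search` endpoint of the Knowledge Graph Search API and an API key you supply. The helper names (`buildKgSearchUrl`, `summarizeEntities`) and the sample response are illustrative assumptions, not part of the original workflow.

```javascript
// Sketch: build a Knowledge Graph Search API request and summarize the response.
// The API key and the sample payload below are placeholders, not real data.
const KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search";

function buildKgSearchUrl(query, apiKey, limit = 5) {
  const url = new URL(KG_ENDPOINT);
  url.searchParams.set("query", query);
  url.searchParams.set("key", apiKey);
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Reduce a response to the fields worth tracking month over month.
function summarizeEntities(response) {
  const items = response.itemListElement ?? [];
  return items.map((item) => ({
    id: item.result?.["@id"] ?? null,
    name: item.result?.name ?? null,
    types: item.result?.["@type"] ?? [],
    resultScore: item.resultScore ?? 0,
  }));
}

// Hypothetical response, used only to exercise the parser.
const sampleResponse = {
  itemListElement: [
    {
      result: { "@id": "kg:/g/11bsled", name: "Kakunin", "@type": ["Organization"] },
      resultScore: 74375,
    },
  ],
};

console.log(buildKgSearchUrl("Kakunin", "YOUR_API_KEY"));
console.log(summarizeEntities(sampleResponse));
```

In a real run you would fetch the generated URL, pass the parsed JSON to `summarizeEntities`, and store the rows with a timestamp so the trend in `resultScore` becomes visible. An empty result means no entity was matched for that query.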

This is the exact right place to start. If you do not understand how Google mathematically defines reality, all downstream Answer Engine Optimization (AEO) efforts are essentially guessing.

When we talk about the Google Knowledge Graph Search API, we are no longer dealing with web pages, URLs, or HTML. We are dealing with **Nodes (Entities)** and **Edges (Relationships)**.

Here is the microscopic breakdown of how Google categorizes, measures, and scores reality.

In traditional SEO, "Kakunin" is just a string of letters (a keyword). In the Knowledge Graph, an **Entity** is a fundamental unit of knowledge—a specific, identifiable thing.

Google does not use arbitrary labels to define these; it strictly adheres to the **Schema.org vocabulary**.

- `schema.org/Person` (e.g., Taylor Swift, or a company's CEO).
- `schema.org/Organization` or sub-types like `schema.org/LocalBusiness` or `schema.org/Corporation` (e.g., Google, Kakunin).
- `schema.org/Place` (e.g., Ranchi, Eiffel Tower).

**The Practical Benchmark:** When does a brand cross the threshold from being a "keyword" on a webpage to a recognized "Entity" in the Knowledge Graph?

The benchmark is **reconciliation**. Google’s Entity Reconciliation engine constantly scrapes the web. When it finds enough corroborating "Semantic Triples" (Subject-Predicate-Object data points, like *Kakunin -> is a -> SoftwarePlatform*), it clusters that data together. You have practically achieved Entity status the moment Google mints a unique machine identifier for you in its database.
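To make the idea of a semantic triple concrete, here is a small sketch that flattens a schema.org-style description into Subject-Predicate-Object rows. The `toTriples` helper and the sample properties are hypothetical. This is not how Google's reconciliation works internally, only an illustration of the data shape described above.

```javascript
// Flatten a schema.org-style description into [subject, predicate, object] triples.
function toTriples(subject, properties) {
  const triples = [];
  for (const [predicate, value] of Object.entries(properties)) {
    // A property may hold one value or a list of values.
    const objects = Array.isArray(value) ? value : [value];
    for (const object of objects) {
      triples.push([subject, predicate, String(object)]);
    }
  }
  return triples;
}

// Illustrative input: the kind of facts schema markup would state about a brand.
const kakuninTriples = toTriples("Kakunin", {
  "@type": "Organization",
  sameAs: ["https://en.wikipedia.org/wiki/Kakunin", "https://kakunin.io"],
});

console.log(kakuninTriples);
```

Each `sameAs` link becomes its own triple, which is one way to picture the corroborating data points the paragraph above refers to.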

When you query the Knowledge Graph API, it returns a JSON-LD payload. Here is what those specific data points actually mean and the signals that drive them.

**`@id` (The Machine-Readable Entity ID or MREID)**

This is the canonical database key for the entity. It is the most important data point in AEO.

- The identifier starts with `kg:/m/` (e.g., `/m/0dl567`) or `kg:/g/`.
- The `/m/` prefix stands for "Machine ID" and is a legacy identifier inherited from Freebase, the massive open-source database Google acquired to build its Knowledge Graph. Newer entities created directly by Google's ML systems often get a `/g/` prefix.

**`detailedDescription` (The Factual Payload)**

This is the text that an Answer Engine (like Gemini or AI Overviews) will read as the absolute, verified truth about your entity.

- When Google has no source text to draw on for your entity, `detailedDescription` will be completely empty, giving Answer Engines zero factual payload to pull from.

**`resultScore` (The Salience & Probability Metric)**

This is not a static "authority score" like Domain Rating (DR). It is a dynamic numerical value that describes how perfectly an entity matches the search query context.

- For the query "Brad Pitt", the actor might return a `resultScore` of 30,000, while a lesser-known Australian boxer named Brad Pitt might return a score of 200.

To visualize how these signals compound to push a brand from a mere "keyword" to a fully reconciled Entity with a high `resultScore`, here is a simulated Knowledge Graph Search API payload in JSON-LD:

``` json
{
  "@context": {
    "@vocab": "http://schema.org/",
    "goog": "http://schema.googleapis.com/",
    "detailedDescription": "goog:detailedDescription",
    "resultScore": "goog:resultScore"
  },
  "@type": "EntitySearchResult",
  "result": {
    "@id": "kg:/g/11bsled",
    "name": "Kakunin",
    "@type": [
      "Organization"
    ],
    "detailedDescription": {
      "articleBody": "Kakunin is an established organization recognized by global semantic authorities.",
      "url": "https://en.wikipedia.org/wiki/Kakunin",
      "license": "https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License"
    }
  },
  "resultScore": 74375
}
```
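A dashboard would typically reduce a payload like this to a few yes/no signals. The sketch below is one possible reading, assuming the item shape shown above. `inspectEntity` and its field names are illustrative, not part of any Google SDK.

```javascript
// Turn one EntitySearchResult item into the signals discussed above.
function inspectEntity(item) {
  const id = item.result?.["@id"] ?? "";
  const body = item.result?.detailedDescription?.articleBody ?? "";

  let idKind = "none";
  if (id.startsWith("kg:/m/")) idKind = "/m/ (Freebase-era)";
  else if (id.startsWith("kg:/g/")) idKind = "/g/ (newer)";

  return {
    hasEntityId: idKind !== "none",
    idKind,
    hasFactualPayload: body.trim().length > 0,
    resultScore: item.resultScore ?? 0,
  };
}

// Trimmed copy of the simulated payload from this section.
const sampleItem = {
  "@type": "EntitySearchResult",
  result: {
    "@id": "kg:/g/11bsled",
    name: "Kakunin",
    "@type": ["Organization"],
    detailedDescription: {
      articleBody: "Kakunin is an established organization recognized by global semantic authorities.",
    },
  },
  resultScore: 74375,
};

console.log(inspectEntity(sampleItem));
```

An item with no `@id` and no `articleBody` would report `hasEntityId: false` and `hasFactualPayload: false`, which is the "invisible to the Answer Engine" case.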

Generative Engine Optimization (GEO) requires your content to be easily parsed by Large Language Models (LLMs). LLMs do not read "keywords"; they calculate the mathematical relationship between words.

To optimize for AI Overviews (formerly SGE), you must feed your content into the same Natural Language Processing (NLP) engines Google uses to train its models.

The Cloud Natural Language API exposes Google's machine learning models for syntax analysis, entity extraction, and sentiment analysis.

- `entities`: What nouns/concepts Google extracts from your text.
- `salience`: A critical metric (ranging from 0.0 to 1.0) indicating the importance or centrality of an entity to the entire document text.
- `sentiment.score` & `sentiment.magnitude`: How positive, negative, or neutral the text is.

**The Enrichment Play (The Salience Audit):** Before publishing a high-value SaaS landing page, pass the text through the NLP API. If your target product feature has a `salience` score of 0.12, but a competitor's integration mentioned off-hand has a score of 0.85, the LLM will completely misunderstand the core topic of your page. You must rewrite the syntax, using clearer subject-verb-object structures, until your core product hits a salience score above 0.70.
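The salience audit can be sketched as a small pure function over the `entities` array that an entity-analysis call returns. The 0.70 threshold comes from the paragraph above. The function name and the sample entity names are assumptions made for illustration.

```javascript
// Check whether the target entity is the most central one in a piece of text.
// `entities` is expected to look like [{ name, salience }, ...].
function auditSalience(entities, targetName, threshold = 0.7) {
  const ranked = [...entities].sort((a, b) => (b.salience ?? 0) - (a.salience ?? 0));
  const target = ranked.find(
    (e) => (e.name ?? "").toLowerCase() === targetName.toLowerCase()
  );
  const targetSalience = target?.salience ?? 0;
  return {
    targetSalience,
    dominantEntity: ranked[0]?.name ?? null,
    passes: targetSalience > threshold,
  };
}

// Hypothetical scores mirroring the example above.
const sampleEntities = [
  { name: "audit trail export", salience: 0.12 },
  { name: "Competitor integration", salience: 0.85 },
];

console.log(auditSalience(sampleEntities, "audit trail export"));
```

Reporting `dominantEntity` alongside the pass/fail flag shows which concept the text is actually about, which is usually the fastest hint for the rewrite.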

If Answer Engine Optimization (AEO) is about getting Google to recognize your existence as a factual "Entity" (via the Knowledge Graph), Generative Engine Optimization (GEO) is about **controlling how an LLM reads, fragments, and scores your content.**

Large Language Models (like Gemini or the models powering AI Overviews) do not read pages top-to-bottom like humans, nor do they count keyword frequencies like legacy Googlebot. They convert text into **Semantic Vectors**—lists of numbers representing the mathematical distance between concepts.

To master the GEO layer using the **Google Cloud Natural Language API**, there are three critical sub-engines you must understand, as they directly dictate whether your content is "RAG-friendly" (Retrieval-Augmented Generation).

In traditional SEO, you could put the keyword "MiCA AI Compliance" at the top of the page, write 500 words of fluff, and still rank. In GEO, that will completely fail.

The Natural Language API features an `analyzeSyntax` method that generates a **Dependency Parse Tree**. It breaks every sentence into tokens (words) and maps the exact grammatical relationship between them (e.g., this noun is the subject, this verb is the root action, this adjective modifies the object).
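To show what reading a dependency parse looks like in practice, the sketch below walks a token list shaped like the `tokens` array of an `analyzeSyntax` response (`text.content`, `dependencyEdge.headTokenIndex`, `dependencyEdge.label`). The tokens are hand-written for the sentence "Kakunin automates compliance" rather than real API output, and `describeSentence` is a hypothetical helper.

```javascript
// Find the root verb and its direct subject/object in a parsed sentence.
function describeSentence(tokens) {
  const rootIndex = tokens.findIndex((t) => t.dependencyEdge.label === "ROOT");
  // Return the first token that hangs off the root with the given label.
  const childOfRoot = (label) =>
    tokens.find(
      (t, i) =>
        i !== rootIndex &&
        t.dependencyEdge.headTokenIndex === rootIndex &&
        t.dependencyEdge.label === label
    );

  return {
    root: rootIndex >= 0 ? tokens[rootIndex].text.content : null,
    subject: childOfRoot("NSUBJ")?.text.content ?? null,
    object: childOfRoot("DOBJ")?.text.content ?? null,
    isPassive: tokens.some((t) => t.dependencyEdge.label === "NSUBJPASS"),
  };
}

// Hand-written parse of "Kakunin automates compliance" (the root points at itself).
const sampleTokens = [
  { text: { content: "Kakunin" }, dependencyEdge: { headTokenIndex: 1, label: "NSUBJ" } },
  { text: { content: "automates" }, dependencyEdge: { headTokenIndex: 1, label: "ROOT" } },
  { text: { content: "compliance" }, dependencyEdge: { headTokenIndex: 1, label: "DOBJ" } },
];

console.log(describeSentence(sampleTokens));
```

A sentence where `subject` comes back `null`, or where `isPassive` is true, is a candidate for rewriting into a direct subject-verb-object form.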

We briefly touched on `salience` (how important a word is to the page), but the API also exposes **Entity Sentiment Analysis**. This does not just measure if an article is generally "happy" or "sad"; it measures the exact emotional polarity attached to a *specific entity* within the text.

- `score`: Ranges from `-1.0` (extremely negative) to `1.0` (extremely positive).
- `magnitude`: Indicates the sheer volume of emotion, regardless of whether it's positive or negative (ranging from `0.0` to `+inf`).

**The GEO Application (Competitor Conquesting):** When users ask Gemini, *"Which is better for AI governance, Kakunin or [Competitor]?"*, the engine doesn't just read feature lists. It aggregates the Entity Sentiment of both brands across the web. If your competitor has a higher positive `score` globally connected to the entity "AI governance", the LLM will confidently recommend them over you.

**The Fix:** When writing comparison pages (e.g., "Kakunin vs. Competitor X"), if you use overly aggressive, negative language against the competitor, the API will attach a negative `score` with a high `magnitude` to that paragraph. Because LLMs are strictly programmed with safety filters to avoid generating toxic or highly biased text, they will often refuse to cite your comparison page entirely. Your competitive content must be structurally objective and emotionally neutral (`score` near `0.0`) to be cited by an Answer Engine.
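A neutrality check along these lines could look like the sketch below. It assumes entity-sentiment results shaped like `[{ name, sentiment: { score, magnitude } }]`. The cut-off values (`maxAbsScore`, `maxMagnitude`) are arbitrary placeholders to tune against your own content, not thresholds published by Google.

```javascript
// Flag entities whose sentiment is far from neutral in a comparison paragraph.
function checkNeutrality(entities, { maxAbsScore = 0.25, maxMagnitude = 1.0 } = {}) {
  return entities.map((entity) => {
    const score = entity.sentiment?.score ?? 0;
    const magnitude = entity.sentiment?.magnitude ?? 0;
    return {
      name: entity.name,
      score,
      magnitude,
      neutral: Math.abs(score) <= maxAbsScore && magnitude <= maxMagnitude,
    };
  });
}

// Hypothetical scores for a "Kakunin vs. Competitor X" paragraph.
const comparisonEntities = [
  { name: "Kakunin", sentiment: { score: 0.1, magnitude: 0.3 } },
  { name: "Competitor X", sentiment: { score: -0.8, magnitude: 2.4 } },
];

console.log(checkNeutrality(comparisonEntities));
```

Here the competitor row would be flagged as non-neutral, which points at the paragraph that needs toning down.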

LLMs have a limited "context window" (how much data they can process at once). To save computing power, before Google feeds a webpage to an LLM to generate an AI Overview, it filters the web using strict taxonomic categories.

The Natural Language API's `classifyText` method maps your content against a hardcoded database of over 1,000 specific categories.

- The method returns a category path such as `/Computers & Electronics/Enterprise Technology/Data Management` alongside a `confidence` score (0.0 to 1.0).
- A post about compliance should land in a category such as `/Business & Industrial/Business Services/Consulting` or `/Law & Government/Legal`. However, if your marketing team filled the post with metaphors about "crashing cars" or "paying expensive speeding tickets," the NLP engine might classify the page under `/Autos & Vehicles`.

When you look at this API as a whole, it reveals how you must re-architect your landing pages.

Because LLMs extract data via RAG, they do not ingest your whole webpage. They ingest **Semantic Chunks** (usually a single `<H2>` header and the 1–2 paragraphs immediately below it).

If you pass a webpage through the Natural Language API, the API reads it linearly.

- If your `<H2>` is a clever marketing pun, the chunk fails on every signal: `Salience` is low, `Syntax` is broken (no verbs), and `Entities` are missing.

**To achieve GEO dominance, every single section of your page must be a self-contained factual payload:**

- Use a descriptive header that names the entity and the action (e.g., `<H2>How Kakunin Ensures MiCA Compliance</H2>`).
- Place tabular data (`<table>`) or structured lists (`<ul>`) directly underneath the active sentence. LLMs assign incredibly high retrieval weight to HTML tables because the rows and columns already act as a pre-built relational database, requiring zero NLP guesswork.

If you optimize for the Natural Language API's Dependency Tree and Content Classifications rather than just "keyword volume," you ensure that when Gemini looks for a factual chunk to fulfill an enterprise search query, your data is the easiest mathematical vector for it to grab.
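To audit pages chunk by chunk, you first need to cut them the way this section describes: one `<h2>` plus the first one or two paragraphs under it. The sketch below does that with regular expressions on a simple HTML string. It is a rough approximation for well-formed markup (a real pipeline would use an HTML parser), and `extractChunks` and the sample HTML are illustrative assumptions.

```javascript
// Remove tags and collapse whitespace.
const stripTags = (s) => s.replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();

// Split HTML into { heading, text } chunks: each <h2> plus its first paragraphs.
function extractChunks(html, maxParagraphs = 2) {
  return html
    .split(/(?=<h2[\s>])/i) // cut just before every <h2>
    .filter((section) => /^<h2[\s>]/i.test(section)) // drop anything before the first <h2>
    .map((section) => {
      const heading = section.match(/<h2(?:\s[^>]*)?>([\s\S]*?)<\/h2>/i)?.[1] ?? "";
      const paragraphs = [...section.matchAll(/<p(?:\s[^>]*)?>([\s\S]*?)<\/p>/gi)]
        .slice(0, maxParagraphs)
        .map((match) => stripTags(match[1]));
      return { heading: stripTags(heading), text: paragraphs.join(" ") };
    });
}

// Hypothetical page fragment.
const sampleHtml = `
<h1>Docs</h1>
<h2>How Kakunin Ensures MiCA Compliance</h2>
<p>Kakunin automates <strong>MiCA</strong> compliance for AI agents.</p>
<p>Second paragraph.</p>
<p>Third paragraph.</p>
<H2 class="alt">Another Section</H2>
<p>Example text.</p>`;

console.log(extractChunks(sampleHtml));
```

Each returned chunk can then be sent through the salience, syntax, and classification checks on its own, which mirrors how a retrieval step would see the page.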

Here is where the transition from traditional SEO to GEO gets messy. Google is currently injecting AI Overviews at the top of the SERP, but they are highly secretive about the analytics.

- The Search Console API returns the standard performance metrics (`impressions`, `clicks`, `ctr`), but with a specific focus on the `searchAppearance` dimension.
- Filter for `searchAppearance` types like `FAQ`, `HOW_TO`, or `PRODUCT_SNIPPETS`. Because AEO heavily relies on Schema.org markup to spoon-feed facts to Google, correlating a rise in these specific rich results with overall CTR is your best proxy for AEO success.
- The blind spot: until recently there was no `searchAppearance` filter for "AI Overviews" in GSC. If your site was cited as a source in an AI Overview, the clicks and impressions were simply lumped into standard web traffic.

It is exactly like SERP/SEO data measurement because it uses the exact same API endpoints. However, the way you must *interpret* the math is entirely inverted.

If you had asked me this last month, I would have told you that AI Overviews were a complete black box. Now we are looking at a massive, real-time architectural shift. On June 3, Google officially rolled out dedicated "Generative AI Performance" reporting inside Google Search Console.

The blindspot is officially lifting. Here is the fine-grained breakdown of how the Search Console API handles the Generative Engine Optimization (GEO) layer, and why the metrics mean something completely different now.

In traditional SEO, you query the GSC API with the `searchType` set to `WEB` (to see standard blue-link traffic) or `IMAGE`.

Now, the API is being updated to accept new Search Type filters specifically for **AI Overviews** and **AI Mode**. This allows your data pipeline to completely decouple standard human search behavior from machine-synthesized answers.

When you pass this new filter to the API, it returns the standard four metrics (`impressions`, `clicks`, `ctr`, `position`), but their definitions have radically mutated.

To understand GEO, you have to abandon the traditional SEO dopamine hit of chasing "clicks." Here is how the math changes when you filter the API for AI Overviews:

| Metric | Traditional Web Search | Generative AI Search (AI Overviews) |
|---|---|---|
| Impression | A user scrolled past your blue link on the page. | An LLM successfully extracted your data, synthesized it, and cited your URL. |
| Click | The user chose your link over a competitor's. | The user needed deep technical validation and clicked your citation card. |
| CTR | Standard range is 3% to 15%. | Standard range drops to 0.5% to 3% because the LLM answers the intent inline. |
| Position | Classical ranking list (1 through 10). | Binary variable. You are either embedded in the synthesis block (Position 0) or omitted. |

In the AI Overview context, an "Impression" is the ultimate victory metric. It proves that the Natural Language API (which we discussed earlier) successfully parsed your Semantic Vectors, and the Knowledge Graph recognized your Entity Salience.

When your dashboard logs an AI Impression for a query like "MiCA AI Compliance architecture," it means:

- The model selected your semantic chunk (the `<h2>` and active-voice paragraph) as the highest-trust, most mathematically relevant data point available.

Because the GSC API now separates this data, you can programmatically track the success of your GEO structural edits.

Here is what that automated workflow looks like:

1. Run two queries over the same date range. Query A uses `searchType: WEB`. Query B uses the new `searchType: AI_OVERVIEWS`.
2. Join the two result sets on `query` and `page`.
3. Flag every row where `WEB` impressions are high, but `AI_OVERVIEWS` impressions are zero.

The goal of AEO and GEO is no longer to drive massive top-of-funnel traffic to your site. The goal is **Narrative Control**. If you dominate the AI Overview impressions, you control what the machine tells the world about your industry, even if the user never clicks through to your domain.
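The join-and-flag step of that workflow can be sketched independently of the API calls. The function below assumes both result sets use the Search Console row shape (`keys` holding the requested dimensions, here `[query, page]`, plus `impressions`). The sample rows, the URLs in them, and the 100-impression cut-off are made up for illustration.

```javascript
// Join web rows and AI Overview rows on (query, page) and flag the gaps.
function findAiGaps(webRows, aiRows, minWebImpressions = 100) {
  // Composite key; the separator is a character that cannot appear in a URL or query.
  const keyOf = (row) => row.keys.join("\u0000");
  const aiImpressions = new Map(aiRows.map((row) => [keyOf(row), row.impressions ?? 0]));

  return webRows
    .map((row) => {
      const web = row.impressions ?? 0;
      const ai = aiImpressions.get(keyOf(row)) ?? 0;
      return {
        query: row.keys[0],
        page: row.keys[1],
        web,
        ai,
        captureRate: web > 0 ? ai / web : 0,
      };
    })
    .filter((row) => row.web >= minWebImpressions && row.ai === 0);
}

// Hypothetical rows for two queries that differ only in the search type filter.
const sampleWebRows = [
  { keys: ["mica ai compliance", "https://kakunin.io/docs/mica-framework"], impressions: 500 },
  { keys: ["ai governance", "https://kakunin.io/blog"], impressions: 300 },
  { keys: ["rare query", "https://kakunin.io/other"], impressions: 5 },
];
const sampleAiRows = [
  { keys: ["mica ai compliance", "https://kakunin.io/docs/mica-framework"], impressions: 40 },
];

console.log(findAiGaps(sampleWebRows, sampleAiRows));
```

Only the "ai governance" row survives the filter: it has meaningful web visibility but no AI Overview impressions, so it is the first candidate for a structural rewrite.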

Migrating this pipeline from a low-code orchestrator (Make.com) into a native Next.js architecture is exactly how you productionize this for a SaaS environment or a high-performance internal tool.

By building this in Next.js, you eliminate the Airtable dependency, reduce API latency, and gain the ability to render the data in a minimalist, high-contrast dashboard (using your preferred `shadcn/ui` and Tailwind aesthetic).

Here is the complete full-stack architecture for your Generative Engine Optimization (GEO) God-Mode Dashboard.

Instead of relying on Make.com, we will build a Next.js Route Handler (`app/api/geo-audit/route.ts`). This single endpoint acts as the orchestrator: it queries the Google Search Console API twice, merges the arrays, scrapes the failing URLs, and runs the semantic chunks through the Google Cloud Natural Language API.

First, install the required server dependencies:

```
npm install googleapis @google-cloud/language cheerio
```

Create the API route: `app/api/geo-audit/route.ts`

``` ts
import { google } from 'googleapis';
import language from '@google-cloud/language';
import * as cheerio from 'cheerio';
import { NextResponse } from 'next/server';

// Initialize Google NLP Client
const nlpClient = new language.LanguageServiceClient();

// Initialize Google Search Console Client
const auth = new google.auth.GoogleAuth({
  scopes: ['https://www.googleapis.com/auth/webmasters.readonly'],
});
const searchconsole = google.searchconsole({ version: 'v1', auth });

const SITE_URL = 'https://kakunin.io'; // Replace with your verified GSC property
const TARGET_ENTITY = 'Kakunin';

export async function GET() {
  try {
    const sevenDaysAgo = new Date();
    sevenDaysAgo.setDate(sevenDaysAgo.getDate() - 7);
    const startDate = sevenDaysAgo.toISOString().split('T')[0];
    const endDate = new Date().toISOString().split('T')[0];

    // 1. Fetch Standard Web Traffic
    const webRes = await searchconsole.searchanalytics.query({
      siteUrl: SITE_URL,
      requestBody: {
        startDate,
        endDate,
        dimensions: ['page'],
        searchType: 'web',
        rowLimit: 1000,
      },
    });

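    // 2. Fetch AI Overview traffic. The 'AI_OVERVIEWS' search-appearance value follows this
    //    article; confirm the exact filter name against the current Search Console API reference.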
    const aiRes = await searchconsole.searchanalytics.query({
      siteUrl: SITE_URL,
      requestBody: {
        startDate,
        endDate,
        dimensions: ['page'],
        dimensionFilterGroups: [{
          filters: [{ dimension: 'searchAppearance', operator: 'equals', expression: 'AI_OVERVIEWS' }]
        }],
        rowLimit: 1000,
      },
    });

    // 3. Merge the Data (The URL is the Join Key)
    const webData = webRes.data.rows || [];
    const aiData = aiRes.data.rows || [];

    const aiMap = new Map(aiData.map(row => [row.keys![0], row.impressions || 0]));

    const mergedData = webData.map(row => {
      const url = row.keys![0];
      const webImpressions = row.impressions || 0;
      const aiImpressions = aiMap.get(url) || 0;
      const captureRate = webImpressions > 0 ? (aiImpressions / webImpressions) : 0;

      return { url, webImpressions, aiImpressions, captureRate };
    });

    // 4. Filter for Failing Pages (High human traffic, 0 machine traffic)
    const failingPages = mergedData.filter(page => page.webImpressions > 100 && page.aiImpressions === 0);
    const auditResults: {
      url: string;
      webImpressions: number;
      status: 'Optimized' | 'Failing';
      failingChunk: string | null;
      recommendation: string | null;
    }[] = [];

    // 5. Scrape & NLP Audit the Failing Pages
    for (const page of failingPages) {
      try {
        const response = await fetch(page.url, {
          headers: { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) GEO-Auditor/1.0' }
        });
        if (!response.ok) throw new Error(`HTTP ${response.status}`);
        const html = await response.text();
        const $ = cheerio.load(html);

        let isOptimized = true;
        let failingChunk = '';
        let recommendation = '';

        // Extract the primary H2 and paragraph block to test the RAG chunk
        const h2Text = $('h2').first().text().trim();
        const pText = $('h2').first().next('p').text().trim();

        if (h2Text && pText) {
          const chunkText = `${h2Text}. ${pText}`;
          failingChunk = chunkText;

          // Execute live calls to Google Cloud NLP
          const [entityRes] = await nlpClient.analyzeEntities({
            document: { content: chunkText, type: 'PLAIN_TEXT' }
          });
          const [syntaxRes] = await nlpClient.analyzeSyntax({
            document: { content: chunkText, type: 'PLAIN_TEXT' },
            encodingType: 'UTF8'
          });

          // Track if our target entity exists and has sufficient salience
          const targetEntityObj = entityRes.entities?.find(e => e.name?.toLowerCase() === TARGET_ENTITY.toLowerCase());
          const salience = targetEntityObj?.salience ?? 0;

          // Track passive constructions via the passive nominal subject label
          const hasPassiveVoice = syntaxRes.tokens?.some(t => t.dependencyEdge?.label === 'NSUBJPASS');

          if (!targetEntityObj || salience < 0.4) {
            isOptimized = false;
            recommendation = `Entity '${TARGET_ENTITY}' salience is too low (${salience.toFixed(2)}). Rewrite the chunk to make your brand the active subject.`;
          } else if (hasPassiveVoice) {
            isOptimized = false;
            recommendation = "Passive voice syntax detected (NSUBJPASS). Convert your sentence structures to direct active voice.";
          }
        } else {
          isOptimized = false;
          recommendation = "Missing semantic HTML structure. Ensure your landing pages use explicit H2 tags followed by paragraph text.";
        }

        auditResults.push({
          url: page.url,
          webImpressions: page.webImpressions,
          status: isOptimized ? 'Optimized' : 'Failing',
          failingChunk: isOptimized ? null : failingChunk,
          recommendation: isOptimized ? null : recommendation
        });
      } catch (e) {
        // Skip pages that cannot be fetched or analyzed, but keep auditing the rest
        console.error(`Failed to execute native cloud audit for: ${page.url}`, e);
      }
    }

    // 6. Return the payload in the { success, data } shape the dashboard reads
    return NextResponse.json({ success: true, data: auditResults });
  } catch (error) {
    console.error('GEO audit failed', error);
    return NextResponse.json({ success: false, error: 'GEO audit failed' }, { status: 500 });
  }
}
```

To maintain a high-contrast, minimalist, and professional aesthetic, we will build a client component that fetches the API route and renders the data using standard utility classes that mimic `shadcn/ui` structures (clean white cards, subtle gray borders, and strict typography).

Create the dashboard page: `app/dashboard/geo/page.tsx`

``` tsx
'use client';

import { useEffect, useState } from 'react';

// Define the shape of our API response
interface AuditResult {
  url: string;
  webImpressions: number;
  status: 'Optimized' | 'Failing';
  failingChunk: string | null;
  recommendation: string | null;
}

export default function GeoDashboard() {
  const [data, setData] = useState<AuditResult[]>([]);
  const [loading, setLoading] = useState(true);

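  useEffect(() => {
    async function fetchAudit() {
      try {
        const res = await fetch('/api/geo-audit');
        const json = await res.json();
        if (json.success) {
          setData(json.data);
        }
      } catch (err) {
        // Keep the empty table if the audit endpoint is unreachable or returns invalid JSON
        console.error('Failed to load GEO audit', err);
      } finally {
        // Always clear the loading state, even when the request fails
        setLoading(false);
      }
    }
    fetchAudit();
  }, []);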

  return (
    <div className="min-h-screen bg-neutral-50 text-slate-900 p-8 font-sans">
      <div className="max-w-6xl mx-auto">

        {/* Header Section */}
        <header className="mb-10">
          <h1 className="text-3xl font-semibold tracking-tight">Generative Engine Optimization</h1>
          <p className="text-slate-500 mt-2">
            Monitoring the semantic vectors and AI Overview capture rates of your highest-traffic pages.
          </p>
        </header>

        {/* Dashboard Card */}
        <div className="bg-white border border-slate-200 rounded-xl shadow-sm overflow-hidden">
          {loading ? (
            <div className="p-12 text-center text-slate-400">Running NLP Semantic Vector Audit...</div>
          ) : (
            <table className="w-full text-left border-collapse">
              <thead>
                <tr className="border-b border-slate-100 bg-slate-50/50 text-sm font-medium text-slate-500">
                  <th className="p-4 pl-6">Landing Page URL</th>
                  <th className="p-4">Web Impressions</th>
                  <th className="p-4">GEO Status</th>
                </tr>
              </thead>
              <tbody className="divide-y divide-slate-100 text-sm">
                {data.map((row, index) => (
                  <tr key={index} className="hover:bg-slate-50 transition-colors">
                    <td className="p-4 pl-6 align-top">
                      <a href={row.url} className="font-medium text-indigo-600 hover:underline">
                        {row.url.replace('https://kakunin.io', '')}
                      </a>

                      {/* Inline Error Reporting for Failing Pages */}
                      {row.status === 'Failing' && row.failingChunk && (
                        <div className="mt-3 p-3 bg-red-50 border border-red-100 rounded-md">
                          <p className="text-xs font-semibold text-red-800 mb-1">Syntactic Breakdown:</p>
                          <p className="text-xs text-red-600 italic">"{row.failingChunk}"</p>
                          <p className="text-xs text-slate-600 mt-2 font-medium">↳ Action: {row.recommendation}</p>
                        </div>
                      )}
                    </td>
                    <td className="p-4 align-top tabular-nums text-slate-600">
                      {row.webImpressions.toLocaleString()}
                    </td>
                    <td className="p-4 align-top">
                      <span className={`inline-flex items-center px-2.5 py-0.5 rounded-full text-xs font-medium ${
                        row.status === 'Optimized' 
                          ? 'bg-emerald-100 text-emerald-800' 
                          : 'bg-red-100 text-red-800'
                      }`}>
                        {row.status}
                      </span>
                    </td>
                  </tr>
                ))}
                {data.length === 0 && (
                  <tr>
                    <td colSpan={3} className="p-8 text-center text-slate-500">
                      All high-traffic pages are passing the AI capture threshold.
                    </td>
                  </tr>
                )}
              </tbody>
            </table>
          )}
        </div>

      </div>
    </div>
  );
}
```

- In production, move the work done by the `GET` route into a background Cron Job using Vercel Cron Jobs (declared in the `crons` section of `vercel.json`). The cron job runs every Monday at 2:00 AM, writes the JSON payload to a lightweight Vercel Postgres or Redis instance, and the React component simply renders the cached database payload.
- Cheerio only parses the HTML the server returns, so pages that render their content client-side may come back without any `<h2>` tags. In that case, you would need to swap Cheerio for a headless browser instance using `puppeteer-core`.

This Next.js architecture completely controls your data pipeline. It identifies the gap between your human traffic and machine-readability, audits the syntax exactly as Gemini would, and presents the workflow in a strictly professional interface.

Because GSC obscures AI Overview data, the only way to truly test your GEO strategy is to build a synthetic testing environment. You must use an LLM to read your live site and see what it concludes.

Instead of waiting for Google's crawlers, you build an automated pipeline that asks Gemini questions about your specific niche.

- Set the model temperature to `0.0` (to force strict, factual retrieval rather than creative generation).

It is thrilling when this architecture finally clicks. We have covered how Google recognizes your existence (Knowledge Graph) and how it reads your syntax (Cloud Natural Language).

Now, we must measure the final output: **How does the machine actually answer a human question?**

You cannot wait for Google Search Console to slowly trickle in "AI Overview" data. To actively engineer your Generative Engine Optimization (GEO) strategy, you must build a synthetic testing environment. You do this by plugging directly into the **Gemini API** (or Vertex AI for enterprise endpoints) and turning on a specific feature: **Google Search Grounding.**

Here is the fine-grained breakdown of the data points exposed by the Gemini API and how to weaponize them for your dashboard.

When you send a prompt to the Gemini API with the `GoogleSearch` tool enabled, you are not just asking an LLM to guess an answer. You are forcing the model to query the live Google Search index, extract factual chunks, and synthesize a cited response.

The API returns a standard text response, but hidden inside the JSON payload is a critical object called `groundingMetadata`. This is the absolute goldmine for AEO.

Here are the specific data points exposed inside `groundingMetadata`:

| Data Point | What it means | The GEO Value |
|---|---|---|
| `webSearchQueries` | An array of the exact search terms the LLM generated to fact-check your prompt. | **Query Expansion.** If you ask Gemini "Best MiCA compliance tools," and its internal `webSearchQuery` is "Enterprise AI governance software EU," you instantly know the exact semantic entities the machine associates with your product category. |
| `groundingChunks.web.uri` | The exact URLs the LLM scraped to generate the answer. | **The Citation Leaderboard.** This tells you definitively who the LLM trusts. If your URL is not in this array, your Entity Salience (from our previous step) is too low. |
| `groundingChunks.web.title` | The `<title>` tag of the cited webpage. | **Snippet Optimization.** Proves exactly which page titles are enticing the RAG engine to extract data. |
| `groundingSupports.segment` | The exact sentence in the LLM's generated response that corresponds to a specific chunk. | **Factual Mapping.** It mathematically maps which competitor's website is responsible for feeding which specific claim to the LLM. |

When you query the API, the metadata block looks exactly like this. This is the raw data your Next.js dashboard will parse:

```
"groundingMetadata": {
  "webSearchQueries": [
    "Kakunin MiCA AI compliance",
    "EU AI Act software solutions"
  ],
  "groundingChunks": [
    {
      "web": {
        "uri": "https://kakunin.io/docs/mica-framework",
        "title": "Automating MiCA Compliance | Kakunin Docs"
      }
    },
    {
      "web": {
        "uri": "https://techcrunch.com/2026/01/ai-regulation",
        "title": "How startups are navigating EU AI Rules"
      }
    }
  ],
  "groundingSupports": [
    {
      "segment": {
        "startIndex": 0,
        "endIndex": 85,
        "text": "Kakunin is an enterprise software platform that automates MiCA compliance for AI agents."
      },
      "groundingChunkIndices": [0]
    }
  ]
}
```
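The "Factual Mapping" idea from the table can be sketched directly against this structure: every entry in `groundingSupports` points, through `groundingChunkIndices`, at positions in `groundingChunks`. The helper below resolves those indices into URLs. `mapClaimsToSources` is an illustrative name, and the sample object is a trimmed copy of the metadata block shown above.

```javascript
// Resolve each supported claim to the URLs of the chunks that back it.
function mapClaimsToSources(metadata) {
  const chunks = metadata.groundingChunks ?? [];
  return (metadata.groundingSupports ?? []).map((support) => ({
    claim: support.segment?.text ?? "",
    sources: (support.groundingChunkIndices ?? [])
      .map((index) => chunks[index]?.web?.uri)
      .filter(Boolean), // drop indices that do not resolve to a web chunk
  }));
}

// Trimmed copy of the metadata block from this section.
const sampleMetadata = {
  groundingChunks: [
    { web: { uri: "https://kakunin.io/docs/mica-framework", title: "Automating MiCA Compliance | Kakunin Docs" } },
    { web: { uri: "https://techcrunch.com/2026/01/ai-regulation", title: "How startups are navigating EU AI Rules" } },
  ],
  groundingSupports: [
    {
      segment: { text: "Kakunin is an enterprise software platform that automates MiCA compliance for AI agents." },
      groundingChunkIndices: [0],
    },
  ],
};

console.log(mapClaimsToSources(sampleMetadata));
```

Grouping the output by source URL then shows which site is feeding which sentence of the generated answer.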

In traditional SEO, you use tools like Ahrefs to track your "Share of Voice" (how many keywords you rank for compared to competitors).

In the AEO era, you use the Gemini API to track your **"Share of Model"** (how often an LLM cites your architecture as the definitive source of truth).

Here is the exact enrichment play you build into your Next.js application:

1. Send every question in your prompt matrix to the Gemini API, making sure `tools=[Tool(google_search=GoogleSearch())]` is passed in the request.
2. For each response, collect the cited URLs from the `groundingMetadata.groundingChunks.web.uri` array.

**The Result:** You now have a real-time, deterministic dashboard showing that out of 50 industry questions, Gemini cited Kakunin.io 14 times, cited Wikipedia 22 times, and cited your biggest competitor 31 times. You now know exactly where you stand in the machine's hierarchy of trust.

Testing with the Gemini API introduces a few strict architectural constraints that differ from standard web APIs:

- LLMs have a `temperature` setting that dictates creativity. If you set it to `0.0`, the model becomes rigid and highly deterministic (ideal for strict RAG testing on your own internal documents). However, Google's 2026 documentation specifically states that when using Google Search grounding, the temperature should stay at `1.0` for the algorithm to properly fan out and fetch live search results.
- Appearing in the `groundingChunks` array does not mean the LLM actually said something positive about you. It just means it used your page as a source. Check the `sentiment.score` of the generated text to ensure the LLM isn't extracting your URL merely to criticize your pricing model.

This completely closes the loop. You track what humans search (GSC), how your code runs (Cloud Monitoring), how the machine reads your syntax (Cloud NLP), and finally, how the machine regurgitates your facts (Gemini Grounding API).

This is the exact script you need to build your "Share of Model" tracking dashboard.

Google recently released their official `@google/genai` SDK, which streamlines how we enable Google Search Grounding and extract the metadata payload.

Here is the complete Node.js script to run your automated prompt matrix, extract the machine's citations, and mathematically calculate your Share of Model against your competitors.

Initialize your project and install the official Google Gen AI SDK.

```
npm init -y
npm install @google/genai
```

Environment variables used by the scripts in this pipeline:

`GOOGLE_APPLICATION_CREDENTIALS="./gcp-service-account.json"`

`GEMINI_API_KEY="AIzaSy..."`

Set your Gemini API key as an environment variable in your terminal:

```
export GEMINI_API_KEY="your_api_key_here"
```

Save this file as `gemini-tracker.js`. This script runs a batch of questions, forces the LLM to search the web, extracts the URLs the machine trusts, and builds a citation leaderboard.

``` js
import { GoogleGenAI } from '@google/genai';

// Initialize the official Gemini SDK
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// The entities you want to track in the leaderboard
const TARGET_DOMAIN = "kakunin.io";
const COMPETITOR_DOMAINS = ["techcrunch.com", "ibm.com", "wikipedia.org"];

// Your Prompt Matrix (The questions your target audience asks)
const promptMatrix = [
  "What are the best software platforms for automating MiCA AI compliance?",
  "Compare enterprise AI governance tools for EU regulations.",
  "How do developers ensure data retention compliance under MiCA?"
];

async function runShareOfModelTracker() {
  console.log(`\n🚀 INITIATING GEMINI CITATION TRACKER`);
  console.log(`Tracking citations for ${promptMatrix.length} queries...\n`);

  // Initialize our leaderboard scoreboard
  const scoreboard = {
    [TARGET_DOMAIN]: 0,
    "Other/Competitors": 0
  };
  COMPETITOR_DOMAINS.forEach(domain => scoreboard[domain] = 0);

  // Iterate through the Prompt Matrix
  for (const [index, prompt] of promptMatrix.entries()) {
    console.log(`\n[Query ${index + 1}/${promptMatrix.length}]: "${prompt}"`);

    try {
      // Ping Gemini with Google Search Grounding enabled
      const response = await ai.models.generateContent({
        model: "gemini-3.5-flash", // Use the latest flash model for fast/cheap RAG extraction
        contents: prompt,
        config: {
          // Temperature 1.0 is required for optimal Google Search fanning
          temperature: 1.0, 
          // This is the trigger that turns on Answer Engine features
          tools: [{ googleSearch: {} }] 
        }
      });

      // Navigate the JSON payload to extract the Grounding Metadata
      // Optional chaining guards against a response with no candidates
      const metadata = response.candidates?.[0]?.groundingMetadata;

      if (!metadata?.groundingChunks) {
        console.log("   ⚠️ No web citations found for this query.");
        continue;
      }

      // Log the internal queries Gemini generated to find the answer
      if (metadata?.webSearchQueries) {
        console.log(`   🔍 Internal LLM Searches: [${metadata.webSearchQueries.join(", ")}]`);
      }

      console.log(`   🔗 URLs Cited by Gemini:`);

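      // Analyze every source the LLM extracted a fact from
      for (const chunk of metadata.groundingChunks) {
        if (!chunk.web?.uri) continue;

        const citedUrl = chunk.web.uri;
        // Grounding URIs are often Google redirect links, so the source domain
        // may only appear in the chunk title. Match against both fields.
        const citedSource = `${citedUrl} ${chunk.web.title ?? ""}`.toLowerCase();
        console.log(`      - ${chunk.web.title ?? "(untitled)"}: ${citedUrl}`);

        // Update the Share of Model Scoreboard
        let matched = false;

        if (citedSource.includes(TARGET_DOMAIN)) {
          scoreboard[TARGET_DOMAIN]++;
          matched = true;
        } else {
          for (const competitor of COMPETITOR_DOMAINS) {
            if (citedSource.includes(competitor)) {
              scoreboard[competitor]++;
              matched = true;
              break;
            }
          }
        }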

        // Group all other citations into the generic bucket
        if (!matched) {
          scoreboard["Other/Competitors"]++;
        }
      }

    } catch (error) {
      console.error(`   ❌ API Error on query:`, error.message);
    }
  }

  // Calculate and Print the Final Share of Model Leaderboard
  console.log(`\n======================================================`);
  console.log(`📊 FINAL "SHARE OF MODEL" LEADERBOARD`);
  console.log(`======================================================`);

  // Calculate total citations to generate percentages
  const totalCitations = Object.values(scoreboard).reduce((a, b) => a + b, 0);

  if (totalCitations === 0) {
    console.log("No citations extracted across the prompt matrix.");
    return;
  }

  // Sort the scoreboard from highest citations to lowest
  const sortedLeaderboard = Object.entries(scoreboard).sort((a, b) => b[1] - a[1]);

  sortedLeaderboard.forEach(([domain, count]) => {
    const percentage = ((count / totalCitations) * 100).toFixed(1);
    const label = domain === TARGET_DOMAIN ? `🎯 ${domain} (YOU)` : `   ${domain}`;
    console.log(`${label.padEnd(25)} | ${count} citations (${percentage}%)`);
  });
  console.log(`======================================================\n`);
}

// Execute the tracker
runShareOfModelTracker();
```
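The tracker's `includes` check treats any string that contains the domain as a match, so a source such as `notkakunin.io` would be counted toward `kakunin.io`. A stricter alternative is to compare hostnames. This is a minimal sketch under that assumption; `toHostname` and `matchesDomain` are hypothetical helper names, not part of the SDK.

```js
// Sketch: match a cited source against a tracked domain by hostname suffix
// instead of a raw substring check.
function toHostname(source) {
  const trimmed = source.trim().toLowerCase();
  try {
    // Bare domains such as "wikipedia.org" have no scheme, so add one for parsing
    const url = new URL(trimmed.includes("://") ? trimmed : `https://${trimmed}`);
    return url.hostname.replace(/^www\./, "");
  } catch {
    return ""; // Unparseable input never matches
  }
}

function matchesDomain(source, domain) {
  const host = toHostname(source);
  const target = domain.toLowerCase();
  // Exact match, or a subdomain of the tracked domain
  return host === target || host.endsWith(`.${target}`);
}

console.log(matchesDomain("https://en.wikipedia.org/wiki/MiCA", "wikipedia.org")); // true
console.log(matchesDomain("kakunin.io", "kakunin.io"));                            // true
console.log(matchesDomain("https://notkakunin.io/pricing", "kakunin.io"));         // false
```

Swapping this in for the substring test accepts either a full URL or a bare domain as the source string, at the cost of a few extra lines.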

Run the script directly in your terminal:

```
node gemini-tracker.js
```

**The Console Output:**

When the script finishes, you will see a leaderboard that looks like this:

```
======================================================
📊 FINAL "SHARE OF MODEL" LEADERBOARD
======================================================
   wikipedia.org          | 12 citations (40.0%)
🎯 kakunin.io (YOU)       | 8 citations (26.7%)
   Other/Competitors      | 7 citations (23.3%)
   techcrunch.com         | 3 citations (10.0%)
   ibm.com                | 0 citations (0.0%)
======================================================
```
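To chart this leaderboard over time, each run needs to be reduced to a dated record. The following is a minimal sketch of that step, using the counts from the sample output above. `toSnapshot` is a hypothetical helper, and where the record is stored (a database table, a JSON file) is left open.

```js
// Sketch: convert one run's scoreboard into a dated snapshot row with
// percentage shares, ready to store in whatever database you use.
function toSnapshot(scoreboard, runDate) {
  const total = Object.values(scoreboard).reduce((sum, n) => sum + n, 0);
  const shares = {};
  for (const [domain, count] of Object.entries(scoreboard)) {
    // Guard against division by zero when no citations were found
    shares[domain] = total === 0 ? 0 : Number(((count / total) * 100).toFixed(1));
  }
  return { date: runDate.toISOString().slice(0, 10), totalCitations: total, shares };
}

// Counts taken from the sample leaderboard above
const snapshot = toSnapshot(
  { "wikipedia.org": 12, "kakunin.io": 8, "Other/Competitors": 7, "techcrunch.com": 3, "ibm.com": 0 },
  new Date("2026-06-06T00:00:00Z")
);
console.log(JSON.stringify(snapshot));
```

Storing one such row per run keeps the raw total alongside the percentages, so a week with very few citations is easy to spot before it distorts the trend line.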

This script moves you from reactive SEO to proactive GEO.

- If `techcrunch.com` is outscoring you on queries about your own proprietary features, inspect their page with the Cloud Natural Language API. You will likely find their Subject-Verb structures are tighter than yours. Update your landing page to match.
- Study the `Internal LLM Searches` printed in the console. If you prompt Gemini about "AI Compliance," but it natively translates that prompt into an internal search for "LLM bias mitigation frameworks," you now have the exact entity language you need to inject into your `<h2>` tags.
- Store the `scoreboard` payload in a database every week and render a line chart showing your "Share of Model" growing over time as your optimization efforts compound.

In the traditional dashboard we built earlier, the ultimate metric was `GA4 sessions`. In the AEO/GEO era, you must prepare executives for the **Zero-Click Reality**.

If you optimize perfectly for Answer Engines, Google will extract your factual payload (e.g., Kakunin's pricing tiers) and display it directly in the AI Overview. The user gets their answer and never clicks your link.

Your GSC `impressions` will skyrocket, but your GA4 `sessions` will plummet. If you do not decouple "Brand Visibility" (Impressions + LLM Citations) from "Traffic Acquisition" (Clicks + Sessions) in your reporting architecture, perfect GEO execution will look like a catastrophic traffic failure on your dashboard.
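One way to express that decoupling in a reporting layer is to compute period-over-period change for each group separately. This is a minimal sketch, not a prescribed schema: `percentChange` and `buildDecoupledReport` are hypothetical helpers, and the input numbers are invented purely for illustration.

```js
// Sketch: report "Brand Visibility" and "Traffic Acquisition" as separate
// groups so a drop in sessions cannot mask a rise in visibility.
function percentChange(previous, current) {
  if (previous === 0) return null; // Change is undefined without a baseline
  return Number((((current - previous) / previous) * 100).toFixed(1));
}

function buildDecoupledReport(previous, current) {
  return {
    brandVisibility: {
      impressions: percentChange(previous.impressions, current.impressions),
      llmCitations: percentChange(previous.llmCitations, current.llmCitations)
    },
    trafficAcquisition: {
      clicks: percentChange(previous.clicks, current.clicks),
      sessions: percentChange(previous.sessions, current.sessions)
    }
  };
}

// Hypothetical numbers, invented purely for illustration
const report = buildDecoupledReport(
  { impressions: 10000, llmCitations: 8, clicks: 500, sessions: 450 },
  { impressions: 15000, llmCitations: 14, clicks: 400, sessions: 360 }
);
console.log(report);
```

With these inputs the visibility group shows growth while the traffic group shows a decline, which is the pattern the dashboard should present as two separate stories.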

By unifying client-side visibility data with server-side natural language audits and synthetic LLM simulations, developers can move from blindly chasing legacy keywords to systematically commanding their brand's narrative across the entire generative web ecosystem.