Zero Frontend Code: Deploying Interactive A2UI Agents Directly to Gemini Enterprise

wpnews.pro

Traditionally, application user interfaces are static and deterministic. As developers, we anticipate every possible user flow, hardcode the layout, map specific data fields to fixed UI components (like cards, tables, or forms), and deploy it.

If a user asks a complex question that requires a blend of different data sources, a traditional UI forces them to navigate multiple tabs or forces the developer to write custom code for that specific layout.

*Ex of Traditional Restaurant Search :*Imagine you want to find a restaurant for a team dinner. You need to see:

In a traditional application, you have to click a ** search** button to see a rigid grid of cards. If you then say,

Generative UI (GenUI) is a paradigm where the user interface is generated, assembled, or customized ** at runtime by an AI agent** based on specific context of the user’s request. Instead of rendering static data into a hardcoded layout, the LLM determines

Ex of Generative UI Restaurant Search:

Autonomy given to the AI agent to construct the user interface exists along a spectrum, moving from highly predictable to completely fluid.

In this model, developers maintain full control over the visual presentation and design system. Developers build a predefined library of rich UI components (Ex *ShowRestaurantCard, **ReservationForm, *MapView). AI agent's job is simply to execute the backend tool, get the structured data, and pick the single best component from the library to display it.

Ex: You ask to book a table at “Pasta Paradiso”.

Agent calls booking API, receives a JSON payload, and explicitly selects ReservationForm component, passing restaurant ID and available times as props. Agent cannot modify the form's layout; it can only choose to show it.

Here, developers build smaller, highly modular, lego-like building blocks (atomic components like Container, Button, Text, Image, Grid). Instead of selecting a monolithic component, agent receives a catalog of these atomic building blocks and generates a declarative layout schema (often JSON-based) at runtime to assemble a custom component on the fly.

Ex: You ask for a quick summary of a restaurant’s vibe.

Agent decides to build a custom dashboard. It writes a declarative UI schema that puts an Image on the left, a VerticalStack of Text elements (for reviews) on the right, and drops a custom action Button labeled "Get Directions" at the bottom. The specific layout didn't exist until the agent declared it

This is the most autonomous layer. The agent is not restricted to pre-built components or schemas. It can either embed sandboxed third-party widgets dynamically or write raw UI code (Ex: React/HTML/Tailwind) on the fly, which is then rendered inside a safe canvas or sandbox in the application.

Ex: You say, “I have a budget of $500 for 10 people. Plot a simulation of how our total cost changes if we add cocktails vs. appetizers at these three restaurants.” Agent writes a custom JavaScript/Recharts snippet at runtime to render a fully interactive, custom simulation chart directly in your chat interface.

How GenUI embeds itself into end-user experience dictates the application surface.

UI components are injected directly into a chronological, linear conversational stream. The components behave like rich, interactive text messages.

Ex: In a standard chat interface, you ask for a menu.

Agent replies with text, and right underneath the text, an interactive, swipeable product carousel of dishes appears directly inside the chat bubble stream

A split-screen or multi-panel interface. The left side typically handles the conversation (the text/intent gathering), while the right side is a dedicated, persistent canvas where the Generative UI is rendered and mutated.

Ex:You are planning a food tour. You talk to the agent in the left panel. On the right panel, the agent dynamically generates and continuously updates a custom interactive map itinerary and timeline workspace. As you talk, the canvas on the right updates cleanly without cluttering your chat history.

There is no visible chat box or prompt input field. The user interacts with traditional UI inputs (buttons, toggles, standard search bars), but the application backend uses an agent to dynamically synthesize and render the presentation layer on the fly.

Ex: You open the *Restaurant Finder app *and click a single button: “Surprise me for lunch.”

The app runs an agentic workflow in the background considering your location, historical preferences, and time of day, and completely constructs a tailored home page layout specifically for that moment.

When implementing GenUI with frameworks like A2UI and GCP ADK, execution generally falls into these three archetypes

In this approach, the agent has zero knowledge of UI rendering. The agent simply executes a tool, fetches structured data (JSON), and returns it. The frontend codebase owns the layout and maps the data to a pre-built component.Backend — ADK Tool Definition

from google.cloud import agent_development_kit as adk# The agent simply provides structured data@adk.tooldef get_restaurant_details(restaurant_id: str) -> dict:    """Fetches details for a specific restaurant to display to the user."""    # Mocking database response    return {        "id": restaurant_id,        "name": "Pasta Paradiso",        "cuisine": "Italian",        "rating": 4.8,        "address": "123 Milan Way"    }

Frontend — HTML / Client-Side RenderingFrontend receives the tool output, identifies the context, and passes the raw data into a predefined, hardcoded template or component.

<div id="ui-container">  <template id="restaurant-card-template">    <div class="card border rounded-lg p-4 shadow">      <h3 class="text-xl font-bold text-italian-red" id="res-name"></h3>      <p class="text-gray-600" id="res-cuisine"></p>      <div class="text-yellow-500" id="res-rating"></div>      <p class="text-sm text-gray-500" id="res-address"></p>    </div>  </template></div><script>  // Frontend logic that listens to the ADK agent's tool response  function renderStaticUI(toolOutput) {    const template = document.getElementById('restaurant-card-template').content.cloneNode(true);        template.getElementById('res-name').textContent = toolOutput.name;    template.getElementById('res-cuisine').textContent = toolOutput.cuisine + " Cuisine";    template.getElementById('res-rating').textContent = "⭐ " + toolOutput.rating;    template.getElementById('res-address').textContent = toolOutput.address;        document.getElementById('ui-container').appendChild(template);  }</script>

Here, agent chooses how to structure the layout. It uses a standardized JSON UI schema (like A2UI design tokens or Adaptive Cards structural primitives) to compose the interface layout dynamically based on the user’s specific context.Backend — ADK Tool returning UI Schema

from google.cloud import agent_development_kit as adk@adk.tooldef generate_restaurant_summary_layout(restaurant_id: str) -> dict:    """Generates a tailored UI composition schema for a restaurant summary."""    # Agent dynamically constructs a layout tree using atomic blocks    return {        "componentType": "FlexContainer",        "properties": {"direction": "column", "gap": "12px", "padding": "16px"},        "children": [            {                "componentType": "Heading",                "properties": {"text": "Pasta Paradiso", "level": 2}            },            {                "componentType": "BadgeList",                "properties": {"items": ["Italian", "Top Rated", "Romantic"]}            },            {                "componentType": "ActionButton",                "properties": {"label": "Book Table Now", "actionUrl": f"/book/{restaurant_id}"}            }        ]    }

Frontend — HTML / A2UI Engine Parser

Frontend doesn’t know what components are arriving; it simply hosts a generic runtime engine (a2ui-renderer) that reads the atomic schema items and renders their corresponding HTML blocks dynamically.

<div id="dynamic-canvas"></div><script>  // A2UI-style component mapper engine  const componentRegistry = {    FlexContainer: (props, childrenHtml) => `<div class="flex flex-${props.direction} gap-3 p-4 border rounded" style="gap: ${props.gap}">${childrenHtml}</div>`,    Heading: (props) => `<h${props.level} class="text-2xl font-extrabold">${props.text}</h${props.level}>`,    BadgeList: (props) => `<div class="flex gap-1">${props.items.map(i => `<span class="bg-blue-100 text-blue-800 text-xs px-2 py-1 rounded">${i}</span>`).join('')}</div>`,    ActionButton: (props) => `<button class="bg-blue-600 text-white font-medium py-2 px-4 rounded" onclick="location.href='${props.actionUrl}'">${props.label}</button>`  };  function renderDeclarativeUI(schema) {    const canvas = document.getElementById('dynamic-canvas');    canvas.innerHTML = parseSchemaNode(schema);  }  function parseSchemaNode(node) {    const builder = componentRegistry[node.componentType];    if (!builder) return '';        const childrenHtml = node.children ? node.children.map(parseSchemaNode).join('') : '';    return builder(node.properties, childrenHtml);  }</script>

In this scenario, the agent has absolute creative freedom. It writes native HTML/CSS/JS or framework source code on the fly. The application renders this code within a secure container (like a sandboxed iframe or shadow DOM context).Backend — ADK Agent Native Generation

Frontend — HTML / Sandboxed Target Container

The client receives raw, unvetted executable code string from the LLM agent and must isolate it carefully to protect system security while providing a sandbox layer for execution.

<div class="workspace-panel">  <div id="sandbox-container" class="w-full h-auto"></div></div><script>  function renderOpenEndedUI(rawHtmlString) {    const container = document.getElementById('sandbox-container');    container.innerHTML = ''; // Clear prior view    // Utilizing an iframe sandbox or shadow root to evaluate the agent's code securely    const iframe = document.createElement('iframe');    iframe.sandbox = "allow-scripts"; // Only allow execution inside its own closure    iframe.style.width = "100%";    iframe.style.border = "none";        container.appendChild(iframe);    // Inject Tailwind CSS into the iframe context alongside the agent generated string    const targetDoc = iframe.contentWindow.document;    targetDoc.open();    targetDoc.write(`      <head>        <script src="https://cdn.jsdelivr.net/npm/@tailwindcss/browser@4"></script>      </head>      <body class="bg-transparent m-0 p-2">        ${rawHtmlString}      </body>    `);    targetDoc.close();  }</script>

The landscape of Generative UI uses specific open-source specifications, runtime protocols, and orchestration ecosystems to create dynamic experiences.

Let’s demystify A2UI, AG-UI, MCP-UI / MCP Apps, and Open-JSON-UI, clarifying how they relate and when to deploy them.

Clarifying the Confusion: A2UI vs. AG-UI Because they both have

Think of A2UI as the blueprint (the data specification) and AG-UI as the construction transport truck and plumbing (the runtime environment).

As Generative UI matures, different standardization paths have emerged based on who controls the component rendering engine.

Choosing the right combination depends on where your agent’s tools live and how tightly you need to lock down your application’s user experience.

Scenario A: Build with AG-UI + A2UI / Open-JSON-UI

Scenario B: Build with MCP Apps / MCP-UI

A2A (Agent-to-Agent Protocol) is a secure, open-source communication and messaging layer managed under the Linux Foundation. It defines how autonomous AI agents discover each other, pass credentials, collaborate, and exchange tasks across different frameworks or cloud infrastructures.

While A2UI dictates what the visual interface components and application state look like (via JSON), A2A handles the how — acting as the secure network pipeline carrying those conversational and data blocks between multiple remote agents and the client application.

Think of them as two parts of an automated postal system:

How A2A and A2UI Work Together In an enterprise configuration of our

Here is how the A2A messaging protocol acts as the vehicle for A2UI components when a user types: “Find an open table for 2 at a top romantic spot.”

Step 1: Initial Ingestion & Discovery User prompts the central orchestrator (the

Step 2: Cross-Agent A2A Secure Handshake Concierge agent launches an asynchronous connection task to a specialized external vendor tool —

{  "jsonrpc": "2.0",  "method": "tasks/stream",  "params": {    "taskId": "task_find_romantic_99",    "prompt": "Find top romantic spots matching location context",    "securityToken": "Bearer oauth_a2a_token_xyz"  }}

Step 3: Carrying the A2UI Design Block Discovery Agent processes the query, finds

{  "a2a_message_id": "msg_004",  "parts": [    {      "mimeType": "application/vnd.a2ui.surfaceUpdate+json",      "body": {        "surfaceId": "results-canvas",        "components": [          { "id": "card-1", "type": "RestaurantCard", "props": { "title": "Pasta Paradiso", "rating": 4.9, "actionId": "request_booking" } }        ]      }    }  ]}

Step 4: Routing Action Events Back (Bidirectional Loop) Client app receives this payload and paints the custom Lit layout container onto the user’s screen.

When the user clicks the card’s native confirm option, the client app creates an A2UI userAction payload. The local orchestrator encapsulates that event execution block inside an A2A message request and ships it to yet another specialized machine endpoint—the Table Booking Engine Agent—to securely fulfill the transactional reservation loop.

A2UI (Agent to UI) is a declarative UI protocol for agent-driven interfaces.** **A2UI protocol solves a core architectural challenge: How can an AI agent safely render custom, highly interactive user interfaces inside a client application without executing dangerous, arbitrary code (like raw HTML or React)?

A2UI approaches this by defining a strict, native-first declarative format. The agent sends structure and data over the wire via flat JSON, while the client application maintains full control over the design system, styling, and security.

Let’s break down the foundational concepts and data flows of A2UI

Let’s break down these core architecture segments and about the lifecycle mappings.

This represents the actual execution lifecycle of a single user request moving from the browser, through the network pipelines, into the LLM, and back.

Ex: “Show me restaurants near me”

Let’s look at how the data moves through the architecture during a reservation lifecycle. Ex:** “Book a table for 2 at Pasta Paradiso”.**

User types: "Book a table for 2 tomorrow at Pasta Paradiso" inside the application window.

Frontend wrapper captures the text stream. Along with this prompt, the client sends a baseline metadata block detailing its A2UI Version capabilities and available Catalog specifications. This ensures that the agent backend doesn’t output components that the client doesn’t know how to build.

Python-based GCP ADK backend processes the request via Gemini. Gemini recognizes that booking a table requires user configurations (date, time selection) and returns a structural layout intent rather than plain markdown text.

The server signals that it wants to create a dedicated interactive space by emitting an initial initialization packet. In the A2UI v0.9 spec, this is wrapped as a createSurface (or beginRendering) message type:

{  "version": "1.0",  "createSurface": {    "surfaceId": "booking-surface-45",    "catalogId": "basic-restaurant-ui",    "layout": "modal"  }}

Instead of forcing the client to wait for a massive JSON layout tree block, A2UI leverages resilient streaming. The agent emits a surfaceUpdate message that delivers a flat block array mapping structural element relationships alongside an independent state payload (dataModelUpdate).

{  "version": "1.0",  "surfaceUpdate": {    "surfaceId": "booking-surface-45",    "components": [      {        "id": "root-container",        "type": "Column",        "props": { "gap": "16px" }      },      {        "id": "date-picker-node",        "type": "DateTimeInput",        "props": {          "label": "Select Booking Date",          "value": { "path": "/reservation/date" }        }      },      {        "id": "submit-booking-btn",        "type": "Button",        "props": {          "label": "Confirm Reservation",          "actionId": "trigger_reservation_execution"        }      }    ]  }}

Notice the separation of concerns:DateTimeInput component doesn't hardcode a text value. Its value attribute points directly to a reactive path target:{ "path": "/reservation/date" }.

Simultaneously, the agent seeds the initialization state of that form via a discrete state update package:

{  "version": "1.0",  "dataModelUpdate": {    "surfaceId": "booking-surface-45",    "path": "/reservation",    "values": {      "date": "2026-06-14T19:00:00Z",      "partySize": 2    }  }}

Frontend client’s A2UI Web-Core Parsing Engine intercepts this stream.

The architecture is inherently bidirectional. If the user notices tomorrow is a Monday and clicks the input box to switch the date to next Friday instead, the client library updates the state model.

When the user finally clicks the “Confirm Reservation” button, the client blocks local interactions and fires a strict userAction structured event back to the GCP ADK server over the transport pipeline (WebSockets / SSE):

{  "version": "1.0",  "userAction": {    "surfaceId": "booking-surface-45",    "actionId": "trigger_reservation_execution",    "context": {      "currentDataModel": {        "reservation": {          "date": "2026-06-19T19:30:00Z",          "partySize": 2        }      }    }  }}

The backend agent receives this data payload, executes the booking API call successfully, and resolves the loop. To clean up the layout environment, it emits a final optimization event instructing the frontend to clean up the workspace canvas

{  "version": "1.0",  "deleteSurface": {    "surfaceId": "booking-surface-45"  }}

The modal dismisses itself instantly, and the user receives a final, text-based confirmation card detailing their success. No unverified third-party scripts ever touched the layout canvas.

The underlying core architecture of A2UI relies on a distinct rule: Disconnect the Generation of UI from the Execution of UI.

Why this matters for security:An attacker cannot execute a Cross-Site Scripting (XSS) payload via the LLM because the agent cannot send raw<script> tags. The agent canonlypass variable strings to pre-compiled properties that the client application has already registered and vetted.

Here is a practical breakdown of how you build an application using the A2UI Request Lifecycle steps from the document:

You author a master config manifest (catalog.json) mapping out every visual element your application supports, establishing its expected type schemas.

To train the LLM, you write concrete mock example snippets (e.g.,restaruants_list.json). This shows the system exact snapshots of what a correct, fully formatted A2UI structure response block looks like.

In your backend Python agent file, you import the A2UI SDK tools and instantiate an orchestration manager component to load your configuration rules:

from a2ui.core.schema.manager import A2uiSchemaManager, CatalogConfigschema_manager = A2uiSchemaManager(    catalogs=[CatalogConfig.from_path(name="restaruant_ui", catalog_path="catalog.json", examples_path="examples")])

You invoke a built-in A2UI script method to generate your instructions. The SDK reads your schema structures and example files, automatically compiling a complex system instruction block that forces LLM to output valid A2UI JSON structures:

system_instruction = schema_manager.generate_system_prompt(    role_description="You are a helpful local restaurants locator assistant.",    include_schema=True,    include_examples=True)

You hand that generated system instruction block over to your core GCP ADK LLM Agent instance configuration:

from google.adk.agents.llm_agent import LlmAgentrestaurant_agent = LlmAgent(    model="gemini-2.5-flash",    instruction=system_instruction,    tools=[find_nearest_restaurants])

You wire up your server-side response interceptor utilities (like parse_response_to_parts) inside your Cloud Run execution thread. This utility reads raw outputs from Gemini, checks them using a validator to catch any hallucinated fields, and encapsulates them into A2A message blocks.

Finally, on your frontend web app workspace codebase, you create your matching visual layout structures using Lit, Angular, or Flutter web components. You map their tags to your local widget engine, matching the exact naming definitions declared in Step 1.

Let’s look at the lifecycle of a user booking a table, detailing how the A2UI Agent and A2UI Renderer communicate step by step.

The backend agent initiates a stream using Server-Sent Events (SSE). It streams down the structural component blueprint and the initial data state.

// Packet 1: Define Components{  "type": "surfaceUpdate",  "components": [    { "id": "root", "type": "VerticalLayout" },    { "id": "title", "type": "Text", "parentId": "root", "props": { "variant": "h2" } },    { "id": "seats-input", "type": "Counter", "parentId": "root", "props": { "label": "Guests", "valuePath": "/booking/guests" } },    { "id": "submit-btn", "type": "Button", "parentId": "root", "props": { "label": "Confirm", "actionId": "submit_clicked" } }  ]}// Packet 2: Define Initial Data Model State{  "type": "dataModelUpdate",  "values": {    "/titleText": "Book at Pasta Paradiso",    "/booking/guests": 2  }}

A2UI Renderer client interceptor reads these incoming stream packets. It stores the component metadata rules locally in memory and initializes a reactive state manager for the path coordinates (/titleText and /booking/guests).

Server emits an explicit execution flag (beginRendering). This instructs the client that the structural configuration is stable and ready to display.

{ "type": "beginRendering", "surfaceId": "panel-12" }

A2UI Renderer performs an adjacency-list tree walk starting at the "id": "root" element.

User opens the counter on their screen and clicks the “+” button to change the headcount from 2 to 4.

Because of Data Binding, the client application instantly mutates the local state path /booking/guests to 4. The counter instantly renders the number 4 on-screen natively, without needing a slow roundtrip request to the backend database.

When the user clicks the “Confirm” button, the renderer blocks the form and bundles the interaction context into a structured userAction payload. It transmits this payload back to the server over a separate HTTP POST/WebSocket communication path (termed Agent-to-Agent/A2A Message):

{  "type": "userAction",  "actionId": "submit_clicked",  "surfaceId": "panel-12",  "stateSnapshot": {    "booking": { "guests": 4 }  }}

The backend GCP ADK Agent processes this action, contacts the restaurant database to verify an open table for 4, and pushes a brand-new surfaceUpdate package down the original, open SSE stream channel:

{  "type": "surfaceUpdate",  "components": [    { "id": "success-card", "type": "Alert", "props": { "status": "success", "text": "Reservation Confirmed for 4 guests!" } }  ]}

The client receives this modification, clears out the old input form, and seamlessly transitions the visual space to display the success alert block.

To fully understand why A2UI is built this way, you need to understand three core pillars of its design philosophy:

Traditional web UIs are deeply nested tree structures (HTML DOM). For an LLM, generating nested JSON structures (with matching closing brackets three layers deep) is incredibly error-prone and frequently results in broken, unparseable layouts.

A2UI solves this by forcing the LLM to output an Adjacency List (a completely flat array of items that reference each other by ID string values).

// LLM outputs a simple, flat sequence. Easy to stream token-by-token![  { "id": "my-card", "component": "Card", "props": { "child": "layout-stack" } },  { "id": "layout-stack", "component": "Column", "props": { "children": ["txt-1", "btn-1"] } },  { "id": "txt-1", "component": "Text", "props": { "value": "Confirm Reservation?" } },  { "id": "btn-1", "component": "Button", "props": { "label": "Book Now" } }]

The client-side rendering engine takes this flat list and maps the relationships together into a visual UI hierarchy instantly.

A2UI separates the visual wireframe layout from the raw data variables. The agent sends structural frames (surfaceUpdate) once, and handles changes via targeted data model updates (dataModelUpdate).

If a user types into a form field or a price changes dynamically, the backend doesn’t need to rebuild or resend the whole UI structure over the network. It simply transmits a single JSON pointer patch update:

{  "type": "dataModelUpdate",  "surfaceId": "reservation-screen",  "contents": [    { "key": "/reservation/time", "valueString": "7:30 PM" }  ]}

Backend agent is completely headless. It knows absolutely nothing about CSS, browser window viewports, tailwind styles, or operating systems. It only emits structural intents. This provides two massive architectural advantages:

Let’s build a minimal, concrete implementation of our Restaurant Finder application using GCP ADK on the backend and A2UI with a Lit-based frontend renderer.

To keep this illustration clear, we will focus strictly on the files and interfaces required to wire up the A2UI loop: displaying a restaurant card and handling a table booking.

Project Architecture Here are the key files we will build:

**Step 1: Define the Design Contract (**catalog.json) First, we define a catalog. This file tells both the backend agent and the frontend renderer exactly what components exist, what properties they accept, and where their data bindings live.

{  "catalogId": "restaurant-finder-v1",  "components": {    "VerticalStack": {      "description": "A layout container that stacks children vertically.",      "properties": { "gap": { "type": "string" } }    },    "RestaurantHeader": {      "description": "Displays the restaurant name and cuisine type.",      "properties": {        "name": { "type": "string" },        "cuisine": { "type": "string" }      }    },    "BookingForm": {      "description": "An interactive area to pick guest count and submit.",      "properties": {        "valuePath": { "type": "string", "description": "JSON pointer path for guest count state" },        "actionId": { "type": "string" }      }    }  }}

**Step 2: Write the Backend Agent (**agent.py) Using

from google.cloud import agent_development_kit as adkimport jsonagent = adk.Agent(name="RestaurantFinderAgent")@agent.actiondef show_restaurant_details(session_id: str, restaurant_id: str):    """Invoked when a user selects a restaurant. Streams the A2UI layout."""        # 1. Establish the visual canvas workspace (Surface)    adk.stream_message(session_id, {        "type": "createSurface",        "surfaceId": "details-panel",        "catalogId": "restaurant-finder-v1",        "layout": "sidebar"    })        # 2. Stream the Component Tree structure (Adjacency List)    adk.stream_message(session_id, {        "type": "surfaceUpdate",        "surfaceId": "details-panel",        "components": [            { "id": "root", "type": "VerticalStack", "props": { "gap": "16px" } },            {                 "id": "header-node",                 "type": "RestaurantHeader",                 "parentId": "root",                 "props": { "name": "Pasta Paradiso", "cuisine": "Italian" }             },            {                 "id": "booking-node",                 "type": "BookingForm",                 "parentId": "root",                 "props": { "valuePath": "/booking/partySize", "actionId": "submit_reservation" }             }        ]    })        # 3. Initialize the isolated application state (Data Model)    adk.stream_message(session_id, {        "type": "dataModelUpdate",        "surfaceId": "details-panel",        "values": {            "/booking/partySize": 2        }    })        # 4. Signal the client renderer to paint the compiled data    adk.stream_message(session_id, {        "type": "beginRendering",        "surfaceId": "details-panel"    })@agent.on_user_action("submit_reservation")def handle_booking(session_id: str, payload: dict):    """Listens for the bidirectional event fired when the user clicks submit."""    # Read the current bound value from the state snapshot sent by the client    party_size = payload["stateSnapshot"]["booking"]["partySize"]        # (In a real app, you would execute an external Booking API call here)        # Stream a quick confirmation update back down the original line    adk.stream_message(session_id, {        "type": "dataModelUpdate",        "surfaceId": "details-panel",        "values": {            "/booking/statusMessage": f"Success! Table booked for {party_size} guests."        }    })

**Step 3: Create the Native Frontend Components (**restaurant-ui.ts) On the client app, we use

import { LitElement, html, css } from 'lit';import { customElement, property } from 'lit/decorators.js';// 1. The Heading Component@customElement('restaurant-header')export class RestaurantHeader extends LitElement {  @property({ type: String }) name = '';  @property({ type: String }) cuisine = '';  render() {    return html`      <div class="header">        <h2>${this.name}</h2>        <span class="badge">${this.cuisine}</span>      </div>    `;  }}// 2. The Interactive Booking Form Component@customElement('booking-form')export class BookingForm extends LitElement {  @property({ type: Number }) partySize = 2;  @property({ type: String }) status = '';    // Callback hooks provided by the orchestrator engine  onValueChange: (newValue: number) => void = () => {};  onActionTriggered: () => void = () => {};  render() {    return html`      <div class="booking-box">        <label>Number of Guests:</label>        <div class="counter">          <button @click="${() => this.updateSize(-1)}">-</button>          <span>${this.partySize}</span>          <button @click="${() => this.updateSize(1)}">+</button>        </div>        <button class="submit" @click="${this.onActionTriggered}">Confirm Table</button>        ${this.status ? html`<p class="status">${this.status}</p>` : ''}      </div>    `;  }  private updateSize(delta: number) {    this.partySize = Math.max(1, this.partySize + delta);    this.onValueChange(this.partySize); // Update local A2UI data model state instantly  }}

**Step 4: A2UI Orchestration Core (**renderer.ts) This is the glue. The core renderer file receives the streaming JSON packet sequence from our Python agent, tracks the local data model variables, walks the layout tree, and instances our Lit nodes into the viewport.

import { RestaurantHeader, BookingForm } from './restaurant-ui';class A2UIRenderEngine {  private currentComponents: any[] = [];  private dataModel: Record<string, any> = {};  private containerElement: HTMLElement;  constructor(targetCanvas: HTMLElement) {    this.containerElement = targetCanvas;  }  // Processes incoming pipeline events sent by the server  public handleIncomingMessage(msg: any) {    switch (msg.type) {      case 'surfaceUpdate':        this.currentComponents = msg.components;        break;      case 'dataModelUpdate':        // Merge updates into our flat path dictionary        this.dataModel = { ...this.dataModel, ...msg.values };        this.updateDataBindings(); // Real-time reactive re-binding        break;      case 'beginRendering':        this.renderLayoutTree();        break;    }  }  // Loops through the component primitives list and compiles HTML  private renderLayoutTree() {    this.containerElement.innerHTML = ''; // Clear canvas        // For simplicity, grab the children of our root layout stack    const rootChildren = this.currentComponents.filter(c => c.parentId === 'root');    rootChildren.forEach(node => {      if (node.type === 'RestaurantHeader') {        const el = document.createElement('restaurant-header') as RestaurantHeader;        el.name = node.props.name;        el.cuisine = node.props.cuisine;        this.containerElement.appendChild(el);      }             else if (node.type === 'BookingForm') {        const el = document.createElement('booking-form') as BookingForm;        const valuePath = node.props.valuePath; // e.g., "/booking/partySize"                // Connect the dynamic data model paths to the element properties        el.partySize = this.dataModel[valuePath] || 2;        el.status = this.dataModel['/booking/statusMessage'] || '';        // Bind data changes happening on the screen back to local data model state        el.onValueChange = (newVal) => {          this.dataModel[valuePath] = newVal;        };        // Bind UI action button handlers back to the A2A server callback line        el.onActionTriggered = () => {          this.sendAgentAction(node.props.actionId);        };        this.containerElement.appendChild(el);      }    });  }  private updateDataBindings() {    // If layout is already drawn, update properties dynamically    this.renderLayoutTree();  }  private sendAgentAction(actionId: string) {    const userActionPayload = {      type: "userAction",      actionId: actionId,      surfaceId: "details-panel",      stateSnapshot: {        booking: {          partySize: this.dataModel["/booking/partySize"]        }      }    };        console.log("Sending A2A Message to GCP Backend Agent:", userActionPayload);    // Code mapping to clear an HTTP POST back to agent.py would execute here  }}

How to trace this setup in your head:

To introduce a brand new, custom “Restaurant Card” UI component into your system, you need to update 4 precise layers across the A2UI contract ecosystem.

Here is the step-by-step implementation guide to wire it up from backend to frontend.

**Step 1: Update the Contract Manifest (**catalog.json) First, you must register the component in the catalog shared between the agent and the renderer. This ensures the LLM knows what properties (props) it is allowed to configure when outputting the layout data.

{  "components": {    "RestaurantCard": {      "description": "A rich card visualization for a single restaurant choice.",      "properties": {        "title": { "type": "string" },        "rating": { "type": "number" },        "imagePath": { "type": "string", "description": "JSON pointer path to the image asset url" },        "actionId": { "type": "string" }      }    }  }}

**Step 2: Update the Native Lit Component (**restaurant-ui.ts) Next, build or update your physical frontend design asset using Lit. This file consumes the structured data properties and handles user click events natively.

import { LitElement, html, css } from 'lit';import { customElement, property } from 'lit/decorators.js';@customElement('restaurant-card')export class RestaurantCard extends LitElement {  @property({ type: String }) title = '';  @property({ type: Number }) rating = 0.0;  @property({ type: String }) imageUrl = '';    // Callback injected by the orchestrator engine  onCardSelected: () => void = () => {};  render() {    return html`      <div class="card" @click="${this.onCardSelected}">        <img src="${this.imageUrl}" alt="${this.title}" />        <div class="card-body">          <h3>${this.title}</h3>          <span class="star-rating">⭐ ${this.rating}</span>        </div>      </div>    `;  }}

**Step 3: Map the Token to the Core Parser (**renderer.ts) Update your client-side A2UI orchestration engine to parse the token string "RestaurantCard", read its variable configurations out of the reactive data model bindings, and compile the native Lit element.

// Inside your A2UIRenderEngine class layout walk loop:if (node.type === 'RestaurantCard') {  const el = document.createElement('restaurant-card') as RestaurantCard;    // Bind simple static primitives directly from props  el.title = node.props.title;  el.rating = node.props.rating;    // Resolve data-bound attributes from the live local state dictionary  const dynamicImagePath = node.props.imagePath; // e.g., "/restaurant/heroImage"  el.imageUrl = this.dataModel[dynamicImagePath] || 'placeholder.png';// Bind the interactive event back to the A2A communication line  el.onCardSelected = () => {    this.sendAgentAction(node.props.actionId);  };  this.containerElement.appendChild(el);}

**Step 4: Author the Backend Agent Logic (**agent.py) Finally, update your GCP ADK server-side code to instruct the Gemini agent to build the layout tree using your newly registered component node type.

from google.cloud import agent_development_kit as adk@agent.actiondef recommend_top_spot(session_id: str):    """Invoked by Gemini when providing a curated restaurant choice."""        # 1. State/Data Injection    adk.stream_message(session_id, {        "type": "dataModelUpdate",        "surfaceId": "main-canvas",        "values": {            "/restaurant/heroImage": "https://assets.example.com/pasta.jpg"        }    })        # 2. Component Layout Structural Injection    adk.stream_message(session_id, {        "type": "surfaceUpdate",        "surfaceId": "main-canvas",        "components": [            {                "id": "featured-card",                "type": "RestaurantCard",                "parentId": "root-layout",                "props": {                    "title": "Pasta Paradiso",                    "rating": 4.9,                    "imagePath": "/restaurant/heroImage",                    "actionId": "view_restaurant_details"                }            }        ]    })        # 3. Fire Render Trigger    adk.stream_message(session_id, { "type": "beginRendering", "surfaceId": "main-canvas"

Instead of waiting for an entire UI layout to generate, the agent streams structural blocks first, followed by real-time data fragments (like text or arrays) that fill in the view incrementally.

Ex: You ask for a detailed summary of Pasta Paradiso. The layout shell (the card frame and image container) pops up instantly. Then, the detailed description text fields stream in word-by-word right inside the card, exactly like a standard ChatGPT text response but constrained within a beautiful native UI container.

// Packet 1: Immediate Layout Shell Injection{  "type": "surfaceUpdate",  "components": [{ "id": "summary-card", "type": "Card" }]}// Packet 2 & 3: Streaming text tokens into the specific property path{ "type": "dataModelUpdate", "values": { "/card/desc": "An elegant " } }{ "type": "dataModelUpdate", "values": { "/card/desc": "An elegant Italian bistro..." } }

When a user updates a form, the client app updates the local data model instantly (optimistic update) so the UI feels responsive, while seamlessly syncing that change back to the backend agent in the background.

Ex: You are choosing a reservation time slot. When you select “7:30 PM”, the button immediately highlights in your browser so the application feels instant. In the background, an A2A message updates the server-side ADK state. If the server discovers that the slot was booked a second ago by someone else, it sends a patch packet to revert the selection and display an error.

This enforces a strict visual contract. The agent can only manipulate structural components and predefined semantic tokens (like colors or spacing keys). It cannot choose arbitrary fonts, hex codes, or layouts that violate your brand’s style guide.

Ex: The agent wants to emphasize a “Trending №1” badge for a popular restaurant. Instead of outputting style elements like style="color: #FF0000; font-size: 24px;", the agent must output standard catalog tokens:

{  "type": "Badge",  "props": {    "text": "Trending No. 1",    "intent": "danger",       // Maps to your brand's exact red palette variable    "size": "sm"               // Maps to your brand's strict padding rules  }}

This manages complex state lifecycles where user interactions dynamically morph or replace the current canvas view altogether, guiding them through sequential steps.

Ex: A complete checkout/booking pipeline.

Because layout properties are generated by an LLM, the client-side A2UI engine strictly cross-references and validates every incoming JSON property against a rigid schema before compiling it, stripping away any hidden malicious scripts.

Ex: A malicious user leaves a review on a restaurant containing an injection attack script disguised as HTML layout code. When the agent fetches this review and drops it into a component’s props, the client-side parser checks the catalog.json rules, recognizes the text field contains invalid executable code scripts, and safely renders it as safe plain text rather than compiling it as active code.

Performing validation on an **A2UI-enabled Restaurant Finder **application requires a strict process. Because generative UIs dynamically shift layouts based on LLM decisions, you must validate two things before hitting production:

Here is the step-by-step validation guide utilizing the built-in Google ADK Evaluation (ADK Eval) framework and A2UI schema tools.

Step 1: Schema Contract Checking (Unit-Level Validation) Before talking to an LLM, validate that your declarative catalog.json and agent mock payloads strictly match A2UI specification boundaries.

A2UI SDK uses precise component properties to enforce catalog compliance. For example, child-parent structural linkages cannot be random string properties; they must utilize predefined reference IDs.

What this catches:It flags if your agent attempts to use unauthorized primitives or binds interactive elements to invalid data model pointer paths (e.g., mapping a counter to an absolute string rather than a valid/booking/partySize JSON pointer).

Step 2: Establish Your “Golden Interaction” Dataset To test your application dynamically, you need an answer key. In the Google ADK framework, this is called a

adk web ./restaurant_agent_folder

Step 3: Configure the Validation Run Rules Now, create an evaluation configuration file (test_config.json) to define what rules constitute a "Pass" versus a "Fail." For a Generative UI app, you want to set metrics for structural trajectory accuracy and final semantic matching.

Create your validation parameters file:

{  "test_set": "restaurant_flow_v1.evalset.json",  "criteria": {    "tool_trajectory_score": {      "match_type": "EXACT",      "threshold": 1.0    },    "final_response_match_v2": {      "threshold": 0.85    }  }}

Step 4: Execute Programmatic Validation (CI/CD Automated Run) To protect your production pipeline from regressions when prompts or catalog features are updated, run your test assertions programmatically using pytest.

Run the validation suite via your terminal window:

pytest test_agent_validation.py -v

Step 5: Analyze Failures via Trace View If an evaluation fails, hover over the failure flag in your test dashboard or open your debugger console to pinpoint exactly where the mismatch occurred.

Once all programmatic scores output 1.0 (or stay safely above your configured passing thresholds), your application is successfully validated and ready to deploy to production.

Once the evaluation scores pass your quality constraints, choose one of the following deployment paths.

This is the standard enterprise path on the Gemini Enterprise Agent Platform. It provides a fully managed, auto-scaling runtime specifically designed for ADK agents. You do not manage Dockerfiles, servers, or web sockets; Google handles the underlying execution, tracing, and hosting infrastructure.

If your agent requires custom backend endpoints (Ex: exposing custom REST APIs alongside your chat stream, hosting specialized monitoring sidecars, or performing unique database connections), you deploy the agent as a standard containerized FastAPI application.

If you are operating in a highly restricted corporate network environment, need strict compliance constraints, or want to deploy your agent entirely offline/disconnected from Google Cloud, you can run the agent inside any Kubernetes cluster or local container runner (like Docker or Podman).

Choosing Gemini Enterprise (GE) as your deployment target is the cleanest production path available. Because Gemini Enterprise features a built-in A2UI renderer, your role as a developer changes fundamentally: you no longer need to write a custom client parsing engine or choose a frontend framework. Gemini Enterprise acts as the client-side browser layout renderer, natively translating your agent’s JSON blueprints into components that match the corporate design language.

When your agent interacts with Gemini Enterprise, the communication follows the Inline Message Pattern. This means your agent streams single JSON frames where the design parameters and text variables are packaged tightly together, allowing Gemini Enterprise to instantly map the layout elements.

+------------------------------------+                A2A Protocol                 +------------------------------------+|         Gemini Enterprise          |         (JSON-RPC 2.0 over HTTPS)           |          Our Custom Agent          ||    (Native UI Canvas Renderer)     |============================================>|        (Hosted on Cloud Run)       ||                                    |                                             |                                    ||   1. Captures prompt & sends its   |                                             |  3. Processes query with Gemini    ||      own supported component map   |                                             |  4. Bundles logic & secret tokens  ||                                    |<============================================|  5. Returns Inline A2UI Payload    ||   2. Paints native design nodes    |              A2UI Cargo Payload             |                                    |+------------------------------------+                                             +------------------------------------+

Here is the exact implementation path to package your Restaurant Finder agent code, expose it as an authorized A2A service endpoint, and register it inside the Gemini Enterprise environment.

Ensure your backend function handles standard dictionary parameters and maps the correct schema configurations required by the Gemini Enterprise interface layer.

from google.cloud import agent_development_kit as adkagent = adk.Agent(name="EnterpriseRestaurantAgent")@agent.actiondef search_dining_options(session_id: str, zone: str) -> dict:    """Returns a native A2UI presentation block directly to Gemini Enterprise."""        # We build an inline payload structure expected by the GE renderer    return {        "type": "surfaceUpdate",        "surfaceId": "ge-inline-view",        "version": "0.8",        "components": [            {                "id": "container-root",                "type": "VerticalLayout"            },            {                "id": "restaurant-choice",                "type": "ChoicePicker",                "parentId": "container-root",                "props": {                    "title": "Select an Italian Spot",                    "options": [                        {"label": "Pasta Paradiso (⭐ 4.9)", "value": "pasta_paradiso"},                        {"label": "Vino & Vedura (⭐ 4.6)", "value": "vino_vedura"}                    ],                    "actionId": "restaurant_selected" # Triggers a structured back-turn                }            }        ]    }

Because Gemini Enterprise accesses your runtime via standard HTTPS webhook signatures, hosting the script as an auto-scaling serverless container on Google Cloud Run is the standard configuration pattern.

Wrap your script configuration and deploy using the terminal interface:

To introduce your external running service to your corporate organization’s workspace catalog directory, you must run the registration handshake. If you are tracking the standard reference repository structure, this step is wrapped inside a central configuration script routine.

Run the registration call through your command-line workspace profile:

*(Alternatively, if utilizing the native codebase Makefile utilities from the repository setup, executing *make register-gemini-enterprise handles this programmatic validation handshake step for you automatically).

Once the endpoint mapping registration is confirmed by the system platform engine:

When your teammates log into their corporate chat workspace window next, your agent will be instantly visible in their sidebar choices.

Typing a dining location query automatically routes the execution traffic loop to your container service, streaming robust interactive layout widgets into the conversation thread without writing a single line of frontend client code.

Thank you so much for reading through to the end! If you found this architectural deep-dive into Generative UI and A2UI helpful, please give this article a round of claps 👏 and drop a comment below with your thoughts, questions, or what you’re planning to build next.

Your feedback and insights keep these discussions alive, and I can’t wait to hear how you’re leveraging these agentic frameworks in your own workflows. Let’s keep the conversation going — see you all in the next article discussion!!!

Zero Frontend Code: Deploying Interactive A2UI Agents Directly to Gemini Enterprise was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

source & further reading

pub.towardsai.net — original article OpenAI's GPT-5.6 Sol Hit 91.9% on Terminal-Bench — Then Cheated More Than Any Model METR Has Tested No, Your Chatbot Doesn’t Have Amnesia — It’s Drifting I Cracked Open Karpathy's $100 ChatGPT — the 2019 Original Cost $43,000 and 168 Hours

Zero Frontend Code: Deploying Interactive A2UI Agents Directly to Gemini Enterprise

Run your AI side-project on zahid.host