{"slug": "scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot", "title": "Scaling AI Agents: A Step-by-Step Guide to Deploying ADK on GKE Autopilot", "summary": "Google has released a step-by-step guide for deploying AI agents built with the Agent Development Kit (ADK) onto Google Kubernetes Engine (GKE) Autopilot, using Gemini on Vertex AI as the core model. The deployment process packages the ADK-based Python application as a Docker image, runs it as a GKE Autopilot Deployment, and secures communication with Vertex AI through Workload Identity. The guide also details exposing the agent externally using the Kubernetes Gateway API, providing developers with a managed infrastructure path for production-ready AI agents.", "body_md": "While building AI agents locally using Google’s Agent Development Kit (ADK) is an excellent way to prototype, production-ready agents require a robust, scalable infrastructure. For developers looking to move beyond simple instances and into the world of managed container orchestration, Google Kubernetes Engine (GKE) Autopilot offers the perfect balance of flexibility and ease of use.\n\nIn this tutorial, I will walk you through building a technical agent with ADK and deploying it to GKE Autopilot. We will focus on utilizing Gemini on Vertex AI as the core model and ensure highest security standards by implementing Workload Identity for permission management.\n\nDeploying an ADK agent on GKE Autopilot involves more than just running a container. We leverage GKE's native capabilities to handle scaling and security. Our architecture consists of an ADK-based Python application packaged as a Docker image and stored in Artifact Registry. 
This container runs as a Deployment on GKE Autopilot, where it communicates securely with Vertex AI using Workload Identity, which maps a Kubernetes Service Account to a Google Cloud IAM Service Account.\n\nTo expose the agent to the world, we use the Kubernetes Gateway API, the modern successor to Ingress, which provides a cleaner separation of concerns and native support for Google Cloud Load Balancing.\n\nBefore we begin, ensure you have the following tools and accounts ready:\n\n- `uv` for package management.\n- The Google Cloud CLI (`gcloud`) installed and configured.\n- The `kubectl` command-line tool.\n- `jq` for parsing JSON responses.\n\nBefore interacting with Google Cloud services, you must authenticate your environment and set the active project. This ensures that both the `gcloud` CLI and your local Python environment can access Vertex AI.\n\n```\ngcloud auth login\ngcloud config set project [PROJECT_ID]\ngcloud auth application-default login\nexport PROJECT_ID=$(gcloud config get-value project)\nexport REGION=us-central1\nexport CLUSTER_NAME=adk-cluster\n```\n\nGKE Autopilot is the recommended way to run Kubernetes without managing nodes. It allows you to focus on your agent deployment while Google manages the infrastructure. Starting the cluster creation now lets it provision while we build the agent. The command waits until the cluster is ready, so run it in a separate terminal.\n\n```\ngcloud container clusters create-auto $CLUSTER_NAME --region $REGION\n```\n\nWhile the cluster is provisioning, we can move on to building our agent.\n\nFirst, let's create our agent. Start by creating a folder for the agent code:\n\n```\nmkdir adk-agent\ncd adk-agent\n```\n\nInitialize a new Python project with `uv`:\n\n```\nuv init\n```\n\nAdd the ADK dependency:\n\n```\nuv add google-adk\n```\n\nCreate a new agent using the ADK CLI:\n\n```\nuv run adk create weather_agent\n```\n\nYou will be asked to choose a model for the root agent. Choose `gemini-2.5-flash` (Number 1). Next you will be asked to choose a backend. Choose `Vertex AI` (Number 2). 
Next, enter your Google Cloud project ID when prompted. Finally, enter a Google Cloud region of your choice, for example `us-central1`.\n\nThe previous command scaffolded a new directory, `weather_agent`, with the following structure:\n\n```\nweather_agent/\n├── .env\n├── __init__.py\n└── agent.py\n```\n\nADK requires the agent code to be in an `agent.py` file. Let's edit the `agent.py` file to add a simple tool for the agent.\n\n```python\nfrom google.adk import Agent\n\n# Define a simple tool for the agent\ndef get_weather(city: str) -> str:\n    \"\"\"Returns the current weather in a city.\"\"\"\n    return f\"The weather in {city} is 90 degrees Fahrenheit and sunny.\"\n\n# Initialize the agent with Vertex AI and Gemini\n# (same model as selected during `adk create`)\nroot_agent = Agent(\n    name=\"weather_agent\",\n    model=\"gemini-2.5-flash\",\n    tools=[get_weather]\n)\n```\n\nThe `agent.py` file is the entry point for the agent. It defines the agent and its tools. The `get_weather` function is a simple tool that returns the current weather in a city. For the purpose of this tutorial, we are using a hardcoded value for the weather. In a real-world scenario, you would use an API to get the current weather.\n\nBefore deploying the agent to GKE Autopilot, we need to test it locally to ensure it works as expected. Run the following command to start the agent in debug mode with the web UI:\n\n```\nuv run adk web\n```\n\nOpen [http://localhost:8000](http://localhost:8000) in your browser and you should see the ADK web UI. You can then interact with your agent by typing messages in the chat interface.\n\nIf the agent returns a message like \"The weather in [CITY] is 90 degrees Fahrenheit and sunny.\", congratulations! Your ADK agent is working. Now you can proceed to the next step.\n\nThe ADK CLI has a built-in command to deploy the agent to GKE Autopilot. 
However, the default settings are not suitable for a production environment. For example, they do not use Workload Identity for authentication with Vertex AI, and they expose the Web UI via a Load Balancer on port 80.\n\nWe will instead manage the lifecycle of the container ourselves. First we need to containerize the agent.\n\nCreate a `.dockerignore` file in the `adk-agent` directory to prevent your local virtual environment from being copied into the image:\n\n```\n.venv\n.adk\n__pycache__\n*.pyc\n.env\n```\n\nCreate a `Dockerfile` for your agent in the `adk-agent` directory. We will use a multi-stage build to keep the final production image lightweight and secure:\n\n```\n# Stage 1: Build the virtual environment\nFROM python:3.10-slim AS builder\n\n# Install uv\nCOPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/\n\n# Set working directory\nWORKDIR /app\n\n# Force uv to use the system Python and use copy instead of symlinks\nENV UV_PYTHON_PREFERENCE=only-system\nENV UV_LINK_MODE=copy\nENV UV_COMPILE_BYTECODE=1\nENV UV_PYTHON=/usr/local/bin/python3\n\n# Install dependencies\n# We copy only files needed for installation to maximize cache\nCOPY pyproject.toml uv.lock ./\n# Note: We don't use --frozen yet as the host lock file might be slightly out of sync\n# but sync will update it in the builder stage.\nRUN uv sync --no-install-project --no-dev --no-cache\n\n# Copy the agent code\nCOPY . .\n# Sync the project itself\nRUN uv sync --no-dev --no-cache\n\n# Stage 2: Runtime image\nFROM python:3.10-slim\n\nWORKDIR /app\n\n# Copy the pre-built environment from the builder\nCOPY --from=builder /app/.venv /app/.venv\n# Copy the application code (including weather_agent folder)\nCOPY . 
.\n\n# Add the environment to the PATH\nENV PATH=\"/app/.venv/bin:$PATH\"\nENV PYTHONUNBUFFERED=1\n\n# Run the ADK API server\n# We point to the current directory, which contains the weather_agent folder\nCMD [\"adk\", \"api_server\", \".\", \"--host\", \"0.0.0.0\", \"--port\", \"8080\"]\n```\n\nBuild and push the image to Artifact Registry:\n\n```\n# Create repository\ngcloud artifacts repositories create adk-repo --repository-format=docker --location=$REGION\n\n# Build and push\ngcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest\n```\n\nSecurity is paramount. Instead of hardcoding API keys, we use Workload Identity to grant the GKE pod permission to access Vertex AI.\n\n**1. Create an IAM Service Account**:\n\n```\ngcloud iam service-accounts create adk-gke-sa\n```\n\n**2. Grant Vertex AI permissions**:\n\n```\ngcloud projects add-iam-policy-binding $PROJECT_ID \\\n    --member=\"serviceAccount:adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com\" \\\n    --role=\"roles/aiplatform.user\"\n```\n\n**3. Allow the Kubernetes Service Account to impersonate the IAM SA**:\n\n```\ngcloud iam service-accounts add-iam-policy-binding adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com \\\n    --role=\"roles/iam.workloadIdentityUser\" \\\n    --member=\"serviceAccount:$PROJECT_ID.svc.id.goog[default/adk-ksa]\"\n```\n\nNow, we define the Kubernetes resources. Create a `deployment.yaml` that includes the Service Account annotation for Workload Identity. 
Replace `$PROJECT_ID` and `$REGION` with your actual project ID and region.\n\n```\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: adk-ksa\n  annotations:\n    iam.gke.io/gcp-service-account: adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com\n---\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: adk-agent\nspec:\n  replicas: 2\n  selector:\n    matchLabels:\n      app: adk-agent\n  template:\n    metadata:\n      labels:\n        app: adk-agent\n    spec:\n      serviceAccountName: adk-ksa\n      containers:\n      - name: adk-agent\n        image: $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest\n        resources:\n          requests:\n            cpu: \"500m\"\n            memory: \"512Mi\"\n          limits:\n            cpu: \"1\"\n            memory: \"1Gi\"\n        ports:\n        - containerPort: 8080\n---\napiVersion: v1\nkind: Service\nmetadata:\n  name: adk-service\nspec:\n  selector:\n    app: adk-agent\n  ports:\n  - port: 80\n    targetPort: 8080\n```\n\nApply the configuration:\n\n```\nkubectl apply -f deployment.yaml\n```\n\nCheck the status of the deployment:\n\n```\nkubectl get pods -w\n```\n\nOnce the pods are running, you can use `kubectl port-forward` to access the agent locally:\n\n```\nkubectl port-forward svc/adk-service 8080:80\n```\n\nSince we deployed the agent without the Web UI, we can't use it in a browser at [http://localhost:8080](http://localhost:8080). 
However, we can still interact with it using the API and `curl`.\n\nIn a new terminal, run the following commands:\n\n```\n# Create a new session\ncurl -X POST http://localhost:8080/apps/weather_agent/users/u_123/sessions/s_123\n\n# Run a message\ncurl -s -X POST http://localhost:8080/run \\\n-H \"Content-Type: application/json\" \\\n-d '{\n\"appName\": \"weather_agent\",\n\"userId\": \"u_123\",\n\"sessionId\": \"s_123\",\n\"newMessage\": {\n    \"role\": \"user\",\n    \"parts\": [{\n    \"text\": \"Hey whats the weather in new york today\"\n    }]\n}\n}' | jq .\n```\n\nThe `curl` command returns the response as JSON, and `jq` pretty-prints it. You should see a response like:\n\n```\n{\n    \"sessionId\": \"s_123\",\n    \"messages\": [\n        {\n            \"role\": \"assistant\",\n            \"parts\": [\n                {\n                    \"text\": \"The weather in New York today is sunny with a high of 90 degrees Fahrenheit.\"\n                }\n            ]\n        }\n    ]\n}\n```\n\nFinally, we expose the agent using the GKE Gateway API with a Google-managed TLS certificate. This is the recommended, production-grade approach: Google will automatically provision and renew the certificate for your domain.\n\nNB: GKE supports other options to provision certificates. You can use Let's Encrypt with cert-manager, pre-shared certificates, or any other certificate authority. You can check the [GKE documentation](https://docs.cloud.google.com/kubernetes-engine/docs/how-to/secure-gateway#secure-using-ssl-certificate) for more details.\n\nFirst, reserve a static IP address for your load balancer:\n\n```\ngcloud compute addresses create adk-agent-ip --global\nexport AGENT_IP=$(gcloud compute addresses describe adk-agent-ip --global --format=\"value(address)\")\necho \"Your IP: $AGENT_IP\"\n```\n\nPoint your domain's DNS `A` record at `$AGENT_IP`. 
Example: `api.yourdomain.com`\n\nCreate a Google-managed certificate. Replace `api.yourdomain.com` with your actual domain:\n\n```\ngcloud compute ssl-certificates create adk-cert --domains api.yourdomain.com --global\n```\n\nCreate a `gateway.yaml` with the following content, using the same domain in the HTTPRoute `hostnames` field so that it matches the certificate:\n\n```\n# Gateway: HTTPS load balancer with the managed certificate and static IP\napiVersion: gateway.networking.k8s.io/v1\nkind: Gateway\nmetadata:\n  name: adk-gateway\nspec:\n  gatewayClassName: gke-l7-global-external-managed\n  listeners:\n  - name: https\n    protocol: HTTPS\n    port: 443\n    tls:\n      mode: Terminate\n      options:\n        networking.gke.io/pre-shared-certs: adk-cert\n  addresses:\n  - type: NamedAddress\n    value: adk-agent-ip\n---\n# HTTPRoute: forward traffic to the ADK service\napiVersion: gateway.networking.k8s.io/v1\nkind: HTTPRoute\nmetadata:\n  name: adk-route\nspec:\n  parentRefs:\n  - name: adk-gateway\n  hostnames:\n  - \"api.yourdomain.com\"\n  rules:\n  - backendRefs:\n    - name: adk-service\n      port: 80\n---\n# HealthCheckPolicy: load balancer health check against the ADK container port\napiVersion: networking.gke.io/v1\nkind: HealthCheckPolicy\nmetadata:\n  name: adk-health\n  namespace: default\nspec:\n  default:\n    checkIntervalSec: 15\n    timeoutSec: 5\n    healthyThreshold: 1\n    unhealthyThreshold: 2\n    logConfig:\n      enabled: false\n    config:\n      type: HTTP\n      httpHealthCheck:\n        port: 8080\n        requestPath: /health\n  targetRef:\n    group: \"\"\n    kind: Service\n    name: adk-service\n```\n\nApply the configuration:\n\n```\nkubectl apply -f gateway.yaml\n```\n\nCertificate provisioning can take up to 20 minutes. Monitor the status with:\n\n```\ngcloud compute ssl-certificates describe adk-cert --global\n```\n\nOnce the status shows `ACTIVE`, your agent is live at `https://api.yourdomain.com`. 
You can test it with:\n\n```\n# Create a new session\ncurl -X POST https://api.yourdomain.com/apps/weather_agent/users/u_124/sessions/s_124\n\n# Run a message\ncurl -s -X POST https://api.yourdomain.com/run \\\n-H \"Content-Type: application/json\" \\\n-d '{\n\"appName\": \"weather_agent\",\n\"userId\": \"u_124\",\n\"sessionId\": \"s_124\",\n\"newMessage\": {\n    \"role\": \"user\",\n    \"parts\": [{\n    \"text\": \"Hey whats the weather in new york today\"\n    }]\n}\n}' | jq .\n```\n\nBy following these steps, you have successfully deployed a production-ready AI agent built with ADK onto GKE Autopilot that invokes Gemini on Vertex AI with Workload Identity for authentication. This setup ensures that your agent can scale horizontally to meet demand while maintaining a high security posture.\n\nAs you look ahead, consider integrating more complex tools or leveraging GKE's multi-cluster capabilities for even greater resilience. For more details on the technologies used here, explore the official [GKE documentation](https://cloud.google.com/kubernetes-engine/docs) and the [ADK repository](https://github.com/google/adk).\n\nTo avoid ongoing charges, remember to delete the GKE cluster and the Artifact Registry repository when finished:\n\n```\nkubectl delete -f gateway.yaml\nkubectl delete -f deployment.yaml\ngcloud compute addresses delete adk-agent-ip --global\ngcloud compute ssl-certificates delete adk-cert --global\ngcloud container clusters delete $CLUSTER_NAME --region $REGION\ngcloud artifacts repositories delete adk-repo --location $REGION\n```\n\n", "url": "https://wpnews.pro/news/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot", "canonical_source": "https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/", "published_at": "2026-06-04 07:00:00+00:00", "updated_at": "2026-06-04 15:41:11.678926+00:00", "lang": "en", "topics": ["ai-agents", 
"ai-infrastructure", "ai-tools", "mlops", "artificial-intelligence"], "entities": ["Google", "Agent Development Kit", "GKE Autopilot", "Gemini", "Vertex AI", "Workload Identity", "Artifact Registry", "Kubernetes Gateway API"], "alternates": {"html": "https://wpnews.pro/news/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot", "markdown": "https://wpnews.pro/news/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot.md", "text": "https://wpnews.pro/news/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot.txt", "jsonld": "https://wpnews.pro/news/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot.jsonld"}}