Claude Code, Anthropic's agentic coding tool, has worked with Google Cloud for a while now. An individual developer can set CLAUDE_CODE_USE_VERTEX=1, point Claude Code at a Google Cloud (GCP) project, grant the role roles/aiplatform.user, and inference stays inside your Google Cloud perimeter.
That flow works great when it's just you or a handful of engineers. Rolling it out across an organization, though, brings enterprise friction: you have to manage per-developer cloud credentials and push a managed-settings.json to every laptop over MDM, and you still end up with no verified per-developer usage attribution and no easily enforceable spend caps.
The Claude apps gateway closes that gap. It is a self-hosted service, shipped with the same claude binary, that sits directly between your local Claude Code clients and Google Cloud. This post breaks down exactly why you should run it and what a secure deployment looks like on Google Cloud.
(Note: If you want to jump straight to the code, the full walkthrough lives in the Claude apps gateway on Google Cloud docs.)
Run the gateway to centralize the governance that developers and platform admins otherwise each carry alone: identity, policy, cost, and routing. Here's what that looks like in practice.
Identity. The /login request routes through your identity provider (IdP), either Google Workspace or any other OpenID Connect (OIDC) provider, and the gateway swaps the IdP token for a short-lived session. Nothing sensitive lands on the developer's laptop: no service-account keys, no API keys, and no ANTHROPIC_VERTEX_PROJECT_ID. Onboarding is as simple as adding a user to an IdP group; offboarding is removing them, after which their next session refresh fails on the spot.
Policy. Your RBAC (role-based access control) rules live once in gateway.yaml, resolved per group and enforced server-side. The gateway re-checks availableModels on every /v1/messages call, so editing the local managed-settings.json changes nothing, and rule updates reach the whole fleet within the hour.
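To make the server-side check concrete, here is a minimal Python sketch of per-group policy resolution. It is not the gateway's actual code: the Policy shape, the first-match ordering, and the model names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    # Hypothetical shape: models this policy allows, and the groups it
    # applies to. An empty groups list acts as the catch-all.
    available_models: list[str]
    groups: list[str] = field(default_factory=list)


def resolve_policy(policies, user_groups):
    """Return the first policy that matches the user's groups, or None."""
    for policy in policies:
        if not policy.groups or set(policy.groups) & set(user_groups):
            return policy
    return None


def is_model_allowed(policies, user_groups, model):
    """Check a requested model against the resolved policy."""
    policy = resolve_policy(policies, user_groups)
    return policy is not None and model in policy.available_models


# Placeholder model and group names, listed with the catch-all last.
POLICIES = [
    Policy(available_models=["model-large", "model-small"], groups=["ml-team"]),
    Policy(available_models=["model-small"]),
]
```

Because the decision is made from the groups in the verified session rather than from anything the client sends, a locally edited settings file has no input into it.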
Telemetry. Every claude_code.token.usage metric carries the verified email and groups from the session JWT (the signed session token), not the spoofable client-set OTEL_RESOURCE_ATTRIBUTES. The gateway ships the metrics over OTLP/HTTP to a collector you run (Cloud Monitoring, Grafana, Datadog, whatever you use).
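The attribution idea can be sketched in a few lines of Python. This assumes the session JWT has already been verified and decoded into a claims dictionary; the attribute names user.email and user.groups are hypothetical, not the gateway's real metric schema.

```python
# Hypothetical attribute names for the identity fields on a metric.
IDENTITY_KEYS = ("user.email", "user.groups")


def attributed(client_attrs: dict, claims: dict) -> dict:
    """Build metric attributes, taking identity only from verified claims."""
    # Drop any identity the client tried to set for itself.
    attrs = {k: v for k, v in client_attrs.items() if k not in IDENTITY_KEYS}
    # Stamp identity from the already-verified session claims.
    attrs["user.email"] = claims["email"]
    attrs["user.groups"] = list(claims.get("groups", []))
    return attrs
```

The point of the design is the direction of trust: client-supplied values may add context, but they never override what the signed session says about who the user is.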
Spend limits. Set daily, weekly, or monthly caps per user, group, or org via the admin API; the gateway meters tokens against a Cloud SQL ledger and returns HTTP 429 once a cap is reached. Costs are computed at list price, so treat the caps as a runaway-usage guardrail, not as bill reconciliation (committed-use discounts and negotiated rates don't show up).
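As a rough illustration of metering against a cap, the sketch below uses an in-memory object as a stand-in for the Cloud SQL ledger. The class, its method names, and the per-million-token prices are all assumptions; real list prices and the gateway's ledger schema are not shown here.

```python
from dataclasses import dataclass


@dataclass
class SpendLedger:
    # In-memory stand-in for one ledger row (one user, group, or org).
    cap_usd: float
    spent_usd: float = 0.0

    def record(self, input_tokens, output_tokens,
               price_in_per_mtok, price_out_per_mtok):
        """Add the cost of one call, priced per million tokens."""
        cost = (input_tokens * price_in_per_mtok
                + output_tokens * price_out_per_mtok) / 1_000_000
        self.spent_usd += cost
        return cost

    def status_for_next_request(self):
        """Return 429 once the cap is reached, otherwise 200."""
        return 429 if self.spent_usd >= self.cap_usd else 200
```

A usage example: with a cap of 1.00 USD and hypothetical prices of 3.00 and 15.00 USD per million input and output tokens, a single call of 100,000 input and 50,000 output tokens costs 1.05 USD, so the following request is refused.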
Routing. Calls go out under a single Cloud Run service identity. Set region: global for Agent Platform's global endpoint, or add a second upstreams: entry to fail over on 5xx/429/timeout in list order. Either way, inference stays in your GCP project, with quota, Data Processing Agreement, and billing all unchanged.
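The failover rule described above (try upstreams in list order, moving on after a 5xx, a 429, or a timeout) can be sketched as follows. The send callable and the upstream names are hypothetical; a timeout is modeled as a None status.

```python
from typing import Callable, Optional, Sequence, Tuple


def should_fail_over(status: Optional[int]) -> bool:
    """True for a timeout (None), a 429, or any 5xx status."""
    return status is None or status == 429 or 500 <= status <= 599


def send_with_failover(
    upstreams: Sequence[str],
    send: Callable[[str], Optional[int]],
) -> Optional[Tuple[str, Optional[int]]]:
    """Try each upstream in order; return the first non-retryable result.

    If every upstream fails, the last failure is returned. An empty
    upstream list returns None.
    """
    last = None
    for name in upstreams:
        status = send(name)
        if not should_fail_over(status):
            return name, status
        last = (name, status)
    return last
```

For example, if a hypothetical send returns 503 for "global" and 200 for "us-east5", calling send_with_failover(["global", "us-east5"], send) settles on the second entry.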
A developer's local or deployed claude process sends inference traffic to the gateway over HTTPS. The gateway is a stateless container on Cloud Run as shown below.
The gateway validates its own session bearer — Google Workspace is only contacted at sign-in and token refresh — checks policy, and forwards the request to Agent Platform using the Cloud Run service account. Cloud SQL holds device-code sign-in state and the spend ledger; an OTLP collector receives the attributed metrics.
The full walkthrough, with every gcloud command and the complete gateway.yaml reference, is in the Claude apps gateway on Google Cloud docs. The short version:
Step 1: Provision the GCP foundation
Enable the Agent Platform, Cloud SQL, and Secret Manager APIs; create a claude-gateway service account with roles/aiplatform.user; and stand up a small Cloud SQL Postgres instance for state. The gateway authenticates to Agent Platform as the Cloud Run service identity. Developer sign-in is a separate handshake, and for that one you do need the client_id and client_secret of a Google OAuth client. Those two values feed the oidc: block in the next step. You'll add the authorized redirect URI later, once the gateway URL is known.
Step 2: Configure the gateway
Write gateway.yaml pointing at your Google Workspace OIDC client, the Postgres connection string, and Agent Platform as the upstream. Store it in Secret Manager, along with the OIDC client secret, the Postgres URL, and a JWT signing key.
Then register https://<public_url host>/oauth/callback as an authorized redirect URI on the Google OAuth client. It must match listen.public_url exactly.
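Since a mismatched redirect URI is an easy mistake to make, a small pre-flight check may help. The Python sketch below derives the callback from a public URL and compares it with what was registered. It assumes listen.public_url holds an https URL whose host is the gateway's public host; the helper names are hypothetical.

```python
from urllib.parse import urlsplit


def expected_redirect_uri(public_url: str) -> str:
    """Derive https://<host>/oauth/callback from the gateway's public URL."""
    parts = urlsplit(public_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"expected an https URL with a host, got {public_url!r}")
    return f"https://{parts.netloc}/oauth/callback"


def redirect_uri_matches(registered: str, public_url: str) -> bool:
    """Exact string comparison: a trailing slash or other host will not match."""
    return registered == expected_redirect_uri(public_url)
```

The comparison is deliberately a plain string equality, which mirrors the "must match exactly" requirement rather than trying to normalize the two values.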
Step 3: Deploy to Cloud Run
Run gcloud run deploy with the service account attached, the Cloud SQL connection on the VPC, and the config mounted from Secret Manager. The container is stateless and scales horizontally behind the Cloud Run load balancer. GKE works equally well if that's already your platform; only the deployment manifest changes.
Developers connect over the corporate network; you may front the service with an internal Application Load Balancer (see Cloud Run private networking).
Whether the service is public or internal, your developers must be able to reach whatever URL you configure; alternatively, you can rely on the default URL from Cloud Run. The rest of this walkthrough uses https://claude-gateway.example.internal as the gateway URL.
Step 4: Onboard a developer
Push forceLoginMethod: "gateway" and forceLoginGatewayUrl to developer machines via managed settings. This is how /login knows where to connect, with no manual URL entry. For an org rollout, that's your MDM channel. For a first trial without MDM, a developer with local admin permissions can write the file by hand at /Library/Application Support/ClaudeCode/managed-settings.json on macOS (or /etc/claude-code/managed-settings.json on Linux).
At Claude Code startup, the developer presses Enter on the pre-filled gateway sign-in screen to confirm the URL, confirms the device code on the gateway's verification page in the browser, and is then redirected to Google Workspace to sign in, which completes the device-code flow. If setup completes correctly, you will see Cloud Gateway in the terminal view.
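Going back to the managed settings file from Step 4, the Python sketch below generates its contents for a given gateway URL. The two keys come from the text above; placing them at the top level of the file, and needing no other keys, are assumptions, so check the config reference before rolling this out.

```python
import json


def managed_settings(gateway_url: str) -> str:
    """Render a managed-settings.json body that forces gateway login."""
    settings = {
        # Both keys are named in Step 4; top-level placement is assumed.
        "forceLoginMethod": "gateway",
        "forceLoginGatewayUrl": gateway_url,
    }
    return json.dumps(settings, indent=2)


if __name__ == "__main__":
    # Write the printed JSON to the managed settings path for your OS.
    print(managed_settings("https://claude-gateway.example.internal"))
```

Generating the file from one function keeps the URL in a single place, which is convenient when the same payload is pushed through MDM and used for hand-written trial installs.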
At this point you should have a better understanding of how to configure and use Claude apps gateway on Google Cloud. Here are some next steps you may want to consider:
Full config reference: every gateway.yaml field is in claude-apps-gateway-config. Per-IdP setup and the GKE track live in claude-apps-gateway-deploy and claude-apps-gateway-on-gcp.
Group-scoped policies: front the gateway with a groups-capable IdP, set groups_claim, and add match: { groups: [...] } policies above the catch-all to give different teams different model lists and tool permissions.
For now, thanks for reading! If you have any additional questions or feedback, feel free to reach out on socials (Roy Arsan: LinkedIn, X; Ivan Nardini: LinkedIn, X). Happy building!