I've been running AI workloads on Azure Container Apps for over a year. Every time I spin up a new agent backend, the ritual is the same: create an environment, configure networking, set scaling rules, wire up health probes, then deploy the actual container. For a prototype agent that might live for a week, that's too much ceremony for what you get.
ACA Express, which hit public preview in May 2026, kills most of that ceremony. And a separate but related announcement, Docker Compose for Agents, brings MCP gateways and model serving to standard ACA environments. They solve different problems and run on different infrastructure, but together they cover the full spectrum of agent deployment on Azure.
Let me break down both.
Express is a new environment tier within Azure Container Apps. You bring a container image. Express handles provisioning, HTTPS, scaling (including scale-from-zero with subsecond cold starts), and resource allocation. No environment to manually provision through the portal. No networking to configure. No scaling rules to write.
Under the hood, Express is built on ACA Sandboxes, a platform primitive that uses prewarmed pools to deliver that subsecond startup. This isn't the standard ACA cold-start experience with a fresh coat of paint. It's a different architecture.
The tradeoffs are real. Express is HTTP workloads only, consumption CPU only. No GPU. No VNet integration. No Dapr. No service discovery between apps. No managed identity at runtime. No health probes. If you need any of those, standard ACA environments are still there. But for stateless HTTP agent backends, Express is dramatically faster to deploy and cheaper to run.
Here's what it takes to get a container running from the CLI, where you still create the environment yourself with one command:
az containerapp env create \
  --name my-express-env \
  --resource-group rg-my-agents \
  --environment-mode express \
  --logs-destination none

az containerapp create \
  --name my-agent-api \
  --resource-group rg-my-agents \
  --environment my-express-env \
  --image mcr.microsoft.com/k8se/quickstart:latest \
  --target-port 80 \
  --ingress external \
  --min-replicas 0 \
  --max-replicas 1
Your app is running in seconds. Not minutes. Seconds.
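To confirm the endpoint is actually serving, a quick smoke test helps. This is a minimal sketch, not an official recipe: `app_url` and `smoke_test` are hypothetical helper names, and it assumes Express apps report their hostname under the same `properties.configuration.ingress.fqdn` path that standard Container Apps use, which may not hold during the preview.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: look up an app's public URL and probe it once.

# Print the HTTPS URL for a Container App.
# Assumption: the FQDN lives at properties.configuration.ingress.fqdn,
# as it does for standard ACA apps.
app_url() {
  local name="$1" group="$2" fqdn
  fqdn="$(az containerapp show \
    --name "$name" \
    --resource-group "$group" \
    --query properties.configuration.ingress.fqdn \
    --output tsv)" || return 1
  # An empty result means the app has no external ingress to probe.
  [ -n "$fqdn" ] || return 1
  printf 'https://%s\n' "$fqdn"
}

# Succeed only if the app answers with a non-error HTTP status.
smoke_test() {
  local url
  url="$(app_url "$1" "$2")" || return 1
  curl --fail --silent --show-error --max-time 30 --output /dev/null "$url"
}

# Usage: smoke_test my-agent-api rg-my-agents && echo "endpoint is up"
```

With `--min-replicas 0`, the first request after an idle period is also the one that triggers scale-from-zero, so this doubles as a rough check on startup behavior.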
Express also has its own portal experience at containerapps.azure.com, separate from the Azure portal. If you're using the portal, you don't even need to create the environment yourself. It handles that automatically.
Microsoft is explicitly positioning Express for two audiences: developers who want to ship fast, and AI agents that deploy endpoints on demand. That second audience is the interesting one.
Think about how modern agent architectures work. An orchestrator spins up tool-use APIs, runs them for the duration of a task, and tears them down. The infrastructure needs to provision fast, scale from zero, and cost nothing when idle. That's exactly the Express model.
The platform is designed for MCP servers, tool-use endpoints, multi-step workflow APIs, and human-in-the-loop UIs that agents spin up dynamically. Scale-from-zero means you're not paying for agent backends that aren't actively serving requests. And when a request does come in, the subsecond startup means the agent is ready almost instantly instead of making the caller wait through a conventional cold start.
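The spin-up, run, tear-down pattern can be sketched as a small wrapper. This is an illustration under assumptions, not how any particular orchestrator does it: `run_with_ephemeral_app` is a hypothetical name, the Express environment is assumed to exist already, and the port and replica settings are simply copied from the create command earlier in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a task against a short-lived app, then delete
# the app whether or not the task succeeded.
# Arguments: app name, resource group, environment, image, then the task
# command and its arguments.

run_with_ephemeral_app() {
  local name="$1" group="$2" env="$3" image="$4" status=0
  shift 4

  # Same flags as the create command shown earlier; port 80 matches the
  # quickstart image and would change for a real agent container.
  az containerapp create \
    --name "$name" \
    --resource-group "$group" \
    --environment "$env" \
    --image "$image" \
    --target-port 80 \
    --ingress external \
    --min-replicas 0 \
    --max-replicas 1 \
    --output none || return 1

  # Run the caller's task and remember its exit code.
  "$@" || status=$?

  # Tear down regardless of the task outcome.
  az containerapp delete \
    --name "$name" \
    --resource-group "$group" \
    --yes \
    --output none

  return "$status"
}

# Usage:
#   run_with_ephemeral_app tmp-tool rg-my-agents my-express-env \
#     mcr.microsoft.com/k8se/quickstart:latest ./run-task.sh
```

The task's exit code is what the wrapper returns, so a failed task still triggers cleanup but is not masked by a successful delete.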
Here's where a lot of early coverage got confused, and where I got it wrong in my first draft of this post. Docker Compose for Agents is not an Express feature. It deploys to standard ACA environments with workload profiles, not to Express.
Why? Because Compose for Agents supports GPU model serving, MCP gateway containers, sidecar processes, and multi-service stacks. All of those require capabilities that Express doesn't have (workload profiles, service discovery, sidecars). Different tool for a different job.
What Compose for Agents does is let you take the same compose.yml you use locally for development and deploy it directly to ACA. The CLI translates compose services into Container Apps resources automatically.
Here's what a compose file looks like for an agent stack:
services:
  my-agent-app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - MCP_GATEWAY_URL=${MCP_GATEWAY_URL}
  mcp-gateway:
    image: docker/mcp-gateway
    x-azure-deployment:
      image: acateam.azurecr.io/preview-ai-compose/mcp-gateway:latest
models:
  gemma:
    model: ai/gemma3-qat
    x-azure-deployment:
      workloadProfiles:
        workloadProfileType: Consumption-GPU-NC8as-T4
The x-azure-deployment directive is the bridge between local and cloud. Docker ignores it locally. ACA uses it during deployment. Same file, both environments.
Behind the scenes, the CLI creates three things: your agent app as a Container App with ingress; an MCP gateway running as its own Container App with managed identity, which dynamically manages MCP tool containers; and model serving via Docker's model runner on serverless GPU.
The MCP gateway handles stdio-to-SSE translation, so your MCP servers run as standard Container Apps without modification.
To deploy it:
az extension remove --name containerapp
az extension add --source "<preview-extension-url>" --yes

az containerapp compose create \
  --compose-file-path compose.yml \
  --resource-group rg-my-agents \
  --environment my-standard-env
Notice that --environment flag. This deploys to a standard ACA environment, not Express. That's the distinction.
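Before deploying, it can be worth checking which workload profiles the target environment actually reports. The sketch below is a hedged preflight, with hypothetical helper names `env_profile_types` and `env_has_profile`; it reads the `properties.workloadProfiles` array from `az containerapp env show`. Whether the compose command adds a missing GPU profile for you is not something I'm asserting here, so treat a failed check as a prompt to look, not as proof the deploy will fail.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers for a standard ACA environment.

# Print the workload profile types on an environment, one per line.
env_profile_types() {
  az containerapp env show \
    --name "$1" \
    --resource-group "$2" \
    --query "properties.workloadProfiles[].workloadProfileType" \
    --output tsv
}

# Succeed only if the environment reports the given profile type exactly.
env_has_profile() {
  local types
  types="$(env_profile_types "$1" "$2")" || return 1
  printf '%s\n' "$types" | grep -Fxq -- "$3"
}

# Usage:
#   env_has_profile my-standard-env rg-my-agents Consumption-GPU-NC8as-T4 \
#     || echo "GPU profile not found on my-standard-env"
```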
The Azure AI hosting landscape has gotten crowded. Here's how I think about the options as someone who's deployed on most of them:
Azure AI Foundry is for when you want managed model endpoints with built-in safety, content filtering, and enterprise governance. You're consuming models, not hosting infrastructure.
ACA Standard is for when you need GPU workloads (self-hosted Ollama, vLLM), microservices with Dapr, VNet isolation, or any enterprise feature that Express doesn't have yet. This is also where Docker Compose for Agents deploys.
ACA Express is for fast, cheap, stateless agent backends. Prototypes, MCP servers, tool-use APIs, webhook handlers, agent orchestrators that don't need GPU compute.
ACA Dynamic Sessions is for sandboxed execution of AI-generated code. Hyper-V isolated, millisecond provisioning, MCP-integrated.
Express isn't replacing anything. It's filling the gap for lightweight agent infrastructure that's too simple for standard ACA but too complex for a serverless function.
This is a public preview, and the supported feature list reflects that. The "No" column is long:
No secrets management (no Key Vault integration). No managed identity at app runtime. No health probes. No custom domains or managed certificates. No VNet integration. No CORS, session affinity, or sidecar containers. No OpenTelemetry. No autoscaling rules (KEDA). Region-limited to West Central US and East Asia.
For production agent backends, these gaps matter. No managed identity means you're passing credentials through environment variables. No health probes means you're trusting the platform's defaults. No secrets means API keys sit in plain text config.
But for prototypes, internal tools, and agent backends in active development? These limitations are acceptable tradeoffs for the provisioning speed and cost model. And Microsoft is shipping features on what they describe as a "rapid cadence" through the preview period.
If you're building a lightweight agent backend, an MCP server, or a tool-use API that handles HTTP requests and doesn't need GPU, go Express. You'll have a running endpoint in seconds with zero infrastructure decisions.
If you're building a full agent stack with model serving, an MCP gateway coordinating multiple tool containers, and GPU workloads, use Docker Compose for Agents on standard ACA. The compose file gives you local-to-cloud parity and the workload profiles give you the compute you need.
If you need both, use both. Express for the lightweight endpoints, standard ACA for the heavy lifting. They run on the same platform and can coexist in the same resource group.
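That decision rule is simple enough to write down as a function. This is a sketch of my own rule of thumb, not an Azure tool: `pick_aca_tier` is a hypothetical name, the feature keywords are made up for the example, and the blocking list mirrors the Express preview gaps described above, which will shrink as the preview matures.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick an ACA tier from a list of required features.
# Any feature Express lacks in preview (per this post) forces "standard".

pick_aca_tier() {
  local feature
  for feature in "$@"; do
    case "$feature" in
      gpu|vnet|dapr|service-discovery|managed-identity|health-probes)
        echo "standard"
        return 0
        ;;
      secrets|custom-domain|sidecars|opentelemetry|keda-rules|non-http)
        echo "standard"
        return 0
        ;;
    esac
  done
  # Nothing on the list blocks Express, so take the lighter tier.
  echo "express"
}

# Usage:
#   pick_aca_tier http scale-to-zero     # prints "express"
#   pick_aca_tier http gpu               # prints "standard"
```

Unknown keywords fall through to Express, so the list of blockers is the part to keep current.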