# Karpenter vs Cluster Autoscaler: Which to Use in 2026

> Source: <https://cast.ai/blog/karpenter-vs-cluster-autoscaler/>
> Published: 2026-06-30 16:31:25+00:00

**Karpenter** is an open-source node provisioner that provisions cloud instances directly: no node groups, no intermediary ASG call, just a direct API request for the optimal instance type selected at scheduling time. [Cluster Autoscaler](https://cast.ai/blog/guide-to-kubernetes-autoscaling-for-cloud-cost-optimization/) (CAS) scales pre-defined node groups up and down in response to unschedulable pods. Both solve the same surface problem, adding and removing nodes, but the mechanism is fundamentally different, and that difference shows up on your cloud bill. If you’re on AWS or Azure in 2026, Karpenter is the right default. If you’re on GKE, Cluster Autoscaler is still your only production option.

## The short answer

For most EKS and AKS teams, Karpenter is the better autoscaler in 2026. For GKE, Cluster Autoscaler remains the only production-ready choice.

- Karpenter provisions nodes in 45–60 seconds; Cluster Autoscaler typically takes 3–4 minutes
- Karpenter selects the cheapest fitting instance from any type; CAS is limited to your pre-defined node groups
- Karpenter actively consolidates running workloads onto fewer nodes; CAS does not repack existing pods
- Karpenter supports Spot natively with automatic on-demand fallback; CAS requires separate ASGs and manual mixed-instance policy
- AWS made Karpenter the default engine inside [EKS Auto Mode](https://cast.ai/blog/eks-cluster-autoscaler-6-best-practices-for-effective-autoscaling/); Azure reached GA for AKS Node Auto Provisioning in early 2026
- No official Karpenter provider exists for GCP as of mid-2026; GKE teams stay on CAS

## How Cluster Autoscaler works

CAS polls the Kubernetes API every 10 seconds. When it finds pods in a `Pending` state, meaning no node has enough available capacity to schedule them, it identifies which node group could accommodate those pods and triggers a scale-up event. The scale-up request goes to the cloud provider’s abstraction layer: an Auto Scaling Group on AWS, a Managed Instance Group on GCP, or a Virtual Machine Scale Set on Azure. The cloud provider spins up a node, Kubernetes registers it, and the scheduler places the pods. This roundtrip is why provisioning typically takes 3–4 minutes.

Scale-down works similarly. CAS identifies underutilized nodes, those whose pods could all fit on other existing nodes, and drains them after a configurable idle period (`--scale-down-unneeded-time`, 10 minutes by default). The scale-down utilization threshold (default 50%) compares *requested resources* (pod `resources.requests`) to node capacity, not actual CPU or memory usage. This is why a cluster with 8% actual CPU utilization can still appear ‘busy’ to Cluster Autoscaler: inflated resource requests look like real demand.
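To make the requests-versus-usage distinction concrete, here is a minimal Python sketch of the scale-down check. It is a simplification rather than CAS source code: the function name, the example node, and the choice to take the larger of the CPU and memory ratios are assumptions for illustration, and real CAS applies further eligibility rules before it removes a node.

```python
def is_scale_down_candidate(cpu_requests_m, cpu_allocatable_m,
                            mem_requests_mib, mem_allocatable_mib,
                            threshold=0.5):
    """Return True if *requested* utilization is below the threshold.

    Only pod requests are counted, never measured usage. Using the
    larger of the CPU and memory ratios is an assumption of this sketch.
    """
    cpu_ratio = sum(cpu_requests_m) / cpu_allocatable_m
    mem_ratio = sum(mem_requests_mib) / mem_allocatable_mib
    return max(cpu_ratio, mem_ratio) < threshold


# Hypothetical 4-vCPU / 16 GiB node: pods request 2.4 cores in total,
# even if they actually consume only a small fraction of that.
busy_on_paper = is_scale_down_candidate(
    cpu_requests_m=[1000, 800, 600], cpu_allocatable_m=4000,
    mem_requests_mib=[2048, 1024, 1024], mem_allocatable_mib=16384,
)
print(busy_on_paper)  # False: 60% requested CPU keeps the node alive
```

Measured usage never enters the function, which is exactly why inflated requests keep nodes around.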

The hard constraint is node groups. You define them upfront: one group for `m5.xlarge`, another for `c5.2xlarge`, maybe a third for Spot. CAS can only scale within those boundaries. If your pending pod needs something between those sizes, CAS picks the closest group and wastes headroom. It cannot choose a different instance family on the fly. The `--expander=least-waste` flag helps select which group to grow, but it only applies to new nodes – it does not repack workloads already running on existing machines.

```
containers:
  - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.32.0
    name: cluster-autoscaler
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
      - --balance-similar-node-groups
      - --skip-nodes-with-local-storage=false
      - --expander=least-waste
      - --scale-down-utilization-threshold=0.5
```
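To see what `--expander=least-waste` is choosing between, the sketch below ranks pre-defined node groups by how much CPU a new node would leave idle after placing a pending pod, with idle memory as the tie-breaker. It is an illustration of the idea, not the CAS implementation: `NodeGroup` and `least_waste` are hypothetical names, and the node sizes use nominal instance capacity and ignore system reservations.

```python
from dataclasses import dataclass


@dataclass
class NodeGroup:
    name: str
    cpu_m: int     # CPU per node, millicores (nominal, not allocatable)
    mem_mib: int   # memory per node, MiB (nominal, not allocatable)


def least_waste(groups, pod_cpu_m, pod_mem_mib):
    """Pick the group whose new node leaves the least idle CPU
    (idle memory as tie-breaker) after hosting the pending pod."""
    fitting = [g for g in groups
               if g.cpu_m >= pod_cpu_m and g.mem_mib >= pod_mem_mib]
    if not fitting:
        return None  # no pre-defined group can host the pod at all
    return min(fitting, key=lambda g: ((g.cpu_m - pod_cpu_m) / g.cpu_m,
                                       (g.mem_mib - pod_mem_mib) / g.mem_mib))


groups = [
    NodeGroup("m5.xlarge", cpu_m=4000, mem_mib=16384),
    NodeGroup("c5.2xlarge", cpu_m=8000, mem_mib=16384),
]

# A 5-core pod only fits the larger group, leaving 3 cores of headroom.
print(least_waste(groups, pod_cpu_m=5000, pod_mem_mib=8192).name)  # c5.2xlarge
```

The candidate list is fixed in advance: the function can only choose among the groups it is handed, which is the constraint described above.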

CAS runs as a single replica – it is not horizontally scalable. At very large cluster sizes (500+ nodes with high churn), you will hit CPU and memory pressure on the CAS pod itself. This is a known operational limitation and the main reason platform teams at scale look elsewhere. The [Kubernetes Cluster Autoscaler GitHub](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) documents this explicitly.

## How Karpenter works

Karpenter watches for unschedulable pods, exactly like CAS, but it never talks to a node group. Instead, it calls the cloud API directly – on AWS, that’s the EC2 `CreateFleet` API – and selects the instance type that fits the pod’s requirements at the lowest cost. The selection happens from a broad candidate pool: all instance types allowed by your `NodePool` requirements, filtered by availability and price. Provisioning latency is 45–60 seconds from pod pending to node ready. See the [Karpenter documentation](https://karpenter.sh/docs/) for the full provisioning flow.
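The selection step can be pictured as a filter followed by a minimum. The Python sketch below is a rough illustration under stated assumptions, not Karpenter's scheduler: the `Offering` type, the instance names, and every price are made up, and real selection also batches pending pods and weighs Spot availability.

```python
from dataclasses import dataclass


@dataclass
class Offering:
    instance_type: str   # hypothetical names, not real instance types
    category: str        # e.g. "c", "m", "r"
    cpu_m: int
    mem_mib: int
    hourly_usd: float    # made-up price for illustration only
    available: bool


def cheapest_fit(offerings, allowed_categories, pod_cpu_m, pod_mem_mib):
    """Filter by NodePool-style requirements, availability and size,
    then return the cheapest remaining offering (or None)."""
    candidates = [o for o in offerings
                  if o.available
                  and o.category in allowed_categories
                  and o.cpu_m >= pod_cpu_m
                  and o.mem_mib >= pod_mem_mib]
    return min(candidates, key=lambda o: o.hourly_usd, default=None)


offerings = [
    Offering("m-large-example", "m", 2000, 8192, 0.10, True),
    Offering("c-xlarge-example", "c", 4000, 8192, 0.17, True),
    Offering("r-large-example", "r", 2000, 16384, 0.13, True),
    Offering("c-large-example", "c", 2000, 4096, 0.08, False),  # no capacity
    Offering("t-large-example", "t", 2000, 8192, 0.07, True),   # not allowed
]

best = cheapest_fit(offerings, {"c", "m", "r"}, pod_cpu_m=1500, pod_mem_mib=6000)
print(best.instance_type)  # m-large-example
```

The cheaper `t` and unavailable `c` offerings are skipped, so the pick is the lowest-priced type that is both permitted and large enough.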

The other half of Karpenter’s value is disruption – its umbrella term for voluntarily replacing nodes, which includes active consolidation. With `consolidationPolicy: WhenEmptyOrUnderutilized`, Karpenter continuously evaluates whether running workloads could fit on fewer nodes. When they can, it cordons the candidate node, reschedules the pods elsewhere, and terminates the node. This runs as a background loop, not just at scale-up time. Empty nodes get removed within seconds. Underutilized nodes get consolidated based on `consolidateAfter` – in the example below, one minute.
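The core question consolidation asks, "could these pods live somewhere else?", can be sketched as a first-fit-decreasing check. This is a deliberately small model, assuming CPU requests are the only constraint: the function name is hypothetical, and Karpenter itself also considers memory, scheduling constraints, PodDisruptionBudgets, and replacing a node with a cheaper one.

```python
def can_consolidate(candidate_pods_m, free_cpu_m_by_node):
    """First-fit-decreasing check: can every pod on the candidate node
    be placed into spare CPU on the remaining nodes?"""
    free = dict(free_cpu_m_by_node)  # copy so the caller's data is untouched
    for pod in sorted(candidate_pods_m, reverse=True):
        target = next((n for n, spare in free.items() if spare >= pod), None)
        if target is None:
            return False  # at least one pod has nowhere to go
        free[target] -= pod
    return True


spare = {"node-a": 600, "node-b": 400}  # spare millicores on other nodes

print(can_consolidate([500, 300], spare))  # True: 500 -> node-a, 300 -> node-b
print(can_consolidate([700], spare))       # False: no single node has 700m free
```

When the check passes, the candidate node is one that could be drained and terminated without adding capacity elsewhere.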

Spot support is declarative. You list `spot` and `on-demand` as valid `capacity-type` values in the NodePool requirements. Karpenter selects Spot when it’s available and cheaper, and falls back to on-demand automatically when Spot capacity is unavailable. No separate node groups, no manual mixed-instance policy.

**Karpenter prerequisites:** Karpenter requires an IAM role with EC2 permissions (`RunInstances`, `TerminateInstances`, `DescribeInstances`, among others), plus SQS permissions for interruption handling. Configure this via Pod Identity (preferred on EKS 1.24+) or IRSA. See the [Karpenter getting-started guide](https://karpenter.sh/docs/getting-started/getting-started-with-karpenter/) for the complete IAM setup.

```
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"]
          minValues: 2
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["spot", "on-demand"]
        - key: "kubernetes.io/arch"
          operator: In
          values: ["amd64", "arm64"]
        - key: "karpenter.k8s.aws/instance-generation"
          operator: Gte
          values: ["5"]
      expireAfter: 720h
      terminationGracePeriod: 48h  # Upper bound on drain time before forced termination; if unset, Karpenter waits indefinitely for pods to drain
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
    budgets:
      - nodes: 10%
      - schedule: "0 9 * * mon-fri"  # UTC — adjust for your local timezone
        duration: 8h
        nodes: "0"
  limits:
    cpu: "1000"
    memory: 1000Gi
```

The `nodeClassRef` above points to an `EC2NodeClass` named `default`. That resource must also exist in your cluster. Here is a minimal definition:

```
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2023
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
```

The EC2NodeClass defines the AMI, subnets, and security groups. Replace `${CLUSTER_NAME}` with your EKS cluster name.

After applying both resources, verify Karpenter is active and watch provisioning events:

```
# Verify Karpenter NodePool is active
kubectl get nodepools

# Watch NodeClaims as pods are scheduled (shows provisioning events)
kubectl get nodeclaims -A -w

# Check Karpenter controller logs for provisioning decisions
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter --tail=50
```

A few operational notes for scale: the `budgets` field controls blast radius. The example above caps voluntary disruption at 10% of nodes at any time and freezes disruption entirely during business hours (09:00–17:00 UTC) Monday through Friday. In 500-node clusters, this matters – unconstrained consolidation can create cascading rescheduling waves that degrade pod startup latency. Set budgets deliberately.
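The arithmetic behind those two budgets is easy to check by hand. The Python sketch below assumes that the most restrictive active budget wins and that percentage budgets round up; both are assumptions worth confirming against the Karpenter disruption docs for your version. It also replaces cron parsing with a plain `window_active` flag and ignores nodes that are already deleting or not ready.

```python
import math


def allowed_disruptions(total_nodes, budgets, window_active):
    """Return how many nodes may be voluntarily disrupted right now.

    Assumes at least one budget has no schedule, so the list of active
    limits is never empty. Percentages round up (assumption).
    """
    limits = []
    for budget in budgets:
        if budget.get("schedule") and not window_active:
            continue  # a scheduled budget only applies inside its window
        value = budget["nodes"]
        if value.endswith("%"):
            limits.append(math.ceil(total_nodes * int(value[:-1]) / 100))
        else:
            limits.append(int(value))
    return min(limits)  # the most restrictive active budget wins


# The two budgets from the NodePool example above.
budgets = [
    {"nodes": "10%"},
    {"nodes": "0", "schedule": "0 9 * * mon-fri", "duration": "8h"},
]

print(allowed_disruptions(500, budgets, window_active=False))  # 50
print(allowed_disruptions(500, budgets, window_active=True))   # 0
```

At 500 nodes, that is up to 50 nodes in flight outside the window and none inside it, which is a useful number to sanity-check against how much rescheduling your workloads tolerate.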

## Cost: bin packing and consolidation

Average CPU utilization across Kubernetes clusters is just 8% – down from 10% year over year – while CPU overprovisioning jumped from 40% to 69%, according to the Cast AI 2026 State of Kubernetes Optimization Report, measured across tens of thousands of production clusters on AWS, GCP, and Azure. Those numbers are getting worse, not better. The choice of autoscaler is one direct input to that trajectory.

Cluster Autoscaler’s bin packing applies only at node addition time. When CAS scales up, its expander strategy (`least-waste`, `most-pods`, etc.) selects which node group to grow. But once a node is running and workloads drift – pods terminate, requests change, new pods land – CAS does not repack. The existing layout stays as-is until a node drops below the scale-down utilization threshold. At the 50% default, a node whose requests sit just above half of its capacity is never reclaimed, so you can end up paying for close to half of that node while it stays idle.

Karpenter’s `WhenEmptyOrUnderutilized` policy changes this. It actively evaluates whether existing workloads could be consolidated onto fewer nodes right now, not just at the next scale-up event. Combined with instance flexibility – picking from dozens of instance types instead of one pre-defined group – Karpenter’s effective utilization is structurally higher.

Even so, Karpenter alone does not solve the whole problem. A Cast AI benchmark measured a Karpenter baseline (default `WhenEmptyOrUnderutilized` consolidation, without workload rightsizing) at roughly $703/week; adding Cast AI’s full optimization stack – which includes pod-level rightsizing on top of node consolidation – dropped costs to $400.83/week, a 43% reduction. Karpenter with Evictor saves 9.1% over that baseline; the full stack reaches 43%. [See the full benchmark methodology](https://cast.ai/blog/karpenter-cost-optimization-consolidation-benchmark/). Karpenter is a better starting point than CAS, but active rightsizing at the pod level captures savings that node-level consolidation cannot reach. The utilization gap – 69% overprovisioning against 8% actual utilization – compounds annually (see Cast AI’s [complete autoscaling guide](https://cast.ai/blog/guide-to-kubernetes-autoscaling-for-cloud-cost-optimization/) for the full optimization ladder).
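The 43% figure follows directly from the two weekly numbers quoted above. As a quick check in Python, treating the "roughly $703" baseline as exactly 703.00 (an assumption of this snippet):

```python
baseline_weekly = 703.00    # Karpenter baseline, "roughly $703/week"
optimized_weekly = 400.83   # with the full optimization stack

saving = baseline_weekly - optimized_weekly
reduction_pct = saving / baseline_weekly * 100

print(f"${saving:.2f}/week saved, {reduction_pct:.1f}% reduction")
# $302.17/week saved, 43.0% reduction
```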

## Karpenter vs Cluster Autoscaler: comparison table

Use this table as the **Three Autoscaling Dimensions** framework: Speed (how fast does a node appear?), Flexibility (which instances can it pick?), and Consolidation (does it actively repack existing workloads?). Those three dimensions explain most of the cost and operational difference between the two autoscalers.

| Dimension | Karpenter (v1.13.0) | Cluster Autoscaler (v1.32.x) |
|---|---|---|
| Provisioning model | Direct cloud API call; no node group required | Scales pre-defined node groups (ASG / MIG / VMSS) |
| Speed | 45–60 seconds to ready | 3–4 minutes typical |
| Instance flexibility | Any instance type matching NodePool requirements, selected at runtime | Limited to pre-defined node groups |
| Consolidation / bin packing | Active – continuously repacks running workloads (`WhenEmptyOrUnderutilized`) | Passive – expander strategies apply only to new nodes; no repack of existing pods |
| Spot support | Native; declare `spot` in NodePool requirements, automatic on-demand fallback | Supported but requires separate Spot ASGs and manual mixed-instance policy |
| Cloud support | AWS (full), Azure (GA early 2026), GCP (no official provider) | AWS, GCP, Azure, OpenStack, Alibaba, IBM, Cluster API |
| Complexity | Higher initial setup; NodePool + EC2NodeClass CRDs; disruption budget tuning required | Lower initial setup; tag-based node group discovery; well-understood operational model |
| Scalability | Event-driven; scales well to large clusters | Single replica; can become a bottleneck at very large scale |
| Cost | Lower on AWS/Azure due to instance flexibility and active consolidation | Higher structural waste; does not repack; expanders help marginally |

## When to choose which

### Choose Karpenter if you’re on EKS or AKS

AWS standardized on Karpenter as the engine inside [EKS Auto Mode](https://cast.ai/blog/deploy-karpenter-eks-node-autoscaling/) (launched late 2024). If you’re starting a new EKS cluster, Karpenter is effectively the default – you would need to explicitly opt out and configure CAS instead. Azure reached GA for AKS Node Auto Provisioning (the karpenter-provider-azure) in early 2026. For both clouds, Karpenter is the production-supported path.

One caveat on EKS Auto Mode specifically: it runs Karpenter managed by AWS, which means Bottlerocket OS only, no SSH access to nodes, extra per-instance pricing, and automatic node rotation approximately every 21 days. If you need custom AMIs or direct node access, run self-managed Karpenter instead.

**Production Spot note:** For EKS clusters using Spot instances, Karpenter requires an SQS queue and EventBridge rules to receive EC2 Spot interruption notices. Without this setup, Karpenter never sees the two-minute interruption warning, so it cannot drain the node ahead of reclamation and pods may receive a SIGKILL with no grace period. Configure the interruption queue at install time with `--set settings.interruptionQueue=${QUEUE_NAME}` in the Helm install, plus an EventBridge rule matching `aws.ec2` events with detail-type `EC2 Spot Instance Interruption Warning`. See [Karpenter interruption docs](https://karpenter.sh/docs/concepts/disruption/#interruption).

### Choose Cluster Autoscaler if you’re on GKE

There is no official karpenter-provider-gcp in the kubernetes-sigs organization as of mid-2026. GKE teams are limited to CAS and GKE’s native Node Auto-Provisioning (which uses CAS internally). If your fleet is GCP-only, this decision is made for you. On GKE, tune your Cluster Autoscaler expander strategy (`--expander=priority` with a configured priority map) for more predictable bin-packing, or explore [GKE’s Node Auto-Provisioning](https://cloud.google.com/kubernetes-engine/docs/concepts/node-auto-provisioning) for more dynamic node group creation within GKE’s ecosystem.

### Migrating from CAS to Karpenter

The [Karpenter documentation](https://karpenter.sh/docs/) includes a migration guide from CAS. You can run both simultaneously, but without explicit partitioning both controllers may attempt to act on the same pending pods, creating conflicts. Here are the concrete steps for a safe, incremental migration:

- Create a Karpenter NodePool with a taint to segregate new nodes: add `taints: [{key: "karpenter.sh/managed", effect: NoSchedule}]` to the NodePool’s `spec.template.spec`.
- Taint existing Karpenter-managed nodes so that workloads still handled by CAS do not land on them: `kubectl taint nodes -l karpenter.sh/nodepool=default karpenter.sh/managed:NoSchedule`
- Add the matching toleration to workloads you’re migrating first.
- Verify no CAS-managed nodes remain by listing nodes that lack the Karpenter label (any node listed is still outside Karpenter’s control): `kubectl get nodes -l '!karpenter.sh/nodepool'`
- Once all workloads are on Karpenter-managed nodes, scale the CAS deployment to 0: `kubectl scale deployment cluster-autoscaler -n kube-system --replicas=0`

Plan for disruption budget tuning – the default settings work for most workloads, but stateful applications and ML jobs need explicit `budgets` and `terminationGracePeriod` configuration.

**PDB audit before enabling consolidation:** If your workloads have PodDisruptionBudgets with `maxUnavailable: 0` or `minAvailable` set to the total replica count, Karpenter consolidation will stall silently – the eviction is rejected but no error is surfaced. Run `kubectl get pdb -A` before enabling `WhenEmptyOrUnderutilized` and verify PDBs allow at least one pod to be temporarily disrupted.
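For clusters with many PDBs, the audit is easier to script than to read by eye. The sketch below is one possible approach, assuming you export the PDB list as JSON (for example with `kubectl get pdb -A -o json`) and load it into a dictionary; it flags every PDB whose `status.disruptionsAllowed` is currently zero. The function name and sample data are illustrative. A zero can also be temporary, for instance while pods are unhealthy, so treat hits as things to inspect rather than as definite misconfigurations.

```python
def blocking_pdbs(pdb_list):
    """Return 'namespace/name' for PDBs that currently allow zero disruptions."""
    blocked = []
    for item in pdb_list.get("items", []):
        allowed = item.get("status", {}).get("disruptionsAllowed", 0)
        if allowed == 0:
            meta = item["metadata"]
            blocked.append(f'{meta["namespace"]}/{meta["name"]}')
    return blocked


# Hand-written sample in the shape of `kubectl get pdb -A -o json` output.
sample = {
    "items": [
        {"metadata": {"namespace": "payments", "name": "api-pdb"},
         "spec": {"maxUnavailable": 0},
         "status": {"disruptionsAllowed": 0}},
        {"metadata": {"namespace": "web", "name": "frontend-pdb"},
         "spec": {"maxUnavailable": 1},
         "status": {"disruptionsAllowed": 1}},
    ]
}

print(blocking_pdbs(sample))  # ['payments/api-pdb']
```

Against a live cluster, pass the parsed JSON output (via `json.load`) to `blocking_pdbs` instead of the sample dictionary.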

## Beyond autoscaling: when the node autoscaler isn’t enough

Karpenter operates at the node level – it sees what pods *request*, not what they actually consume. A pod with `resources.requests.cpu: 2` that runs at 0.2 cores makes its node look busy to any node autoscaler. Consolidation won’t touch it. At 8% average CPU utilization across production clusters, most of the waste lives in the pod spec, not the node inventory, and no amount of bin packing fixes that.

Cast AI adds pod-level rightsizing and Spot prediction on top of Karpenter’s provisioning engine. Rightsizing continuously adjusts resource requests to match actual consumption, shrinking the effective footprint that Karpenter then consolidates further. Cast AI’s Karpenter Enterprise Suite (KENT) adds these capabilities as an integration layer on top of Karpenter without replacing it; the full Cast AI autoscaler can also replace Karpenter entirely for teams that want a single unified stack.

The benchmark difference is measurable: Karpenter with Evictor saves 9.1% over baseline; KENT reaches 15.8%; the full Cast AI optimization stack reaches 43%. [See the full benchmark methodology](https://cast.ai/blog/karpenter-cost-optimization-consolidation-benchmark/). The gap between 9.1% and 43% is almost entirely pod-level rightsizing – savings that node consolidation cannot capture alone.

If you want to know what the remaining savings look like in your cluster, [request a cost audit from Cast AI](https://cast.ai/get-demo) – we’ll run it against your actual cluster before you commit to anything.

## FAQs

### What is the difference between Karpenter and Cluster Autoscaler?

Cluster Autoscaler adds and removes nodes within pre-defined node groups based on pending pods. Karpenter provisions any instance type directly from cloud capacity, selecting the optimal size just-in-time, and actively consolidates running workloads onto fewer machines. Karpenter bypasses the node group abstraction entirely, which is both why it’s faster and why it requires more deliberate configuration of NodePool requirements and disruption budgets.

### Is Karpenter cheaper than Cluster Autoscaler?

Typically yes, on AWS, for three reasons: better bin packing through active consolidation, native Spot support with automatic fallback, and instance flexibility that allows selecting the cheapest fitting type from a wide pool. In Cast AI’s benchmark, adding a full optimization stack on top of a Karpenter baseline – including pod-level rightsizing on top of node consolidation – cut costs by 43%. [See the full benchmark methodology](https://cast.ai/blog/karpenter-cost-optimization-consolidation-benchmark/). Cluster Autoscaler’s expander strategies only affect new node selection and have no effect on the existing workload layout. That said, neither tool addresses overprovisioning at the pod level, which is where a significant portion of waste accumulates.

### Does Karpenter work outside AWS?

Karpenter reached GA on Azure via AKS Node Auto Provisioning in early 2026. There is no official karpenter-provider-gcp in the kubernetes-sigs organization as of mid-2026. If you run GKE, Cluster Autoscaler (or GKE’s Node Auto-Provisioning, which is CAS-backed) is your only production-grade option. Community-maintained providers exist for other clouds but are not suitable for production without significant evaluation.

### Can I run Karpenter and Cluster Autoscaler at the same time?

Yes. Use node selectors and taints to partition which autoscaler manages which pods and nodes. Karpenter’s documentation includes a migration guide from CAS with specific steps for this partitioned approach. Running both without explicit partitioning risks conflicts: both controllers may attempt to act on the same unschedulable pods, leading to duplicate provisioning attempts or unexpected scale-down behavior. Partition first, then migrate workloads incrementally.