Inside the cloud's new agentic AI-ready, Arm-powered foundation

Spotify reported 250% better performance on Google Cloud Axion processors built on Arm architecture, as hyperscalers like AWS, Microsoft, and Google shift to Arm-based compute for AI workloads. Arm's Neoverse platform now powers about half of all compute shipped to top hyperscalers, driven by the need for energy-efficient, high-performance infrastructure to support AI training and inference.

When Spotify evaluated its cloud compute options, it needed more than incremental improvements. Its recommendation engine delivers real-time suggestions to millions of users around the clock, placing heavy demands on compute infrastructure while requiring tight control over energy use and costs. During its evaluation of next-generation cloud processors, Spotify found that workloads running on Google Cloud Axion processors built on Arm architecture https://cloud.google.com/blog/products/compute/try-c4a-the-first-google-axion-processor delivered roughly 250 percent better performance.

Axion is just one part of a broader shift toward Arm-based compute built on the Neoverse architecture, which has been adopted across all major hyperscale cloud platforms. AWS reports that its Arm-based Graviton processors have accounted for over half of new CPU capacity deployed over the past three years. https://www.aboutamazon.com/news/aws/aws-re-invent-2025-ai-news-updates Microsoft and Google have followed with their own Arm-based designs, including Azure Cobalt and Axion, while NVIDIA’s Grace https://www.nvidia.com/en-us/data-center/gb300-nvl72/ and Vera https://newsroom.arm.com/blog/arm-rubin-converged-ai-datacenter?utm source=theregister&utm medium=sponsored-content&utm content=longform txt register blog&utm campaign=mk35 cloudai cloud-ai thirdparty mediabuy na CPUs signal that the company sees Arm as central to the future of AI infrastructure. About half of the compute now shipped to top hyperscalers is Arm-based.

Purpose-built for customers

Hyperscalers are not only deploying Arm processors but also designing silicon and infrastructure together to reflect real usage patterns. Ninety-eight percent of the top 1,000 Amazon EC2 customers run production workloads on Graviton and benefit from its price-performance advantages compared to x86.
Microsoft’s new Cobalt 200 processor, built on Arm Neoverse technology, was engineered using telemetry from real Azure workloads and an internal suite of benchmark variants to reflect production behavior. Google is pursuing its own strategy with Axion processors, with C4A instances delivering up to 65 percent better price-performance and up to 60 percent greater energy efficiency than comparable x86 systems.

At the core of this shift is Arm’s Neoverse platform, a datacenter-focused architecture designed to enable high-performance, energy-efficient compute at hyperscale. Neoverse marks Arm’s evolution from a mobile-first architecture to a platform purpose-built for cloud and AI infrastructure. It provides the common foundation hyperscalers use to design custom silicon optimized for their own workloads, allowing providers to tailor performance, power, and system behavior to meet specific application demands.

While this momentum is driven by hyperscaler adoption, it is rooted in a broader change in how compute infrastructure must operate to support AI workloads. Traditional enterprise workloads emphasized predictable CPU utilization and storage throughput. AI changes that equation. Modern workloads require simultaneous optimization across training, inference, networking, and storage performance while minimizing energy consumption and latency. Even minor inefficiencies can become costly at scale.

Power consumption now represents a significant portion of datacenter operating costs, which means performance per watt has become a primary design metric. According to an IDC report https://www.vertiv.com/49437d/globalassets/documents/reports/data-center-vision-how-data-center-infrastructure-will-evolve-to-support-ai-and-accelerated-compute us53192425-ib.pdf , AI-ready datacenters are seeing rapid increases in power density, with rack requirements rising from typical levels of 5–10 kW to 30 kW or more, and in some cases exceeding 100 kW per rack.
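To make the link between rack power budgets and performance per watt concrete, here is a minimal Python sketch. The rack budgets of 10 kW, 30 kW, and 100 kW echo the ranges cited above; everything else is an assumption made for illustration. The two server profiles, their power draw, and their throughput figures are hypothetical and do not describe any real processor or vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerProfile:
    """Hypothetical server; the figures are illustrative, not vendor data."""
    name: str
    power_watts: int        # assumed sustained power draw per server
    requests_per_sec: int   # assumed sustained throughput per server

    @property
    def perf_per_watt(self) -> float:
        """Throughput delivered for each watt consumed."""
        return self.requests_per_sec / self.power_watts


def servers_in_rack(rack_budget_watts: int, server: ServerProfile) -> int:
    """Whole servers that fit inside the rack's power budget."""
    if rack_budget_watts <= 0 or server.power_watts <= 0:
        raise ValueError("power figures must be positive")
    return rack_budget_watts // server.power_watts


def rack_throughput(rack_budget_watts: int, server: ServerProfile) -> int:
    """Total throughput of a rack filled with one server type."""
    return servers_in_rack(rack_budget_watts, server) * server.requests_per_sec


# Assumed profiles: server_a is faster per box, server_b is better per watt.
SERVER_A = ServerProfile("server_a", power_watts=800, requests_per_sec=40_000)
SERVER_B = ServerProfile("server_b", power_watts=500, requests_per_sec=30_000)

# Rack power budgets in watts, matching the 10 kW, 30 kW, and 100 kW levels.
RACK_BUDGETS_WATTS = (10_000, 30_000, 100_000)

if __name__ == "__main__":
    for budget in RACK_BUDGETS_WATTS:
        for server in (SERVER_A, SERVER_B):
            print(
                f"{budget // 1000:>3} kW rack, {server.name}: "
                f"{servers_in_rack(budget, server)} servers, "
                f"{rack_throughput(budget, server):,} req/s, "
                f"{server.perf_per_watt:.0f} req/s per watt"
            )
```

With these made-up figures, server_b is slower per box but delivers more work per watt, so a 30 kW rack holds 60 of them against 37 of server_a and serves more requests in total. That is the sense in which a fixed power budget turns performance per watt into the limiting metric.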
These constraints are forcing organizations to rethink how compute, networking, storage, and cooling systems are designed and integrated at the rack level. These pressures are also collapsing traditional boundaries between compute, networking, storage, and acceleration, creating tightly integrated systems optimized for end-to-end performance. This is driving cloud providers to adopt purpose-built silicon and architectures designed specifically for modern workloads.

Real-world efficiency gains drive adoption

These design choices are translating into measurable improvements in production environments. Organizations migrating workloads to Arm-based infrastructure are reporting gains across performance, efficiency, and cost:

Databricks is using Azure Cobalt 100 virtual machines, built on Microsoft’s Arm-based CPU architecture, which are designed to optimize data-intensive and AI workloads and deliver up to 50 percent better price-performance compared to previous generations, along with improvements in query speed and latency for analytics applications. For organizations running large-scale data pipelines to power machine learning and business intelligence workloads, these gains translate directly into faster processing and lower infrastructure costs.

Pinterest provides a clear example of how Arm adoption can improve both cost efficiency and sustainability at scale. As a platform serving more than half a billion monthly active users and running AI-driven discovery workloads, Pinterest relies heavily on large-scale cloud infrastructure. By migrating workloads to AWS Graviton-based instances, the company achieved 38 percent savings on compute resources and 47 percent cost savings for key workloads, while also reducing carbon emissions by 62 percent. These improvements support both performance and sustainability goals, showing how infrastructure decisions can directly impact operational efficiency and environmental footprint.

Uber’s transition to a multi-architecture environment highlights the operational realities of adopting Arm at scale. The company migrated more than 2,800 services and shifted nearly 20 percent of its infrastructure capacity from x86 to Arm-based processors, requiring updates to codebases, dependencies, and deployment pipelines. Through phased rollout, benchmarking, and continuous monitoring, Uber demonstrated that Arm can coexist with other architectures while improving price-performance and supporting a more flexible, efficient infrastructure model.

Atlassian’s migration of Jira and Confluence to AWS Graviton highlights how Arm adoption can improve performance and efficiency at enterprise scale. The company moved more than 3,000 instances to Graviton-based infrastructure, achieving the transition with minimal impact on users. In production, instance counts dropped by around 30 percent, while throughput improved by up to 30 percent and latency decreased across key metrics. These gains demonstrate how optimizing infrastructure for performance per watt can enhance both user experience and cost efficiency at scale.

These improvements span media streaming, data platforms, and large-scale consumer services, where gains in latency, throughput, and compute efficiency translate directly into lower infrastructure costs and improved user experience. They are particularly significant for AI inference, real-time personalization, and continuously running workloads.

The converged AI datacenter

The rise of agentic AI is transforming the datacenter into an integrated system in which CPUs, accelerators, networking, and storage operate as a unified platform. In these environments, CPUs serve as the control plane, coordinating scheduling, data movement, memory access, and system services, while accelerators handle compute-intensive training and inference tasks. In this model, efficiency is measured across the entire rack and datacenter footprint.
AI workloads demand higher compute density while operating within fixed power and cooling limits, making the ability to maximize compute output per unit of space increasingly important. Coordinating CPUs, accelerators, memory, and networking as a unified system reduces bottlenecks and minimizes wasted energy from unnecessary data movement. Arm’s architecture spans these layers, enabling providers to optimize the full stack while maintaining software compatibility and ecosystem consistency.

This cohesion is driving the emergence of the converged AI datacenter, with tightly coupled CPUs and accelerators at its center. NVIDIA’s Grace Blackwell and Vera Rubin https://newsroom.arm.com/blog/arm-rubin-converged-ai-datacenter?utm source=theregister&utm medium=sponsored-content&utm content=longform txt register blog&utm campaign=mk35 cloudai cloud-ai thirdparty mediabuy na platforms combine Arm CPUs with high-performance GPU accelerators in rack-level solutions, reflecting a broader industry move toward tightly integrated AI systems. In another example, AWS’s Trainium3 UltraServers https://www.aboutamazon.com/news/aws/trainium-3-ultraserver-faster-ai-training-lower-cost pair Arm-based Graviton CPUs with Trainium accelerators and Nitro networking components to support large-scale AI workloads. Similarly, Google’s latest TPU 8t and TPU 8i training and inference superpods https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/ are powered by Arm-based Axion CPUs, extending this trend toward purpose-built AI infrastructure optimized for scale, performance, and efficiency. In these architectures, Arm-based CPUs serve as the control layer, orchestrating data flow between accelerators, memory, and networking while simplifying development and driving optimization across software stacks and developer tooling.

Migration realities: less friction than before

Migration complexity has historically slowed adoption of new architectures. Today, improved tooling and ecosystem maturity are lowering that barrier. The Arm MCP Server https://developer.arm.com/servers-and-cloud-computing/arm-mcp-server integrates migration tools, compatibility checks, and performance analysis directly into AI-assisted workflows, helping developers analyze codebases, validate dependencies, and build multi-architecture environments. Programs such as the Arm Cloud Migration Program https://www.arm.com/markets/cloud-ai/arm-cloud-migration are also helping organizations accelerate this transition by providing guidance, validation, and tooling for production workloads.

Arm adoption is supported by expanding software compatibility and platform support. Arm-based environments now support major Linux distributions, container platforms, and modern development frameworks. The ecosystem has matured significantly, enabling developers to focus less on compatibility and more on performance optimization. Arm’s ecosystem now spans more than 22 million developers worldwide. For developers, this shift means building and optimizing applications for multi-architecture environments, with greater emphasis on efficiency, concurrency, and performance tuning.

Where cloud compute is heading

Purpose-built compute is becoming the default model for AI-era infrastructure. As performance improvements outpace increases in power consumption and cost, the economics of cloud computing are shifting toward efficiency-driven architectures. Looking ahead, this evolution is also extending to enterprise environments.
Arm’s recently introduced Arm AGI CPU https://www.arm.com/company/arm-everywhere?utm source=google&utm medium=cpc&utm content=text txt na brand-event&utm campaign=mk30 brand-paid everywhere keyword traffic brand&utm term=arm%20agi%20cpu&gad source=1&gad campaignid=23672950280&gbraid=0AAAAADLDCY8Lr5T3Y4WZNJnf6FnSC af2&gclid=CjwKCAjw8uTQBhAdEiwAVvtJyu9m9zMt6tNitys7PBUjFpnp47TDc0tAYhZDChfzrVOybEFmMVdkQRoCdbkQAvD BwE is designed specifically for the next generation of AI-driven workloads, combining high single-thread performance with scalable throughput, compute density, and rack-level efficiency. Built on the Neoverse platform, it reflects the shift toward Arm CPUs that are not only optimized for general-purpose compute, but also engineered to orchestrate increasingly complex, agentic AI systems across the datacenter.

Enterprises are increasingly evaluating infrastructure based on cost per workload, energy consumption, and the ability to scale within power and cooling constraints. This is driving demand for architectures that deliver predictable performance and efficiency across diverse workloads.

Arm Neoverse’s growing momentum across hyperscalers, silicon vendors, and ecosystem partners reflects a broader realignment around efficiency, scalability, and system-level optimization. As AI workloads expand, infrastructure decisions will be shaped less by raw compute capacity and more by how efficiently systems can deliver performance at scale. The organizations redesigning cloud infrastructure today are not simply choosing new processors; they are adopting a compute foundation built for the demands of the AI era.

Sponsored by Arm.