vLLM and PyTorch Work Together to Improve the Developer Experience on aarch64

PyTorch 2.11 now enables direct installation of CUDA-enabled PyTorch wheels on aarch64 Linux from PyPI, eliminating the need for custom package indexes and workarounds that previously complicated deployment on NVIDIA GH200, GB200, and GB300 systems. The packaging change improves the installation experience for vLLM users by preventing pip from silently replacing GPU builds with CPU-only wheels, a problem that forced vLLM to ship workarounds for over a year. Collaboration between vLLM and PyTorch through the PyTorch Foundation brought the fix to production after two years of development.

TLDR: PyTorch 2.11 makes it possible to install CUDA-enabled PyTorch wheels on aarch64 Linux directly from PyPI, eliminating the need for custom package indexes and workarounds that previously complicated deployment on systems such as NVIDIA GH200, GB200, and GB300. In this post, Kaichao You (Inferact) explains how this packaging change improves the installation experience for vLLM users and highlights how collaboration between vLLM and PyTorch through the PyTorch Foundation helped bring the fix to production.

A fix, two years in the making, that makes life much easier on GB200 / GB300 / GH200.

An issue I first hit at a hackathon

This story actually starts back in October 2024. I was at the CUDA MODE (now GPU MODE) IRL hackathon, trying to get vLLM running on a GH200 box. It should have been a five-minute job. Instead, I spent a frustrating chunk of the day staring at a pip install that, on the surface, looked perfectly fine: wheels were resolved, dependencies were satisfied, and the install completed without errors. At runtime, though, torch.cuda.is_available() stubbornly returned False.

The reason, once I dug in, was almost comically mundane: on aarch64 Linux, pip install torch was pulling the CPU-only wheel from PyPI. There simply was no GPU wheel for aarch64 published to the default PyPI index.
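When torch.cuda.is_available() returns False, it helps to know whether the installed wheel was built without CUDA at all or whether a CUDA wheel simply cannot see a GPU. The following is a minimal sketch of such a check, not part of vLLM or PyTorch: the helper names classify_build and diagnose_torch are hypothetical, and it relies on torch.version.cuda being None on CPU-only builds.

```python
import importlib.util
import platform


def classify_build(cuda_version, cuda_available):
    """Classify a PyTorch install from two facts about it.

    cuda_version:   the value of torch.version.cuda (None on CPU-only builds).
    cuda_available: the value of torch.cuda.is_available().
    """
    if cuda_version is None:
        # The wheel itself was compiled without CUDA support.
        return "cpu-only wheel"
    if not cuda_available:
        # CUDA wheel, but no GPU or driver is visible at runtime.
        return "cuda wheel, but no usable GPU/driver"
    return "cuda wheel, GPU usable"


def diagnose_torch():
    """Return a small report dict, or None if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the helper works without torch

    return {
        "machine": platform.machine(),  # e.g. "aarch64" on GH200
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,
        "status": classify_build(torch.version.cuda, torch.cuda.is_available()),
    }
```

Calling print(diagnose_torch()) right after an install shows which wheel was actually picked up. In the situation described above, the status would be "cpu-only wheel" even though the install itself reported no errors.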
To get a CUDA-enabled build, you had to explicitly point pip at the PyTorch download index:

    pip install torch --index-url https://download.pytorch.org/whl/cu128

That, by itself, would be only mildly annoying. The real damage came from how this interacted with transitive dependencies. PyPI does not let a package specify a custom index for its dependencies. So if any package in vLLM’s dependency tree declared a requirement of torch==