The Nvidia H100 was the workhorse behind nearly every major language model trained between 2023 and 2025, and in 2026 it remains a central line item in any AI infrastructure budget. But H100 pricing is famously hard to pin down: there is no clean sticker price, rental rates swing widely by provider, and newer GPUs like the H200 and B200 are reshaping the value calculation. This guide lays out Nvidia H100 pricing in 2026 across buying, renting, and cloud, compares it to the rest of the lineup, and gives you a framework for the buy-versus-rent decision.
Key takeaway: A single Nvidia H100 80GB costs roughly $30,000 to $40,000 to buy in 2026. Cloud rentals range from about $1 per GPU-hour on neo-cloud spot capacity up to $7.50 or more on hyperscalers, with specialized GPU clouds typically 50 to 75% cheaper than AWS, Azure, or Google for the same hardware. The H200 often beats the H100 on both price and performance for memory-bound inference, so check it before defaulting to H100.
How Much Does an Nvidia H100 Cost to Buy?
Buying outright is a major capital expense. A single H100 80GB GPU typically runs $30,000 to over $40,000, depending on the form factor (PCIe or SXM), vendor, and market demand. Nvidia does not publish formal list prices for these accelerators, so most figures come from resellers and leaks, which is part of why small teams struggle to predict GPU costs.
That price reflects what the card actually is: a die built on TSMC's 4N (4nm-class) process, 80GB of HBM3 memory that alone costs several thousand dollars, power delivery of up to 700W on the SXM version, NVLink interconnects, and full data-center validation. At the server level, an 8-GPU H100 board has been estimated at around $216,000, or about $27,000 per GPU. Owning hardware also carries power, cooling, and operational overhead that belongs in any honest cloud versus on-premise comparison.
Nvidia H100 Cloud Rental Pricing in 2026
Renting is where most teams actually consume H100 capacity, and the spread is enormous. Representative on-demand and spot rates per GPU-hour in 2026 vary widely by provider type:
Neo-cloud spot instances start from around $1.03 per GPU-hour and are the cheapest option, though they are preemptible and best suited to fault-tolerant workloads.
Specialized GPU cloud providers generally charge between $2.00 and $4.39 per GPU-hour and offer both on-demand and reserved cluster options.
AWS on-demand pricing typically ranges from approximately $3.93 to $6.88 per GPU-hour, reflecting hyperscaler-grade reliability and integrations.
Google Cloud is comparatively competitive among hyperscalers at around $3.00 per GPU-hour.
Microsoft Azure sits at the high end, with rates around $12.29 per GPU-hour, making it the most expensive option but one that is often selected for high-availability requirements.
The pattern is consistent: hyperscalers are not the cheapest option for any GPU class in 2026. The lowest rates come from neo-clouds and marketplaces, and for interruption-tolerant workloads spot pricing leads. For workloads that cannot be interrupted, on-demand rates across the specialized providers tend to sit within about 20% of each other, so regional availability often matters more than the headline hourly cost.
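To make the spread concrete, here is a minimal sketch that turns the hourly rates quoted above into a monthly bill. The 8-GPU node size, the 730-hour month, and the choice of the low end of each quoted range are illustrative assumptions, not live quotes, and the function names are hypothetical.

```python
# Representative 2026 per-GPU-hour rates quoted in this guide (USD).
# Where the guide gives a range, the low end is used; these are
# illustrative inputs, not live quotes.
RATES_PER_GPU_HOUR = {
    "neo-cloud spot": 1.03,
    "specialized GPU cloud": 2.00,
    "Google Cloud": 3.00,
    "AWS on-demand": 3.93,
    "Microsoft Azure": 12.29,
}

HOURS_PER_MONTH = 730  # assumption: average hours in a calendar month


def monthly_cost(rate_per_gpu_hour, gpus=8, utilization=1.0):
    """Cost of running `gpus` GPUs for one month at the given utilization."""
    return rate_per_gpu_hour * gpus * HOURS_PER_MONTH * utilization


def savings_vs(baseline_rate, candidate_rate):
    """Fractional saving of the candidate relative to the baseline (0.25 == 25%)."""
    return 1.0 - candidate_rate / baseline_rate


if __name__ == "__main__":
    baseline = RATES_PER_GPU_HOUR["AWS on-demand"]
    # Print providers from cheapest to most expensive.
    for provider, rate in sorted(RATES_PER_GPU_HOUR.items(), key=lambda kv: kv[1]):
        print(
            f"{provider:<24} ${rate:>6.2f}/h  "
            f"${monthly_cost(rate):>10,.0f}/month (8 GPUs)  "
            f"{savings_vs(baseline, rate):>+7.0%} vs AWS low end"
        )
```

Swap in the quotes you actually receive; the point is to compare monthly totals for the cluster size you need rather than headline hourly rates.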
H100 vs A100 vs H200 vs B200
The H100 no longer sits alone. Understanding where it fits against the rest of the lineup is the key to not overpaying.
The A100 80GB comes with 80GB of HBM2e memory, carries a lower purchase price than the H100, and typically rents for between $1.29 and $2.50 per GPU-hour.
The H100 80GB uses 80GB of HBM3 memory, costs approximately $30,000 to $40,000 or more to purchase, and rents for roughly $1 to $7.50+ per GPU-hour.
The H200 increases memory capacity significantly to 141GB of HBM3e, is priced modestly above the H100 when purchased, and typically rents for between $2.30 and $10.60 per GPU-hour.
Nvidia's B200 (Blackwell) offers even higher memory capacity, generally costs between $30,000 and $50,000 to buy, and rents for approximately $2.12 to $18.00 per GPU-hour.
When each one wins
A100 is cheaper per hour, but the H100 delivers 3 to 5x better throughput on transformer workloads via its Transformer Engine. Cost per training run, not per hour, is what matters; a faster H100 job can be cheaper overall.
H200 has 76% more memory than the H100 (141GB vs 80GB) and more bandwidth, and starts cheaper per hour from some providers. For memory-bound inference, it is often the better buy on both price and performance.
B200 (Blackwell) carries a launch premium on both purchase and cloud rates, but for the largest workloads it is where the frontier is heading as availability scales.
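The cost-per-run point can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical job that needs 1,000 GPU-hours on an A100, an A100 rate of $2.00 per hour and an H100 rate of $4.00 per hour (both inside the ranges quoted above), and a 3x speedup, the low end of the 3 to 5x figure. All of these inputs are placeholders to replace with your own measurements.

```python
def cost_per_run(baseline_gpu_hours, speedup, rate_per_gpu_hour):
    """Cost of one job that needs `baseline_gpu_hours` on the reference GPU
    when run on a GPU that is `speedup` times faster."""
    return baseline_gpu_hours / speedup * rate_per_gpu_hour


def break_even_rate(reference_rate, speedup):
    """Highest hourly rate at which the faster GPU still matches the
    reference GPU's cost per run."""
    return reference_rate * speedup


A100_GPU_HOURS = 1_000  # hypothetical job size, measured on the A100

a100_cost = cost_per_run(A100_GPU_HOURS, speedup=1.0, rate_per_gpu_hour=2.00)
h100_cost = cost_per_run(A100_GPU_HOURS, speedup=3.0, rate_per_gpu_hour=4.00)

if __name__ == "__main__":
    print(f"A100 run: ${a100_cost:,.0f}")
    print(f"H100 run: ${h100_cost:,.0f}")
    print(f"H100 break-even rate: ${break_even_rate(2.00, 3.0):.2f}/h")
```

Under these assumptions the H100 run costs about $1,333 against $2,000 on the A100, even at twice the hourly rate, and the H100 stays cheaper per run until its rate reaches $6.00 per hour. A smaller real-world speedup moves that break-even down proportionally.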
Buy vs Rent: The Decision Framework
The buy-versus-rent question comes down to utilization and time horizon, not the hourly rate in isolation.
Rent when demand is variable, bursty, or experimental. Cloud GPUs avoid a six-figure capital outlay and let you scale up and down. Spot capacity suits fault-tolerant training and batch inference.
Buy when utilization is high and sustained. For steady, near-continuous workloads over multiple years, on-premise ownership is often the most cost-effective once you account for the full multi-year total cost of ownership.
Model the full TCO either way. On-premise must include power, cooling, networking, and staff; cloud must include egress and idle waste. The same discipline that governs cloud spend applies here, as we cover in our FinOps for AI token and GPU costs and cloud cost optimization guides.
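The framework above can be sketched as a break-even utilization calculation: how busy does a rented GPU have to be before owning becomes cheaper? Every input below is an assumption for illustration: a $35,000 purchase price (mid-range of the figures in this guide), 700W draw, a PUE of 1.4, $0.10 per kWh, $3,000 per GPU per year for networking and staff, and a $2.50 rental rate. The model also assumes the owned GPU is powered around the clock and ignores cloud egress, idle waste, and resale value, so treat it as a starting point rather than a full TCO.

```python
HOURS_PER_YEAR = 8760


def owned_cost(years, purchase_price, power_kw, pue, usd_per_kwh, overhead_per_year):
    """Total cost of owning one GPU for `years`, assuming it is powered 24/7."""
    energy_cost = power_kw * pue * HOURS_PER_YEAR * years * usd_per_kwh
    return purchase_price + energy_cost + overhead_per_year * years


def rented_cost(years, rate_per_gpu_hour, utilization):
    """Total rental cost when the GPU is only paid for while in use."""
    return rate_per_gpu_hour * HOURS_PER_YEAR * years * utilization


def break_even_utilization(years, rate_per_gpu_hour, **owned_kwargs):
    """Utilization above which owning is cheaper than renting.
    A value above 1.0 means renting wins even at 100% utilization."""
    full_time_rental = rate_per_gpu_hour * HOURS_PER_YEAR * years
    return owned_cost(years, **owned_kwargs) / full_time_rental


# Placeholder inputs; replace with your own quotes and facility costs.
ASSUMPTIONS = dict(
    purchase_price=35_000,
    power_kw=0.7,
    pue=1.4,
    usd_per_kwh=0.10,
    overhead_per_year=3_000,
)

if __name__ == "__main__":
    u = break_even_utilization(years=3, rate_per_gpu_hour=2.50, **ASSUMPTIONS)
    print(f"3-year break-even utilization: {u:.0%}")
```

With these placeholder inputs the break-even lands at about 71% utilization over three years. Below that, renting is cheaper; a lower rental rate or a shorter horizon pushes the threshold higher.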
Where H100 Pricing Is Heading
After a long period of scarcity and premiums, H100 rental rates have settled near multi-year lows, which makes 2026 a favorable time to rent rather than buy. As B200 and newer Blackwell parts become widely available, expect modest further softening on H100 rates, perhaps 10 to 20%, and small bulk-purchase discounts on the cards themselves. The practical implication is that locking into a large multi-year H100 purchase today carries more depreciation risk than it did a year ago, while flexible rental keeps your options open as the generation turns over.
How to Control GPU Costs
Shop beyond the hyperscalers. Neo-clouds and GPU marketplaces are routinely 50 to 75% cheaper for the same H100, so compare widely before committing.
Match the GPU to the workload. Use H200 for memory-bound inference, A100 where throughput needs are modest, and reserve B200 for genuinely frontier-scale jobs.
Use spot for interruption-tolerant work. Fault-tolerant training and batch inference can run on preemptible capacity at a fraction of on-demand rates.
Measure cost per outcome. Track cost per training run or per million inferences, not just per GPU-hour, and attribute GPU spend to teams and projects, as covered in our cloud cost allocation guide.
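Two of the levers above reduce to simple formulas. The sketch below is a minimal illustration: it adjusts a spot rate for work lost to preemptions and converts an hourly rate into cost per million inferences. The function names, the 15% rework fraction, and the 50 inferences per second throughput are hypothetical; measure both on your own workload.

```python
def effective_spot_rate(spot_rate, rework_fraction):
    """Hourly rate per *useful* GPU-hour when `rework_fraction` of paid
    hours is lost to preemptions (restarts, recomputation since the last
    checkpoint)."""
    if not 0.0 <= rework_fraction < 1.0:
        raise ValueError("rework_fraction must be in [0, 1)")
    return spot_rate / (1.0 - rework_fraction)


def cost_per_million_inferences(rate_per_gpu_hour, inferences_per_second):
    """Cost of one million inferences on a single GPU at steady throughput."""
    inferences_per_hour = inferences_per_second * 3600
    return rate_per_gpu_hour / inferences_per_hour * 1_000_000


if __name__ == "__main__":
    # Spot at $1.03/h with a hypothetical 15% of hours lost to preemption.
    spot = effective_spot_rate(1.03, rework_fraction=0.15)
    print(f"Effective spot rate: ${spot:.2f} per useful GPU-hour")

    # Hypothetical throughput of 50 inferences/s on a $2.00/h GPU.
    per_million = cost_per_million_inferences(2.00, inferences_per_second=50)
    print(f"Cost per million inferences: ${per_million:.2f}")
```

Comparing the effective spot rate with the on-demand rate shows how much preemption overhead a workload can absorb before spot stops paying off.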
Conclusion
Nvidia H100 pricing in 2026 comes down to two numbers: $30,000 to $40,000 to own, or roughly $1 to $7.50 or more an hour to rent, with the rental market split sharply between cheap neo-clouds and expensive hyperscalers. The H100 is still the cost-effective default for large-scale training, but the H200 frequently wins on memory-bound inference, and the B200 is where the largest frontier workloads are heading. With rates near multi-year lows and a new generation arriving, renting is the lower-risk choice for most teams, while sustained high-utilization workloads can still justify buying. Compare providers aggressively, match each GPU to its workload, and measure cost per outcome. If you want help attributing and optimizing GPU and cloud spend, that is exactly the discipline Opslyft brings.