DeepSeek breaks China’s AI price war with peak-hour surge pricing

DeepSeek, the Chinese AI startup that ignited a price war by undercutting rivals, will double API prices for its V4 models during peak hours in Beijing, marking its first time-based surcharge. The move aims to manage GPU demand and ensure service stability, even as the company maintains off-peak rates and remains cheaper than Western competitors like OpenAI and Anthropic.

DeepSeek lit China’s AI price war by making tokens absurdly cheap. Now it is doing something no rival has dared: charging more when demand runs high.

The Chinese startup has told API customers it will double the price of its V4 models during busy hours, according to the South China Morning Post (https://www.scmp.com/tech/big-tech/article/3358868/after-triggering-price-war-deepseek-reverses-course-surcharge-peak-hour-api-use), which saw the email. The surcharge covers two windows, 9am to noon and 2pm to 6pm Beijing time. Outside those hours, prices hold. It is the first time DeepSeek has charged by the clock.

The move surprises because it cuts against everything DeepSeek has done. The company spent the past year undercutting everyone, and it recently made a 75 per cent V4 discount permanent (https://thenextweb.com/news/deepseek-v4-pro-75-percent-price-cut-permanent). Peak pricing pulls the other way. DeepSeek says the goal is “better distribution of resources” and steadier service, not profit. Either way, the cheapest name in AI just got dearer at the busiest time of day.

DeepSeek shook the industry in early 2025. A cheap, capable model wiped hundreds of billions of dollars off US tech stocks in a single session. Its strategy since has stayed simple: undercut everyone, release the V4 models (https://thenextweb.com/news/deepseek-v4-pro-flash-launch-open-source) as open source, and win on price. Rivals such as Alibaba, Zhipu, and MiniMax were dragged into the same fight.

What actually changes

The sums are small in absolute terms, but the direction matters. For the flagship deepseek-v4-pro, output tokens rise from 6 yuan to 12 yuan per million during peak hours, roughly $0.85 to $1.70. Input costs double as well. The lighter deepseek-v4-flash follows the same pattern, from 2 yuan to 4 yuan per million output tokens. Off-peak, nothing moves.
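
For readers who want to see the arithmetic, here is a minimal Python sketch of the reported schedule. It is an illustration only, not DeepSeek’s billing code or API: the names PEAK_WINDOWS, is_peak and output_cost_yuan are made up for this example, and it assumes the surcharge applies every day of the week and that each window ends exactly on the hour, neither of which the reports spell out. Only the output-token prices quoted above are used.

```python
from datetime import datetime, time, timedelta, timezone

# Beijing time is UTC+8 all year round (no daylight saving).
BEIJING = timezone(timedelta(hours=8))

# Peak windows as reported: 9am to noon and 2pm to 6pm Beijing time.
# Assumption: the end of each window is exclusive.
PEAK_WINDOWS = [(time(9, 0), time(12, 0)), (time(14, 0), time(18, 0))]

# Off-peak output prices in yuan per million tokens, as quoted in the article.
OFF_PEAK_OUTPUT_YUAN = {"deepseek-v4-pro": 6.0, "deepseek-v4-flash": 2.0}

# Prices double during peak hours.
PEAK_MULTIPLIER = 2.0


def is_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware moment falls inside a peak window."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(BEIJING).time()
    return any(start <= local < end for start, end in PEAK_WINDOWS)


def output_cost_yuan(model: str, output_tokens: int, moment: datetime) -> float:
    """Estimate the output-token cost in yuan for a request made at `moment`."""
    rate = OFF_PEAK_OUTPUT_YUAN[model]
    if is_peak(moment):
        rate *= PEAK_MULTIPLIER
    return rate * output_tokens / 1_000_000


# One million output tokens on the flagship model, inside and outside a window.
ten_am = datetime(2026, 7, 20, 10, 0, tzinfo=BEIJING)
one_pm = datetime(2026, 7, 20, 13, 0, tzinfo=BEIJING)
print(output_cost_yuan("deepseek-v4-pro", 1_000_000, ten_am))  # 12.0
print(output_cost_yuan("deepseek-v4-pro", 1_000_000, one_pm))  # 6.0
```

The same job costs twice as much at 10am as it does over the lunch gap at 1pm, which is exactly the nudge the surcharge is meant to give.
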
The new rates start when the full version of V4 goes live, which DeepSeek has said is due in mid-July, per Chinese tech media (https://pandaily.com/deepseek-v4-official-july-peak-pricing-jun2026). So this is less a one-off tweak than the pricing model for DeepSeek’s next flagship.

Even doubled, DeepSeek stays cheap by Western standards. OpenAI and Anthropic charge many times more per token. The point is not that DeepSeek has turned expensive. It is that the company built on endlessly falling prices has just put a floor, and a peak, under them.

Why the cheapest player blinked

Surge pricing is a demand problem in disguise. When everyone hits the API at once, DeepSeek runs short of the GPUs it needs to serve them. Charging more at peak nudges some traffic into quieter hours. It is the logic Uber uses, applied to AI tokens.

That points to a wider truth about this boom: serving AI is expensive, and getting more so. Renting chips keeps climbing, with Amazon recently raising GPU prices (https://thenextweb.com/news/aws-gpu-prices-increase-memory-shortage) as a memory shortage bites. Buyers have also learned that cheap headline token rates do not mean cheap bills, because heavy use and long outputs add up fast. The industry has watched token prices fall while spending rises (https://thenextweb.com/news/token-prices-fell-98-enterprise-ai-bills-tripled-now-the-industry-wants-a-standards-body-to-explain-why).

DeepSeek is not alone in rethinking how it charges. Anthropic recently shifted some customers to per-token pricing (https://thenextweb.com/news/amazon-anthropic-token-pricing-openai-alternative), a change that pushed Amazon to hunt for cheaper options. The era of flat, ever-falling AI prices is starting to bend.

DeepSeek has tried to ease the strain with engineering, too. Days before the surcharge news, it showed off DSpark, a speculative-decoding system it says speeds up responses by as much as 85 per cent while leaning less on top-end chips. Faster serving frees up capacity.
Peak pricing rations what is left.

The price war may be cooling

For a year, Chinese labs raced each other to the bottom, and DeepSeek set the pace. Its cheap, open models forced rivals to follow. A surcharge, even a modest one, is the first sign that the race has limits.

Developers noticed. The change stirred debate in Chinese tech circles, where some builders lean on DeepSeek precisely because it is predictable and cheap. Time-based pricing makes costs harder to plan, and it hands rivals an opening to promise flat rates instead.

This is a small crack, not a reversal. DeepSeek is not abandoning cheap AI, and off-peak users will barely feel it. But the message to the market is plain. Even the company that made AI look almost free has to pay for the chips underneath it. When the bill comes due, someone covers it, and more and more that someone is the user who wants an answer at 10am.