May 31, 2026 is shaping up to be a landmark day in the AI API market. Two developments are converging:
The message is clear: the AI API price war is no longer simmering — it's boiling over.
Back on May 22, DeepSeek dropped a bombshell: V4-Pro API pricing would permanently lock in at roughly one-quarter of its original price. The 75% discount that was supposed to expire on May 31? It's now the permanent rate.
Here's what the new pricing looks like:
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window |
|---|---|---|---|
| DeepSeek V4-Pro | $0.435 | $0.87 | 128K |
| DeepSeek V3 | $0.14 | $0.28 | 64K |
| Gemini 3.5 Flash | $1.50 | $9.00 | 1M |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K |
| GPT-4o | $2.50 | $10.00 | 128K |
Pricing accurate as of May 2026. Sources: official API docs and third-party aggregators.
DeepSeek's V4-Pro output price of $0.87/M tokens is roughly 11x cheaper than GPT-4o's ($10.00/M) and nearly 6x cheaper than Claude Haiku 4.5's ($5.00/M). For developers building AI agents, chatbots, or automated workflows that generate thousands of tokens per request, the savings compound fast.
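To make the comparison concrete, here is a minimal sketch that turns the per-million-token prices from the table above into a per-request and monthly estimate. The workload (2,000 input tokens and 1,000 output tokens per request, 100,000 requests a month) is an illustrative assumption, not a measurement, and `PRICES` and `request_cost` are hypothetical names chosen for this example.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "DeepSeek V4-Pro": (0.435, 0.87),
    "DeepSeek V3": (0.14, 0.28),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-4o": (2.50, 10.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 2,000 input + 1,000 output tokens, 100,000 requests/month.
    for model in PRICES:
        monthly = request_cost(model, 2_000, 1_000) * 100_000
        print(f"{model:<18} ${monthly:>9,.2f} per month")
```

Under those assumptions, DeepSeek V4-Pro works out to about $174 a month against about $1,500 for GPT-4o. The gap is a little narrower than the output-only ratio suggests because input tokens are billed too, and the input price difference is smaller.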
This isn't just another "we're reducing prices" announcement. Three things make DeepSeek's move different:
Not to be outdone, Google used I/O 2026 to unveil Gemini 3.5 Flash, and the numbers are impressive:
Google is positioning Flash as the high-volume workhorse: fast enough for real-time applications, cheap enough to run at scale, and multimodal (text, vision, video, audio all supported natively).
The trade-off? At $9.00/M output, it's still 10x more expensive than DeepSeek V4-Pro for pure text workloads. If your app doesn't need multimodal capabilities, the cost difference is hard to ignore.
This isn't random. Three structural forces are driving prices down across the board:
Techniques like speculative decoding, quantization, and kernel fusion are squeezing more tokens per GPU-second. DeepSeek's own V4-Pro architecture is reportedly several times more inference-efficient than V3.
The market has gone from "OpenAI and everyone else" to a legitimate free-for-all:
HN threads, Reddit discussions, and Twitter debates show that API pricing is a top-3 concern for AI builders. Providers who ignore pricing lose developer mindshare fast.
Here's the practical takeaway for anyone building AI-powered applications:
If you're cost-sensitive (most of us are):
Start with DeepSeek V4-Pro. At $0.87/M output tokens, you can serve thousands of users before API costs become a concern. The OpenAI-compatible API means you can swap providers with minimal code changes.
If you need multimodal (vision, audio, video):
Gemini 3.5 Flash is the obvious choice — native multimodal support with a 1M context window at competitive pricing. No other model in this price range handles images and video natively.
If you're in a regulated industry (GDPR, HIPAA):
Consider Claude via AWS Bedrock or Azure's managed offerings. The compliance overhead is worth the premium.
The hybrid approach (recommended):
Use DeepSeek V4-Pro as your default, with fallback to Gemini Flash for multimodal tasks. This gives you the best of both worlds: cheap text, powerful vision — and no single-provider lock-in.
```python
import openai


def route_request(prompt: str, needs_vision: bool = False):
    """Send multimodal requests to Gemini and everything else to DeepSeek."""
    if needs_vision:
        # Gemini's OpenAI-compatible endpoint lives under the /openai/ path.
        client = openai.OpenAI(
            base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
            api_key="YOUR_GEMINI_KEY",
        )
        model = "gemini-3.5-flash"
    else:
        client = openai.OpenAI(
            base_url="https://api.deepseek.com/v1",
            api_key="YOUR_DEEPSEEK_KEY",
        )
        model = "deepseek-v4-pro"

    # Both providers accept the same chat-completions call.
    return client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
```
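Because both providers sit behind the same OpenAI-style call, the same idea can be stretched to cover outages: try one provider, and move to the next if it fails. The sketch below is one possible way to do that and is not part of the router above. `call_with_fallback` and the stand-in provider functions are hypothetical names, and catching every `Exception` is a simplification that production code would likely narrow to timeout and rate-limit errors.

```python
from typing import Callable, Sequence


def call_with_fallback(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # simplification: narrow this in real code
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


if __name__ == "__main__":
    # Stand-in callables; in practice each would wrap client.chat.completions.create.
    def flaky(prompt: str) -> str:
        raise TimeoutError("simulated outage")

    def healthy(prompt: str) -> str:
        return f"echo: {prompt}"

    print(call_with_fallback([("primary", flaky), ("backup", healthy)], "hello"))
```

Keeping the providers as plain callables means the failover logic can be tested without network access, and adding a third provider is a one-line change to the list.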
Here's the uncomfortable truth behind all these price cuts: cheap API access doesn't matter if you can't get access at all.
DeepSeek's official API still requires a Chinese phone number for registration. Google's API is geo-restricted in several regions. And most international developers can't pay with regional payment methods.
That's exactly the problem AiCredits was built to solve.
We provide OpenAI-compatible access to DeepSeek V4-Pro with:
Need stable DeepSeek API access? Try AiCredits: OpenAI-compatible, no Chinese phone number, PayPal accepted. Plans start at $3 for 5M tokens.
Originally published on AiCredits Blog.