{"slug": "ai-api-price-war-deepseek-v4-pro-cuts-75-gemini-3-5-flash-lands", "title": "AI API Price War: DeepSeek V4-Pro Cuts 75% & Gemini 3.5 Flash Lands", "summary": "On May 31, 2026, DeepSeek made its V4-Pro API pricing permanent at a 75% discount, with output costs of $0.87 per million tokens—10x cheaper than GPT-4o and 5x cheaper than Claude Haiku 4.5. Meanwhile, Google launched Gemini 3.5 Flash at I/O 2026, offering multimodal support and a 1M context window at $9.00 per million output tokens, still 10x more expensive than DeepSeek for text-only tasks. The AI API price war is intensifying due to inference optimization, market competition, and developer demand for lower costs.", "body_md": "May 31, 2026 is shaping up to be a landmark day in the AI API market. Two developments are converging:\n\nThe message is clear: the AI API price war is no longer simmering — it's boiling over.\n\nBack on May 22, DeepSeek dropped a bombshell: V4-Pro API pricing would permanently lock in at roughly one-quarter of its original price. The 75% discount that was supposed to expire on May 31? It's now the permanent rate.\n\nHere's what the new pricing looks like:\n\n| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window |\n|---|---|---|---|\nDeepSeek V4-Pro |\n$0.435 | $0.87 | 128K |\nDeepSeek V3 |\n$0.14 | $0.28 | 64K |\nGemini 3.5 Flash |\n$1.50 | $9.00 | 1M |\nClaude Haiku 4.5 |\n$1.00 | $5.00 | 200K |\nGPT-4o |\n$2.50 | $10.00 | 128K |\n\nPricing accurate as of May 2026. Sources: official API docs and third-party aggregators.\n\nDeepSeek's V4-Pro output price of $0.87/M tokens is **10x cheaper than GPT-4o** and **5x cheaper than Claude Haiku 4.5**. For developers building AI agents, chatbots, or automated workflows that generate thousands of tokens per request, the savings compound fast.\n\nThis isn't just another \"we're reducing prices\" announcement. Three things make DeepSeek's move different:\n\nNot to be outdone, Google used I/O 2026 to unveil **Gemini 3.5 Flash**, and the numbers are impressive:\n\nGoogle is positioning Flash as the high-volume workhorse: fast enough for real-time applications, cheap enough to run at scale, and multimodal (text, vision, video, audio all supported natively).\n\nThe trade-off? At $9.00/M output, it's still **10x more expensive than DeepSeek V4-Pro** for pure text workloads. If your app doesn't need multimodal capabilities, the cost difference is hard to ignore.\n\nThis isn't random. Three structural forces are driving prices down across the board:\n\nTechniques like speculative decoding, quantization, and kernel fusion are squeezing more tokens per GPU-second. DeepSeek's own V4-Pro architecture is reportedly several times more inference-efficient than V3.\n\nThe market has gone from \"OpenAI and everyone else\" to a legitimate free-for-all:\n\nHN threads, Reddit discussions, and Twitter debates show that API pricing is a top-3 concern for AI builders. Providers who ignore pricing lose developer mindshare fast.\n\nHere's the practical takeaway for anyone building AI-powered applications:\n\n**If you're cost-sensitive (most of us are):**\n\nStart with DeepSeek V4-Pro. At $0.87/M output tokens, you can serve thousands of users before API costs become a concern. The OpenAI-compatible API means you can swap providers with minimal code changes.\n\n**If you need multimodal (vision, audio, video):**\n\nGemini 3.5 Flash is the obvious choice — native multimodal support with a 1M context window at competitive pricing. No other model in this price range handles images and video natively.\n\n**If you're in a regulated industry (GDPR, HIPAA):**\n\nConsider Claude via AWS Bedrock or Azure's managed offerings. The compliance overhead is worth the premium.\n\n**The hybrid approach (recommended):**\n\nUse DeepSeek V4-Pro as your default, with fallback to Gemini Flash for multimodal tasks. This gives you the best of both worlds: cheap text, powerful vision — and no single-provider lock-in.\n\n``` python\n# Example: Multi-provider routing with cost optimization\nimport openai\n\ndef route_request(prompt: str, needs_vision: bool = False):\n    if needs_vision:\n        client = openai.OpenAI(\n            base_url=\"https://generativelanguage.googleapis.com/v1beta\",\n            api_key=\"YOUR_GEMINI_KEY\"\n        )\n        model = \"gemini-3.5-flash\"\n    else:\n        client = openai.OpenAI(\n            base_url=\"https://api.deepseek.com/v1\",\n            api_key=\"YOUR_DEEPSEEK_KEY\"\n        )\n        model = \"deepseek-v4-pro\"\n\n    return client.chat.completions.create(\n        model=model,\n        messages=[{\"role\": \"user\", \"content\": prompt}]\n    )\n```\n\nHere's the uncomfortable truth behind all these price cuts: **cheap API access doesn't matter if you can't get access at all.**\n\nDeepSeek's official API still requires a Chinese phone number for registration. Google's API is geo-restricted in several regions. And most international developers can't pay with regional payment methods.\n\nThat's exactly the problem AiCredits was built to solve.\n\nWe provide **OpenAI-compatible access to DeepSeek V4-Pro** with:\n\nNeed stable DeepSeek API access?Try[AiCredits]— OpenAI-compatible, no Chinese phone number, PayPal accepted. Plans start at $3 for 5M tokens.\n\n*Originally published on AiCredits Blog.*", "url": "https://wpnews.pro/news/ai-api-price-war-deepseek-v4-pro-cuts-75-gemini-3-5-flash-lands", "canonical_source": "https://dev.to/yanlong_wang/ai-api-price-war-deepseek-v4-pro-cuts-75-gemini-35-flash-lands-2nem", "published_at": "2026-06-22 01:21:39+00:00", "updated_at": "2026-06-22 01:39:43.175873+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-products", "ai-infrastructure", "developer-tools"], "entities": ["DeepSeek", "Google", "Gemini 3.5 Flash", "DeepSeek V4-Pro", "OpenAI", "GPT-4o", "Claude Haiku 4.5", "AWS Bedrock"], "alternates": {"html": "https://wpnews.pro/news/ai-api-price-war-deepseek-v4-pro-cuts-75-gemini-3-5-flash-lands", "markdown": "https://wpnews.pro/news/ai-api-price-war-deepseek-v4-pro-cuts-75-gemini-3-5-flash-lands.md", "text": "https://wpnews.pro/news/ai-api-price-war-deepseek-v4-pro-cuts-75-gemini-3-5-flash-lands.txt", "jsonld": "https://wpnews.pro/news/ai-api-price-war-deepseek-v4-pro-cuts-75-gemini-3-5-flash-lands.jsonld"}}