OpenAI reportedly cut response costs for guest ChatGPT users by more than half


[ARTICLE · art-45304] src=the-decoder.com ↗ pub=2026-06-30T17:43Z topic=large-language-models verified=true sentiment=↑ positive


OpenAI engineers cut inference costs for guest ChatGPT users by more than half, reducing the number of Nvidia GPUs needed to just a few hundred, according to a person familiar with the discussions. The optimization applies only to limited guest features, and it remains unclear whether the gains extend to the full product. The cost reduction could free up resources for scaling services or improving models.

Published Jun 30, 2026 · 1 min read

OpenAI engineers told colleagues earlier this month that they had cut inference costs (the expense of running existing AI models) by more than half. That's according to a person familiar with the discussions, as reported by The Information.

OpenAI applied the new optimizations to ChatGPT, specifically for visitors who don't have an account. The number of Nvidia GPUs needed to serve those users dropped to just a few hundred. It's not clear how many were required before or what techniques OpenAI used to pull it off. Guest users can only access a very limited set of ChatGPT features, so whether these gains would carry over to the full product is an open question.

DeepSeek also recently released a new open-source method that can speed up inference requests by 60 to 85 percent. Resources freed up by efficiency gains like these could go toward scaling services, better models, faster responses, or bigger margins. But because data center buildouts are moving slowly, such gains will probably give labs more breathing room rather than cut into chip demand.


source & further reading

The Information
the-decoder.com (original article)


