ChatGPT Expands Voice Input to 70+ Languages

OpenAI expanded ChatGPT's voice input to support over 70 languages with automatic language detection on Android and iOS, and released three new developer models—GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper—for real-time translation and transcription. The updates lower barriers for multilingual voice interactions in consumer and enterprise applications.

Reporting by NokiaPowerUser on June 17, 2026 states that the ChatGPT app's microphone input now supports more than 70 languages, permits mixing languages within the same sentence, and includes automatic language detection on Android and iOS. Separate coverage from Reuters on May 7, 2026 and from industry outlets documents OpenAI's developer-facing GPT-Realtime family: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. Reuters reports that GPT-Realtime-Translate can translate speech from more than 70 input languages into 13 output languages, while GPT-Realtime-Whisper provides streaming transcription for live captions and notes. Tech outlets including Digital Trends and TechRadar describe GPT-Realtime-2 as enabling stronger live reasoning and longer context windows in voice interactions. Together, the app update and the Realtime models expand multilingual voice capabilities for end users and developers.

What happened

Reporting by NokiaPowerUser on June 17, 2026 states that the ChatGPT app's voice input was updated to support more than 70 languages, to accept freely mixed languages in the same utterance, and to perform automatic language detection on Android and iOS. Reuters reported on May 7, 2026 that OpenAI published three audio-focused models for developers (GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper) and that GPT-Realtime-Translate supports translation from more than 70 input languages into 13 output languages. Reuters also documented developer availability and pricing details for the Realtime models.

Technical details

Reporting by TechRadar and Digital Trends describes GPT-Realtime-2 as a voice model with what those outlets characterize as GPT-5-class reasoning and a larger context window, reported at 128K tokens by Digital Trends.
Digital Trends and Reuters report that GPT-Realtime-Translate handles live translation across 70+ input languages into a subset of output languages and that GPT-Realtime-Whisper provides streaming speech-to-text for live captions and meeting notes. Tech outlets note features for developers such as parallel tool calls and audible preambles that indicate processing steps, and Reuters lists example pricing tiers for the different Realtime endpoints.

Industry context

Editorial analysis: Multilingual, real-time voice capabilities lower a practical barrier for both consumer-facing assistants and developer-built voice apps. Industry reporting frames the combination of an updated consumer app (ChatGPT) and developer-facing realtime models as complementary: app updates improve immediate end-user accessibility, while the Realtime API enables third-party services (customer support, travel, transcription) to integrate low-latency translation and transcription.

Implications for practitioners

Editorial analysis: Engineers building voice interfaces should reassess language-coverage requirements, streaming pipeline architecture, and latency SLAs. Real-time translation and streaming transcription increase demand for robust audio preprocessing, continuous decoding, and hybrid on-device/cloud orchestration to meet responsiveness expectations cited in developer demos and press coverage. Evaluation workflows will need to include multilingual prompts and code-switching scenarios because outlets highlight mixed-language handling as a headline capability.

What to watch

- Adoption metrics from developers and early enterprise customers mentioned in Reuters (for example, Zillow and Priceline were cited as testers).
- Accuracy and latency benchmarks on conversational, code-switched audio versus established baselines; industry outlets reference BigBench Audio and other benchmarks in early coverage.
- App rollout notes and privacy documentation for microphone and language-detection features in the ChatGPT mobile app, since NokiaPowerUser reports the feature as a recent client-side update.

Editorial analysis: For teams evaluating vendor models or integrating voice features, the current reporting suggests prioritizing tests that reflect real-world multilingual traffic and measuring user experience around language detection and seamless code-switching. Observers should track official OpenAI developer docs and app release notes for precise API behavior, quotas, and privacy controls.

Scoring Rationale

The combination of a consumer app update (70+ language voice input) and developer Realtime models materially expands practical multilingual voice capabilities. That matters to engineers building voice interfaces and to researchers evaluating audio models, but it is an incremental capability extension rather than a frontier-model paradigm shift.
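
As an illustration of the evaluation advice under Implications for practitioners, the following is a minimal sketch of one way a team might compare transcription quality on monolingual versus code-switched utterances. It computes word error rate (WER) per test bucket from reference and hypothesis transcripts. Nothing here comes from the cited reporting or from any OpenAI API: the function names, the bucket labels, and the sample transcripts are all hypothetical, and the hypothesis strings stand in for whatever a transcription system under test would return.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_bucket(samples):
    """Aggregate WER per bucket.

    samples: iterable of (bucket, reference, hypothesis) strings.
    Returns {bucket: total_word_edits / total_reference_words}.
    """
    edits = defaultdict(int)
    words = defaultdict(int)
    for bucket, reference, hypothesis in samples:
        ref_tokens = reference.lower().split()
        hyp_tokens = hypothesis.lower().split()
        edits[bucket] += edit_distance(ref_tokens, hyp_tokens)
        words[bucket] += len(ref_tokens)
    return {b: edits[b] / words[b] for b in words if words[b] > 0}


# Hypothetical test set: (bucket, human reference, system output).
SAMPLES = [
    ("monolingual", "book a table for two", "book a table for two"),
    ("code-switched", "quiero reservar a table for two",
     "quiero reservar the table for two"),
]

if __name__ == "__main__":
    for bucket, wer in sorted(wer_by_bucket(SAMPLES).items()):
        print(f"{bucket}: WER = {wer:.3f}")
```

Reporting the two buckets separately keeps a regression on mixed-language audio from being averaged away by strong monolingual results. Whitespace tokenization is a simplifying assumption; languages written without spaces between words would need character-level or segmenter-based scoring instead.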