Thank you for the clarification.
The difficulty for us is that revisiting the Realtime prompting guide is unlikely to solve the core issue. We have already spent thousands of hours internally prototyping, testing, and comparing different Realtime/audio-capable model configurations for Romanian production voice-agent use cases.
The stack we currently run, based on `gpt-realtime-mini-2025-10-06`, is not something we selected casually. It is the result of substantial internal R&D, repeated production-like testing, and comparison against other available models. For our specific use case (Romanian AI voice agents that must remain faithful to database-fed business information), this snapshot has been by far the best option.
The issue we are seeing with the replacement model is not something that minor prompt changes can correct. The newer `gpt-realtime-mini` shows worse Romanian/non-English quality and, more importantly, worse faithfulness to supplied business data. In our tests, it has hallucinated nonexistent departments, services, and operational details that were not present in the database/context, while the older snapshot behaved much more reliably.
That reliability is precisely what allowed us to enter the market quickly and attract a large number of voice and call-center clients, both small and large. Although we have not been in the market for long, this OpenAI Realtime-based stack has enabled rapid growth, and it is now central to several major enterprise deployments.
This is why the scheduled shutdown is such a serious concern for us. In our enterprise deployments, model behavior is not a minor implementation detail but the foundation of the product’s trustworthiness. A forced migration to a replacement model with materially worse Romanian/non-English performance could jeopardise major contracts we currently hold, as well as others we are negotiating.
It also affects OpenAI commercially. Based on our current pipeline, we project our OpenAI API usage could reach the ~$50,000/month threshold by Q3 as these deployments scale. Our preference is to continue building and scaling on OpenAI’s Realtime infrastructure, but we need a reliable migration path before moving production traffic away from the validated snapshot.
That is why we were hoping to speak with someone from OpenAI who could look at this from a production/API customer perspective, rather than only as a general prompting issue. Ideally, we would like to understand whether OpenAI can consider one of the following:
delaying the shutdown of `gpt-realtime-mini-2025-10-06` for production users affected by language-specific regressions;
providing temporary extended access while regressions are investigated;
recommending or releasing an alternative Realtime/audio-capable model with equivalent Romanian/non-English faithfulness;
or routing this as a production-impacting model quality regression for Realtime API users.
We can provide side-by-side transcripts and audio recordings comparing `gpt-realtime-mini-2025-10-06` and `gpt-realtime-mini` under comparable flows and configuration.