Source: testingcatalog.com · Topic: artificial-intelligence

Microsoft readies new MAI voice and image models for Build 2026

Microsoft will preview three new AI models, MAI-Transcribe-1.5, MAI-Image-2.5, and MAI-Voice-2, at its Build conference on June 2 in San Francisco; none is publicly available yet. The models are designed to power Copilot, Teams, and Azure Speech as the company reduces its reliance on OpenAI following April's renegotiation. MAI-Voice-2 expands to 15 languages with broader emotional range, while MAI-Image-2.5 will ship in two variants and accept image uploads for editing.

Published May 30, 2026 · 1 min read

Microsoft heads into its Build conference on June 2 in San Francisco with more in its model pipeline than MAI-Image-2.5, which it has already shown on Arena, where the text-to-image system landed third behind OpenAI’s gpt-image-2 and Google’s Nano Banana 2. That release is lined up for the MAI Playground and Foundry, but three additional models are taking shape within the company’s stack, none of which are publicly available yet.

The first, MAI-Transcribe-1.5, is a modest step up from the speech-to-text model launched in April, which already claimed the lowest word error rate across 25 languages. The image side draws more attention: MAI-Image-2.5 looks set to ship in two variants, a high-quality version and a faster one labeled MAI-Image-2.5e, mirroring the split seen with MAI-Image-2. It would also accept image uploads, opening the model to editing as well as generation, putting it on par with rivals from Google and OpenAI.

The most striking find is MAI-Voice-2, a multilingual successor to the company’s text-to-speech model. While MAI-Voice-1 began in English, the new version covers German, Australian and US English, Spanish, French, Hindi, Indonesian, Italian, Japanese, Korean, Dutch, Portuguese, Turkish, Vietnamese, and Chinese, with a wider emotional range that includes tones such as angry, confused, and embarrassed. Early samples suggest it can whisper, too.

All three would feed Copilot, Teams, and Azure Speech, and fit the developer crowd that Build is made for. The timing matches a broader push, as Mustafa Suleyman’s team weans the company off OpenAI following April’s renegotiation. Reports point to a homegrown coding model for GitHub Copilot at the show, too, while a Copilot “super app” that integrates chat, coding, and agents into a single hub is expected later in the summer.
