Microsoft's new MAI models

Microsoft announced two new text LLMs, MAI-Thinking-1 and MAI-Code-1-Flash. The company claims its 35-billion-parameter reasoning model, MAI-Thinking-1, was preferred to Sonnet 4.6 in blind human side-by-side evaluations. Microsoft says MAI-Thinking-1 was trained from the ground up on enterprise-grade, commercially licensed data without distillation from third-party models, and that MAI-Code-1-Flash was built using clean and appropriately licensed data, which could mark a shift toward legally clean training data for code-specialist AI.

Microsoft announced two new text LLMs https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/ this morning: MAI-Thinking-1 (reasoning, 35B parameters, available to "select early partners") and MAI-Code-1-Flash.

It's very interesting to see Microsoft releasing models with such low parameter counts, especially given how expensive larger models are to access right now.

They claim MAI-Thinking-1 "is preferred to Sonnet 4.6 in our blind human side-by-side evaluations", which is impressive for a 35B model, seeing as I frequently run models larger than that on my own laptop.

Also of note https://microsoft.ai/news/introducing-mai-thinking-1/ :

"We trained MAI-Thinking-1 from the ground up on enterprise grade, clean and commercially licensed data, without distillation from third-party models."

And for MAI-Code-1-Flash https://microsoft.ai/news/introducingmai-code-1-flash/ as well:

"It is built end-to-end by Microsoft using clean and appropriately licensed data."

I would very much like to learn more about this "appropriately licensed" data. Could these be the first generally useful code-specialist models that didn't train on an unlicensed dump of the web?

Tags: llm-release https://simonwillison.net/tags/llm-release , generative-ai https://simonwillison.net/tags/generative-ai , ai https://simonwillison.net/tags/ai , microsoft https://simonwillison.net/tags/microsoft , llms https://simonwillison.net/tags/llms