Microsoft Says Latest AI Models Beat Claude, Google's Nano Banana

In brief

Microsoft said its new MAI-Thinking-1 model outperformed Anthropic’s Claude Sonnet 4.6 in blind evaluations and matched Claude Opus 4.6 on a leading coding benchmark.
The company said its MAI-Image-2.5 models surpassed Google’s Nano Banana 2 on image-editing leaderboards.
The launch marks Microsoft’s most ambitious effort yet to develop proprietary frontier AI models alongside its partnership with OpenAI.

On the first day of the annual Microsoft Build event on Tuesday, the Windows developer unveiled seven new AI models, claiming they outperformed Anthropic’s Claude Sonnet 4.6 and Google’s Nano Banana 2 in blind testing and image-editing benchmarks.

The claim comes as Microsoft attempts to establish itself as a frontier AI developer rather than solely OpenAI’s largest backer and infrastructure provider.

“Super excited to announce seven new world-class MAI models today,” Microsoft AI CEO Mustafa Suleyman wrote on X. “They represent what we consider a new era in AI designed to keep you in control and on the frontier.”

At the center of the release is MAI-Thinking-1, a reasoning model that Microsoft describes as its flagship text foundation model.

Seven new models launching at Build: let’s go!
Reasoning. Code. Image. Transcribe. Voice.

Built from scratch on a clean data lineage, designed for efficiency, working seamlessly as a family of models

Thread 🧵 #MSBuild pic.twitter.com/g3WQIcIQ24

— Microsoft AI (@MicrosoftAI) June 2, 2026

According to Suleyman, MAI-Thinking-1 was preferred over Anthropic’s Claude Sonnet 4.6 in blind tests conducted by independent evaluators. He added that the model scored 97% on AIME 2025, a benchmark that measures advanced problem-solving and reasoning skills.

Suleyman said the SWE Bench Pro result places the model “right alongside Opus 4.6 on one of the toughest coding benchmarks.”

The company also introduced MAI-Code-1-Flash, a lightweight coding model built for GitHub Copilot and Visual Studio Code; MAI-Image-2.5 and its Flash variant, which Microsoft says outperform Google’s Nano Banana Pro on image-editing tasks; MAI Transcribe-1.5, a transcription model that supports 43 languages; and MAI-Voice-2, a speech-generation model capable of producing natural-sounding voices in 15 languages and adapting to a speaker from a short audio sample.

“This is an extraordinary time in technology. The compute used to train frontier models has increased by a factor of one trillion,” Suleyman said in a separate blog post announcing the new models. “Now we expect another thousand-fold increase over the next three years, which in turn means more advanced capabilities, and the continued rollout of ever more effective AI.”

The announcement comes as competition among leading AI developers continues to intensify.

Last week, Anthropic announced the launch of its latest flagship model, Opus 4.8, which the company said is faster and smarter on benchmark tests and comes with a suite of new features. On Tuesday, Anthropic announced an expansion of its Project Glasswing, giving 150 companies access to its new cybersecurity-focused Mythos model.

Meanwhile, at Google I/O in May, Google unveiled Gemini Omni, a multimodal AI model that combines Gemini with the company’s Veo, Nano Banana, and Genie media-generation models, alongside Gemini Spark, a cloud-based AI agent designed to manage tasks across apps and workflows on a user’s behalf.

Microsoft’s new model launch suggests a broader effort to build proprietary AI systems as it expands beyond its longstanding reliance on OpenAI technology, saying that MAI “delivered the highest win rate, outperforming GPT-5.5 on quality, while being 10x lower on cost.”

“Developers and businesses have been crying out for AI that delivers on their terms and under their say,” Suleyman wrote. “We see this as a major step towards delivering that.”

Daily Debrief Newsletter

Start every day with the top news stories right now, plus original features, a podcast, videos and more.

Source link

Microsoft Says Latest AI Models Beat Claude, Google’s Nano Banana

In brief

Daily Debrief Newsletter

Be the first to comment

Leave a Reply Cancel reply

XRP Falling Channel Resolution Maps Path Toward $3.18