Qwen3.7-Plus is Alibaba's bid to turn multimodal AI into a full-blown autonomous agent

Alibaba's Qwen team released Qwen3.7-Plus, a multimodal agent model that integrates visual perception, GUI operation, and coding into a single autonomous loop. In a demonstration, an agent built on the model independently developed a vocabulary learning app, generating more than 10,000 lines of code across 1,000 agent calls in eleven hours. The proprietary model, priced below Western frontier models, leads in screen understanding on Qwen's internal benchmarks but shows mixed overall performance.

Alibaba's Qwen team has released Qwen3.7-Plus, a multimodal agent model that combines visual perception, GUI operation, and coding in a single agent loop. In a demo, an agent built on the model autonomously developed a vocabulary learning app, producing more than 10,000 lines of code across 1,000 agent calls in eleven hours. The model leads in screen understanding on Qwen's own benchmarks, but overall performance is mixed. Qwen3.7-Plus is a proprietary offering with no open weights, priced well below Western frontier models.

The article "Qwen3.7-Plus is Alibaba's bid to turn multimodal AI into a full-blown autonomous agent" (https://the-decoder.com/qwen3-7-plus-is-alibabas-bid-to-turn-multimodal-ai-into-a-full-blown-autonomous-agent/) appeared first on The Decoder (https://the-decoder.com).