Alibaba's Qwen team has released Qwen3.7-Plus, a multimodal agent model that combines visual perception, GUI operation, and coding in a single agent loop. In a demo, an agent built on the model autonomously developed a vocabulary learning app, producing more than 10,000 lines of code across 1,000 agent calls in eleven hours. The model leads in on-screen understanding in Qwen's own benchmarks, but its overall performance is mixed. Qwen3.7-Plus is a proprietary offering with no open weights, priced well below Western frontier models.
The article "Qwen3.7-Plus is Alibaba's bid to turn multimodal AI into a full-blown autonomous agent" appeared first on The Decoder.