GLM 5.2 Fast via Wafer now available on AI Gateway

Vercel's AI Gateway now offers GLM 5.2 Fast via Wafer, delivering 2x higher throughput than other providers in benchmarking tests. The model achieves over 170 tok/s for small contexts and over 200 tok/s for large contexts, with no markup or platform fees.

GLM 5.2 Fast via Wafer is now available on AI Gateway (https://vercel.com/ai-gateway).

Based on our own benchmarking across small-context, large-context, and tool-call scenarios, Wafer delivers 2x higher throughput than other providers serving GLM 5.2 on serverless, and leads on decode and end-to-end speed for sustained generation in both the small- and large-context cases.

In our testing, GLM 5.2 Fast on Wafer measured:

- Small context: 170+ tok/s
- Large context: 200+ tok/s

To use GLM 5.2 Fast, set model to zai/glm-5.2-fast in the AI SDK (https://ai-sdk.dev/).

AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting (https://vercel.com/changelog/custom-reporting-ai-gateway), Zero Data Retention support (https://vercel.com/blog/zdr-on-ai-gateway), budgets for API keys (https://vercel.com/docs/ai-gateway/authentication-and-byok/api-keys), and more.

AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on Bring Your Own Key (BYOK) requests (https://vercel.com/docs/ai-gateway/authentication-and-byok/byok).

Try GLM 5.2 Fast in the model playground (https://vercel.com/ai-gateway/models/glm-5.2-fast).
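
As a starting point, the model setting described above can be sketched in TypeScript. This is a minimal illustration and not an official snippet: the `glmFastRequest` helper and the `TextRequest` type are hypothetical names introduced here, and the commented-out call assumes the `ai` package is installed and that AI Gateway credentials are available in the environment (assumed to be an `AI_GATEWAY_API_KEY` variable). Only the model ID `zai/glm-5.2-fast` comes from the announcement itself.

```typescript
// Model ID taken from the announcement.
const GLM_FAST_MODEL = "zai/glm-5.2-fast";

// Hypothetical request shape: only the two fields this sketch needs.
interface TextRequest {
  model: string;
  prompt: string;
}

// Hypothetical helper that pins the model ID so callers supply only a prompt.
function glmFastRequest(prompt: string): TextRequest {
  return { model: GLM_FAST_MODEL, prompt };
}

// Runs offline: prints the options object that would be handed to the SDK.
console.log(glmFastRequest("Write a haiku about fast inference."));

// With the `ai` package installed and credentials configured (assumptions,
// see the paragraph above), the request would be passed to the AI SDK
// roughly like this:
//
//   import { generateText } from "ai";
//   const { text } = await generateText(glmFastRequest("Hello"));
//   console.log(text);
```

Keeping the model ID in one constant means a later switch to another model on AI Gateway is a one-line change.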