GLM5.2

mentions 1 type Organization

// recent coverage 1 mention

21:49

2026-07-03

wafer.ai

large-language-models

GLM5.2 on AMD MI355X at 2626 tok/s/node at over 2x lower cost than Blackwell

Wafer served GLM5.2 on AMD MI355X GPUs at 2626 tokens per second per node, at less than half the cost of NVIDIA Blackwell, and reached 213 tok/s on a single stream. The company used MXFP4 quantization via AMD …

// co-occurs with top 7 entities

Wafer 1, AMD 1, NVIDIA 1, MI355X 1, Blackwell 1, TensorWave 1, sglang 1