Profile

mentions: 1 | type: Organization

// recent coverage (1 mention)

2026-06-19 04:23 | github.com | ai-infrastructure

Profile (v2.1.4): physics-aware optimizer for vLLM (31→470 tok/s on A100)

Profile v2.1.4, a physics-aware optimizer for vLLM inference servers, achieved a 15x throughput increase from 31 to 470 tok/s and a 93% cost reduction on an NVIDIA A100 GPU. The tool uses roofline mat…
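The headline figures can be cross-checked with simple arithmetic. The sketch below is not part of the tool; the function names are hypothetical, and the cost calculation assumes a fixed GPU price per hour, so that cost per token scales inversely with throughput. Under that assumption, the 31 and 470 tok/s figures are consistent with the quoted 15x speedup and 93% cost reduction.

```python
def speedup(before_tps: float, after_tps: float) -> float:
    """Throughput ratio between the optimized and baseline runs."""
    return after_tps / before_tps


def cost_reduction(before_tps: float, after_tps: float) -> float:
    """Fractional drop in cost per token.

    Assumption: the GPU costs the same per hour in both runs, so
    cost per token is proportional to 1 / throughput.
    """
    return 1.0 - before_tps / after_tps


# Figures taken from the summary above (tok/s on one A100).
s = speedup(31, 470)
r = cost_reduction(31, 470)

print(f"speedup: {s:.1f}x")          # speedup: 15.2x
print(f"cost reduction: {r:.1%}")    # cost reduction: 93.4%
```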

// co-occurs with top 3 entities

vLLM (1) | NVIDIA A100 (1) | Qwen3.6-27B (1)