04:18
2026-05-26
ianbarber.blog
ai-chips
Elusive order of async GPU kernels: scheduling, abstractions, DSL implications
Nvidia's Hopper and Blackwell GPU architectures introduced spatial scheduling through warp specialization, requiring developers to divide pipeline work between different warp groups for data movement …