11:20
2026-06-21
dev.to
large-language-models
AMD ATOM + ATOMesh: Prefill/Decode Disaggregation on ROCm
AMD shipped ATOM + ATOMesh, a ROCm-native LLM serving stack for Instinct GPUs that implements prefill/decode disaggregation, splitting the two inference phases onto separate GPU pools to optimize for …