grep -l @a10g /news/*.json | wc -l → 1

@A10G

mentions 1 type Organization feed RSS
17:26
2026-06-02
kyrieblunders.bearblog.dev
machine-learning

I made a kernel 2.2x faster. It made my training loop 3x slower

A developer wrote a fused decode-attention kernel that ran 2.2× faster than the baseline in microbenchmarks, but when integrated into a HuggingFace `generate` call for an RL training loop, the decode …

// co-occurs with top 6 entities