grep -l @shard /news/*.json | wc -l → 1

Shard

mentions 1 · type Organization

// recent coverage 1 mention

19:14
2026-06-19
github.com
large-language-models

Pipeline-parallel LLM inference across GPUs on separate machines

A 744-billion-parameter GLM-5.2 model was served at ~30 tokens per second across six prosumer Blackwell GPUs in six US states over a wide-area network using pipeline parallelism and speculative decodi…

// co-occurs with top 6 entities