10:37
2026-06-18
dev.to
large-language-models
Qwen3.6-35B NVFP4 runs on one H100 - A100 owners are out
NVIDIA released Qwen3.6-35B-A3B-NVFP4, a post-training FP4-quantized variant of Alibaba's 35B MoE model that fits on a single H100 by reducing VRAM from ~71 GB to ~23 GB. The quantization targets weig…