03:47
2026-06-30
github.com
large-language-models
Fastllm: An LLM inference library that runs DeepSeek-V4 with 10GB VRAM
Fastllm, a C++ LLM inference library, now supports running DeepSeek-V4 and the full DeepSeek R1 671B model on a single GPU with just 10GB VRAM. The library is compatible with Nvidia, AMD, and domestic…