Slopsome.com - a free VRAM fit-calculator + real tokens/sec database for local LLMs

Slopsome.com launched as a free VRAM fit-calculator and database of measured tokens-per-second figures for local LLMs. It lets users check whether a model will run on a specific GPU at a given quantization and context length, and how fast. The tool covers multi-GPU setups and offers measured throughput, side-by-side comparisons, and an open read-only API.

Hey all, I built slopsome.com (http://slopsome.com) to answer the question I kept re-deriving by hand: will model X run on GPU Y at quant Q with a Z-token context, and how fast?

It’s a search engine for LLM + GPU stats: a VRAM fit-calculator (fits in VRAM / with offload / multi-GPU / won’t fit, plus estimated tok/s), real measured throughput, and side-by-side comparisons of open-weight and API models (params, quant sizes, min VRAM, benchmarks, cost). Built for the GGUF / llama.cpp / Ollama / vLLM crowd. Free, no signup, sourced data (no invented numbers). There’s also an open read-only API and a small HF Space demo.

Feedback very welcome: wrong numbers, missing models/GPUs, features you’d want.

Try the demo Space, "Will It Fit?", a Hugging Face Space by NexAIGuy: https://huggingface.co/spaces/NexAIGuy/slopsome-will-it-fit

Great! Can you add multiple-GPU capability? E.g. I have 2x 5060 Ti GPUs.
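For anyone curious what a fit check of this kind roughly involves, here is a minimal Python sketch. It is not slopsome.com's actual method, which the post does not describe. It assumes that memory is dominated by quantized weights plus an fp16 KV cache, that each GPU loses a fixed 1 GiB to runtime overhead, and that layers split cleanly across cards. Every name, model shape, and threshold below is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

GIB = 1024 ** 3


@dataclass
class ModelSpec:
    """Hypothetical description of a model; not a slopsome.com data structure."""
    params_b: float         # parameter count, in billions
    bits_per_weight: float  # effective bits per weight for the chosen quant
    n_layers: int
    n_kv_heads: int
    head_dim: int


def weights_bytes(m: ModelSpec) -> float:
    # Quantized weight size: parameters * bits per weight / 8 bits per byte.
    return m.params_b * 1e9 * m.bits_per_weight / 8


def kv_cache_bytes(m: ModelSpec, context_tokens: int, bytes_per_elem: int = 2) -> int:
    # One K and one V vector per layer, per KV head, per token (fp16 by default).
    return 2 * m.n_layers * m.n_kv_heads * m.head_dim * context_tokens * bytes_per_elem


def classify_fit(
    m: ModelSpec,
    context_tokens: int,
    gpu_vram_gib: Sequence[float],
    system_ram_gib: float = 0.0,
    overhead_gib: float = 1.0,  # assumed per-GPU runtime overhead
) -> str:
    needed = (weights_bytes(m) + kv_cache_bytes(m, context_tokens)) / GIB
    usable = [max(v - overhead_gib, 0.0) for v in gpu_vram_gib]
    if usable and needed <= max(usable):
        return "fits in VRAM"
    if needed <= sum(usable):
        return "multi-GPU"
    if needed <= sum(usable) + system_ram_gib:
        return "with offload"
    return "won't fit"


# Illustrative model shapes at ~4.5 bits per weight and an 8192-token context.
small = ModelSpec(params_b=8, bits_per_weight=4.5, n_layers=32, n_kv_heads=8, head_dim=128)
medium = ModelSpec(params_b=32, bits_per_weight=4.5, n_layers=64, n_kv_heads=8, head_dim=128)
large = ModelSpec(params_b=70, bits_per_weight=4.5, n_layers=80, n_kv_heads=8, head_dim=128)

print(classify_fit(small, 8192, [16]))                          # fits in VRAM
print(classify_fit(medium, 8192, [16, 16]))                     # multi-GPU
print(classify_fit(large, 8192, [16, 16], system_ram_gib=64))   # with offload
print(classify_fit(large, 8192, [16, 16]))                      # won't fit
```

The two-card case uses a pair of generic 16 GiB GPUs. Real runtimes add activation buffers, fragmentation, and uneven layer splits, so a calculator backed by measured data should be trusted over this arithmetic.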