22:50
2026-06-27
github.com
large-language-models
Show HN: KV-psi, using Linux PSI to trim an LLM KV cache
A developer released KV-psi, a reference implementation that uses Linux Pressure Stall Information (PSI) to trim an LLM KV cache under memory pressure. Benchmarks on an NVIDIA Jetson showed PSI-based …