# Show HN: KV-psi, using Linux PSI to trim an LLM KV cache

> Source: <https://github.com/infiniteregrets/kv-psi>
> Published: 2026-06-27 22:50:54+00:00

PSI KV Governor is a small reference implementation for using Linux Pressure Stall Information to trim an LLM KV cache when the system is under memory pressure.

Requirements:

- Linux with PSI enabled: cgroup `memory.pressure` or `/proc/pressure/memory`
- Python 3.10+
- llama.cpp build dependencies for the runner
- a GGUF model, for example `models/SmolLM2-135M-Instruct-Q2_K.gguf`

Check PSI:

```
cat /proc/pressure/memory
PYTHONPATH=src python benchmarks/pressure_bench.py --preflight-only
```
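For reference, `/proc/pressure/memory` contains one `some` line and one `full` line, each with `avg10`, `avg60`, `avg300`, and `total` fields. The following is a minimal parsing sketch, not the project's own reader: the names `parse_psi` and `read_memory_psi` are hypothetical, and the sample numbers are made up for illustration.

```python
from pathlib import Path


def parse_psi(text: str) -> dict[str, dict[str, float]]:
    """Parse PSI text into {"some": {"avg10": ..., ...}, "full": {...}}."""
    result: dict[str, dict[str, float]] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        # First token is the kind ("some" or "full"); the rest are key=value.
        kind, fields = parts[0], parts[1:]
        result[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return result


def read_memory_psi(path: str = "/proc/pressure/memory") -> dict[str, dict[str, float]]:
    """Read and parse a PSI file; pass a cgroup memory.pressure path if preferred."""
    return parse_psi(Path(path).read_text())


# Illustrative sample in the kernel's format (values are invented).
sample = (
    "some avg10=1.61 avg60=0.40 avg300=0.10 total=123456\n"
    "full avg10=1.61 avg60=0.35 avg300=0.08 total=100000\n"
)
psi = parse_psi(sample)
print(psi["some"]["avg10"], psi["full"]["total"])
```

The `avg*` fields are percentages of time stalled over the last 10, 60, and 300 seconds, and `total` is cumulative stall time in microseconds.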

Run the reference simulator:

```
PYTHONPATH=src python -m psi_kv_governor.cli simulate
```

Build the llama.cpp runner:

```
scripts/build_llama_runner.sh
```

Download the small benchmark model if needed:

```
python scripts/download_demo_model.py
```

Run both variant orders. This matters because PSI `avg10`, cache, and zram/swap state can carry over from the first pressure run into the second.

```
PYTHONPATH=src python benchmarks/pressure_bench.py \
  -c 2048 \
  -n 1536 \
  --keep 64 \
  --tail 256 \
  --min-prune 64 \
  --pressure-mib 6000 \
  --pressure-step-mib 1024 \
  --pressure-warmup-s 10 \
  --variant-cooldown-s 45 \
  --out-dir data/bench-pressure/fixed-first

PYTHONPATH=src python benchmarks/pressure_bench.py \
  --variant-order psi-first \
  -c 2048 \
  -n 1536 \
  --keep 64 \
  --tail 256 \
  --min-prune 64 \
  --pressure-mib 6000 \
  --pressure-step-mib 1024 \
  --pressure-warmup-s 10 \
  --variant-cooldown-s 45 \
  --out-dir data/bench-pressure/psi-first
```
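This README does not spell out how `--keep`, `--tail`, and `--min-prune` interact. One plausible reading, sketched below purely as an assumption, is that the first `keep` positions and the most recent `tail` positions are protected, and a prune drops the oldest positions in between, but only if at least `min_prune` positions can be dropped. The function `plan_prune` and its `want` parameter are hypothetical; the actual policy in the repository may differ.

```python
from typing import Optional


def plan_prune(
    n_tokens: int, keep: int, tail: int, min_prune: int, want: int
) -> Optional[tuple[int, int]]:
    """Return the half-open position range [start, end) to drop, or None.

    Assumed policy (not confirmed by the source): protect the first `keep`
    and last `tail` positions, drop the oldest positions in between, and
    skip any prune smaller than `min_prune`.
    """
    prunable = n_tokens - keep - tail
    if prunable < min_prune:
        return None  # not enough unprotected positions to be worth pruning
    # Drop at least min_prune, at most everything that is unprotected.
    count = min(max(want, min_prune), prunable)
    return keep, keep + count


# Large cache, small request: drops positions 64..127.
print(plan_prune(1547, keep=64, tail=256, min_prune=64, want=64))
# Cache smaller than keep + tail: nothing to prune.
print(plan_prune(300, keep=64, tail=256, min_prune=64, want=64))
# Request larger than the unprotected region: clamped to what is available.
print(plan_prune(400, keep=64, tail=256, min_prune=64, want=500))
```

Under this reading, a governor would call something like `plan_prune` whenever PSI crosses its threshold and then remove the returned range from the llama.cpp KV cache.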

Recent Jetson result:

| run | variant | decoded | tok/s | prunes | final KV | external PSI some/full |
|---|---|---|---|---|---|---|
| fixed-first | fixed | 1536 | 94.00 | 0 | 1547 | 1.61/1.61 |
| fixed-first | PSI | 1536 | 88.80 | 4 | 1291 | 4.14/3.94 |
| psi-first | PSI | 1536 | 96.16 | 2 | 1004 | 2.46/2.33 |
| psi-first | fixed | 1536 | 89.76 | 0 | 1547 | 5.56/5.56 |
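
The table is easier to read with a little arithmetic. The sketch below uses only the numbers above and computes the slowdown of the second variant within each run, the order-averaged throughput per variant, and the KV entries saved by the PSI variant. These are single runs, so treat the figures as indicative only.

```python
# (run, variant, tok/s, final KV), in execution order within each run.
rows = [
    ("fixed-first", "fixed", 94.00, 1547),
    ("fixed-first", "PSI", 88.80, 1291),
    ("psi-first", "PSI", 96.16, 1004),
    ("psi-first", "fixed", 89.76, 1547),
]


def mean_tps(variant: str) -> float:
    """Average tok/s for one variant across both run orders."""
    vals = [tps for _, v, tps, _ in rows if v == variant]
    return sum(vals) / len(vals)


fixed_mean = mean_tps("fixed")
psi_mean = mean_tps("PSI")

# Slowdown of whichever variant ran second, relative to the one that ran first.
second_run_slowdown = {}
for run in ("fixed-first", "psi-first"):
    first, second = [tps for r, _, tps, _ in rows if r == run]
    second_run_slowdown[run] = 1 - second / first

# KV entries saved by the PSI variant relative to the fixed variant in each run.
kv_saved = {}
for run in ("fixed-first", "psi-first"):
    kv = {v: final for r, v, _, final in rows if r == run}
    kv_saved[run] = kv["fixed"] - kv["PSI"]

print(f"fixed mean {fixed_mean:.2f} tok/s, PSI mean {psi_mean:.2f} tok/s")
for run, s in second_run_slowdown.items():
    print(f"{run}: second variant {s:.1%} slower, PSI saved {kv_saved[run]} KV entries")
```

In both orders the variant that ran second is slower (about 5.5% and 6.7%), which is consistent with the carry-over concern above. Averaged over both orders the two variants are close (91.88 tok/s fixed, 92.48 tok/s PSI), while the PSI variant ends with 256 and 543 fewer KV entries.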

Result directories:

`data/bench-pressure/real-psi-6000m-1536tok-cooldown`

`data/bench-pressure/real-psi-6000m-1536tok-cooldown-psi-first`
