DGX Spark hitting 83 C under sustained Ollama load — solved by clock-locking via nvidia-smi -lgc

A developer created a daemon to reduce GPU temperatures on the NVIDIA DGX Spark by clock-locking via nvidia-smi -lgc. The daemon samples temperature every 30 seconds and adjusts clock ceilings, dropping sustained temperatures from 83°C to 72°C under heavy Ollama workloads. The solution addresses the lack of user-exposed power-limit or fan-curve controls on the GB10 GPU.

TL;DR: GB10 in the DGX Spark has no user-exposed power-limit or fan-curve control (nvidia-smi returns N/A for both; they are firmware-managed). But nvidia-smi --lock-gpu-clocks DOES work. I wrote a tiny daemon that samples temp every 30s and steps the clock ceiling down 150 MHz whenever it enters the warning band, then relaxes it back up after 3 consecutive cool samples. Ollama gpt-oss:120b + qwen2.5:72b workload: dropped from 83 °C → 72 °C, sustained, same util.

My DGX Spark serving Ollama (~40 GB VRAM across three model instances, sustained 94% util) sits at 82–84 °C indefinitely. No thermal-throttle events yet, but that's uncomfortably close to the SW-slowdown threshold. Standard cooling knobs are absent:

    $ nvidia-smi --query-gpu=power.limit,power.max_limit,power.min_limit,fan.speed --format=csv,noheader
    N/A, N/A, N/A, N/A

Everything is firmware-managed. nvidia-smi --help still lists -lgc / --lock-gpu-clocks though, and it works: GB10 accepts arbitrary integer MHz values within silicon range even though --query-supported-clocks=graphics returns N/A:

    $ sudo nvidia-smi -lgc 1500,2000 -i 0
    GPU clocks set to "(gpuClkMin 1500, gpuClkMax 2000)" for GPU 0000000F:01:00.0
    All done.

Three-band hysteresis, one actuator. Pseudocode:

    every 30s:
        read temp.gpu
        if temp >= 78 C:
            step down 150 MHz, bounded by floor
        elif temp <= 72 C and cool_streak >= 3:
            step up 150 MHz, bounded by ceil
        else:
            hold
        cool_streak = cool_streak + 1 if temp <= 72 else 0

Setpoints: floor 1800 MHz, ceil 3000 MHz (GB10 max is ~3003). At sustained 83 °C it walks the ceiling down in 150 MHz steps every 30 seconds until temp leaves the hot band, then holds. When load drops it relaxes back to the ceiling on a 3-sample cool streak so a brief dip doesn't clock the whole GPU down for the next hour.
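For readers who want to experiment with the control loop described above, here is a minimal Python sketch of a single controller step. It is not the daemon's actual code (that is written in Go); the names `State` and `step` are hypothetical, and two details are assumptions the pseudocode leaves open: the streak is updated before the step-up check, so the third consecutive cool sample triggers the step, and the streak is reset after each step up.

```python
from dataclasses import dataclass
from typing import Tuple

HOT_C = 78        # at or above this temperature: step the ceiling down
COOL_C = 72       # at or below this temperature: sample counts as "cool"
STEP_MHZ = 150    # size of each ceiling adjustment
FLOOR_MHZ = 1800  # lowest ceiling the controller will set
CEIL_MHZ = 3000   # highest ceiling the controller will set
COOL_SAMPLES = 3  # consecutive cool samples needed before stepping up


@dataclass
class State:
    ceiling_mhz: int = CEIL_MHZ
    cool_streak: int = 0


def step(state: State, temp_c: int) -> Tuple[State, str]:
    """Advance the controller by one sample; return the new state and action."""
    # Assumption: count the current sample before deciding.
    streak = state.cool_streak + 1 if temp_c <= COOL_C else 0

    if temp_c >= HOT_C:
        # Hot band: lower the ceiling, but never below the floor.
        return State(max(FLOOR_MHZ, state.ceiling_mhz - STEP_MHZ), 0), "STEP DOWN"

    if streak >= COOL_SAMPLES:
        # Cool band held long enough: raise the ceiling, capped at CEIL_MHZ.
        # Assumption: reset the streak so the next step up needs fresh samples.
        return State(min(CEIL_MHZ, state.ceiling_mhz + STEP_MHZ), 0), "STEP UP"

    # Middle band, or cool streak still too short: leave the ceiling alone.
    return State(state.ceiling_mhz, streak), "HOLD"
```

A real loop would read the temperature every 30 seconds (for example with `nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits`), call `step`, and apply the new ceiling with `sudo nvidia-smi -lgc <min>,<ceiling> -i 0` whenever the action is not HOLD. Keeping `step` free of I/O makes the hysteresis easy to test without a GPU.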
Same Ollama workload throughout, no config changes to the models or the server:

    time      temp  clock (MHz)  util  action
    07:46:28  82 C  2463         94%   STEP DOWN
    07:47:28  83 C  2463         94%   STEP DOWN
    07:47:58  83 C  2463         94%   STEP DOWN
    07:56:29  76 C  1976         95%   HOLD
    07:57:29  77 C  1976         96%   HOLD
    08:13:44  72 C  2093         94%   HOLD (cool streak 1)
    08:14:14  72 C  2093         94%   HOLD (cool streak 2)

−11 °C sustained. No throttle events across the window. Latency impact is real but bounded: the floor cap of 1800 MHz vs stock 2463 MHz ≈ 27% worst-case clock reduction, and in practice the daemon rides much higher than that.

sudo nvidia-smi -lgc needs passwordless sudo for the daemon user. I scope it in /etc/sudoers.d/ to only -lgc and -rgc.

Wrote it up as a licensed install at https://thermal.zctechnologies.org (Go daemon, systemd unit, sudoers scoped, per-node monthly). Comment or DM if you'd rather just have the shell recipe; the algorithm above is the whole thing and I'm happy to answer questions about setpoints or the ExecStopPost=nvidia-smi -rgc teardown so a graceful stop returns your GPU to stock clocks.
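For anyone scripting the lock and teardown themselves, the reset-on-exit idea can also be sketched in Python. This is a hedged illustration, not the packaged daemon: `locked_clocks`, `run_checked`, and the injectable `run` parameter are hypothetical names, and the only nvidia-smi flags used are the `-lgc`, `-rgc`, and `-i` options mentioned in this post.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


@contextmanager
def locked_clocks(min_mhz: int, max_mhz: int, gpu: int = 0,
                  run: Runner = run_checked) -> Iterator[None]:
    """Lock the GPU clock range on entry and reset it on exit."""
    run(["sudo", "nvidia-smi", "-lgc", f"{min_mhz},{max_mhz}", "-i", str(gpu)])
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions such as KeyboardInterrupt,
        # so the GPU goes back to stock clocks when the block ends.
        run(["sudo", "nvidia-smi", "-rgc", "-i", str(gpu)])
```

Wrapping the control loop in `with locked_clocks(1800, 3000): ...` would cover Ctrl-C and crashes inside the loop. It does not cover SIGKILL, and Python's default SIGTERM handling exits without running `finally` blocks, which is one reason to keep an ExecStopPost reset in the systemd unit as well.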