DGX Spark hitting 83 C under sustained Ollama load — solved by clock-locking via nvidia-smi -lgc


Source: dev.to · published 2026-07-01T15:38Z · topic: developer-tools

A developer created a daemon to reduce GPU temperatures on the NVIDIA DGX Spark by clock-locking via nvidia-smi -lgc. The daemon samples temperature every 30 seconds and adjusts clock ceilings, dropping sustained temperatures from 83°C to 72°C under heavy Ollama workloads. The solution addresses the lack of user-exposed power-limit or fan-curve controls on the GB10 GPU.

TL;DR: GB10 in the DGX Spark has no user-exposed power-limit or fan-curve control (nvidia-smi returns [N/A] for both; they are firmware-managed). But nvidia-smi --lock-gpu-clocks DOES work. I wrote a tiny daemon that samples temp every 30s and steps the clock ceiling down 150 MHz whenever the temperature is in the hot band (78 °C or above), then relaxes it back up after 3 consecutive cool samples. Ollama gpt-oss:120b + qwen2.5:72b workload: dropped from 83 °C → 72 °C, sustained, same util.

My DGX Spark serving Ollama (~40 GB VRAM across three model instances, sustained 94% util) sits at 82–84 °C indefinitely. No thermal-throttle events yet, but that's uncomfortably close to the SW-slowdown threshold. Standard cooling knobs are absent:

$ nvidia-smi --query-gpu=power.limit,power.max_limit,power.min_limit,fan.speed --format=csv,noheader
[N/A], [N/A], [N/A], [N/A]
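
The same check can be scripted. Below is a minimal Python sketch, assuming nvidia-smi is on PATH and that the CSV output has the one-line shape shown above. The helper names (parse_query_line, query_gpu, missing_knobs) are hypothetical and not part of the original daemon.

```python
import subprocess

# Fields queried in the command above.
FIELDS = ["power.limit", "power.max_limit", "power.min_limit", "fan.speed"]

def parse_query_line(line, fields=FIELDS):
    """Map each queried field to its value, or None where nvidia-smi prints [N/A]."""
    values = [v.strip() for v in line.strip().split(",")]
    return {f: (None if v == "[N/A]" else v) for f, v in zip(fields, values)}

def query_gpu(fields=FIELDS, gpu_index=0):
    """Run nvidia-smi for one GPU and parse the first line of its CSV output."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(fields)}",
         "--format=csv,noheader", "-i", str(gpu_index)],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_query_line(out.splitlines()[0], fields)

def missing_knobs(parsed):
    """Names of the fields the driver reports as [N/A]."""
    return [name for name, value in parsed.items() if value is None]
```

On the Spark output above, missing_knobs(parse_query_line("[N/A], [N/A], [N/A], [N/A]")) returns all four field names, which is the signal that none of the usual cooling knobs is available.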

Everything is firmware-managed. nvidia-smi --help still lists -lgc / --lock-gpu-clocks though, and it works: GB10 accepts arbitrary integer MHz values within silicon range even though --query-supported-clocks=graphics returns [N/A].

$ sudo nvidia-smi -lgc 1500,2000 -i 0
GPU clocks set to "(gpuClkMin 1500, gpuClkMax 2000)" for GPU 0000000F:01:00.0
All done.

Three-band hysteresis, one actuator. Pseudocode:

every 30s:
  read temp.gpu
  if temp >= 78 C:              step_down(150 MHz), bounded by floor
  elif temp <= 72 C and cool_streak >= 3:  step_up(150 MHz), bounded by ceil
  else:                         hold
  cool_streak = cool_streak+1 if temp <= 72 else 0

Setpoints: floor 1800 MHz, ceiling 3000 MHz (GB10 max is ~3003 MHz). At sustained 83 °C it walks the ceiling down in 150 MHz steps every 30 seconds until temp leaves the hot band, then holds. When load drops it relaxes back toward the ceiling in 150 MHz steps once it has a 3-sample cool streak, so a brief dip doesn't clock the whole GPU down for the next hour.
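
The decision logic can be sketched as a pure function in Python. This is an illustration of the pseudocode and setpoints above, not the author's Go daemon. The State class, the step function and the STEP_UP label are assumptions (the log below only shows STEP_DOWN and HOLD). The streak is updated after the decision, exactly as the pseudocode orders it.

```python
from dataclasses import dataclass

HOT_C = 78              # step down at or above this temperature
COOL_C = 72             # samples at or below this count toward the cool streak
STEP_MHZ = 150
FLOOR_MHZ = 1800
CEIL_MHZ = 3000
COOL_STREAK_NEEDED = 3

@dataclass
class State:
    ceiling_mhz: int = CEIL_MHZ
    cool_streak: int = 0

def step(state, temp_c):
    """Apply one 30-second sample and return (new_state, action)."""
    if temp_c >= HOT_C:
        ceiling = max(FLOOR_MHZ, state.ceiling_mhz - STEP_MHZ)   # bounded by floor
        action = "STEP_DOWN"
    elif temp_c <= COOL_C and state.cool_streak >= COOL_STREAK_NEEDED:
        ceiling = min(CEIL_MHZ, state.ceiling_mhz + STEP_MHZ)    # bounded by ceiling
        action = "STEP_UP"   # hypothetical label, not taken from the log
    else:
        ceiling = state.ceiling_mhz
        action = "HOLD"
    # As in the pseudocode, the streak is updated after the decision.
    streak = state.cool_streak + 1 if temp_c <= COOL_C else 0
    return State(ceiling, streak), action

if __name__ == "__main__":
    state = State()
    for temp in [83, 83, 76, 72, 72, 72, 72]:
        state, action = step(state, temp)
        print(temp, state.ceiling_mhz, action)
```

Keeping the decision separate from the nvidia-smi call means the setpoints can be replayed against a recorded temperature trace without touching the GPU.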

Same Ollama workload throughout, no config changes to the models or the server:

time      temp   clock (MHz)  util   action
07:46:28  82 C   2463         94%    STEP_DOWN
07:47:28  83 C   2463         94%    STEP_DOWN
07:47:58  83 C   2463         94%    STEP_DOWN
07:56:29  76 C   1976         95%    HOLD
07:57:29  77 C   1976         96%    HOLD
08:13:44  72 C   2093         94%    HOLD (cool streak 1)
08:14:14  72 C   2093         94%    HOLD (cool streak 2)

−11 °C sustained. No throttle events across the window. Latency impact is real but bounded — the floor cap of 1800 MHz vs stock 2463 MHz ≈ 27% worst-case clock reduction, and in practice the daemon rides much higher than that.
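
The 27% figure is just the ratio of the two clocks. A quick Python check, using the 2463 MHz stock clock from the log and the 1800 MHz floor (the variable names are illustrative):

```python
STOCK_MHZ = 2463   # clock observed at stock under this load, from the log
FLOOR_MHZ = 1800   # daemon floor

reduction = (STOCK_MHZ - FLOOR_MHZ) / STOCK_MHZ
print(f"worst-case clock reduction: {reduction:.1%}")   # prints 26.9%
```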

sudo nvidia-smi -lgc needs passwordless sudo for the daemon user. I scope it in /etc/sudoers.d/ to only -lgc * and -rgc.

Wrote it up as a licensed install at https://thermal.zctechnologies.org (Go daemon, systemd unit, sudoers scoped, per-node monthly). Comment or DM if you'd rather just have the shell recipe; the algorithm above is the whole thing and I'm happy to answer questions about setpoints or the ExecStopPost=nvidia-smi -rgc teardown so a graceful stop returns your GPU to stock clocks.
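
For anyone scripting this themselves, here is a rough Python sketch of the lock and reset calls with a guaranteed teardown. It assumes nvidia-smi is resolvable for the daemon user and that sudo is scoped as described above. The function names and the injectable runner parameter are hypothetical. The reset is issued as a bare -rgc because a sudoers entry that lists exactly -rgc would not match a command carrying extra arguments.

```python
import subprocess

NVIDIA_SMI = "nvidia-smi"   # assumption: resolvable on the daemon user's PATH

def lock_cmd(min_mhz, max_mhz, gpu_index=0):
    """Command line for locking GPU clocks; matches a sudoers rule of the form '-lgc *'."""
    return ["sudo", "-n", NVIDIA_SMI, "-lgc", f"{min_mhz},{max_mhz}", "-i", str(gpu_index)]

def reset_cmd():
    """Command line for restoring stock clocks; kept as a bare '-rgc' to match the scoped rule."""
    return ["sudo", "-n", NVIDIA_SMI, "-rgc"]

def run_with_teardown(body, runner=subprocess.run):
    """Run body(); always reset clocks afterwards, even if body raises."""
    try:
        body()
    finally:
        runner(reset_cmd(), check=True)
```

sudo -n makes the call fail immediately rather than prompt for a password, so a missing or mis-scoped sudoers entry shows up as an error instead of a hung daemon.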

Read original on dev.to → dev.to/deal_estate_715bf4569d373/dgx-spark-hitti…
