768GB Intel Optane DIMMs to run 1T-parameter LLM with single GPU at 4tps

wpnews.pro


Source: tomshardware.com · Published: 2026-05-30T20:17Z · Topic: large-language-models


A Redditor running a workstation with 768GB of used Intel Optane Persistent Memory DIMMs achieved roughly 4 tokens per second on a 1-trillion-parameter LLM (Kimi K2.5) using a single GPU. The six second-hand Optane sticks, acquired cheaply, offer far lower latency than NVMe SSDs but are still two to three times slower than DRAM, which makes local inference of massive models possible at a fraction of the cost of equivalent DRAM capacity. The feat shows that discontinued Optane memory can still handle large-scale AI workloads, though its scarcity keeps the approach exotic and hard to scale.


768GB of cheap Intel Optane DIMM memory sticks used to run 1-trillion-parameter LLM on a system with a single GPU — local Kimi K2.5 install achieved roughly 4 tokens per second

A Redditor has caused a stir by coaxing a workstation that uses Optane PMem DIMMs as RAM into running a 1-trillion-parameter LLM. In a mini tutorial/guide on the Local LLaMA subreddit, APFrisco explains how they bought used Intel Optane Persistent Memory relatively cheaply to “run a 1 trillion parameter model (in this case Kimi K2.5) locally at ~4 tokens/second” on their Xeon workstation.

Central to the headlining feat was the Redditor’s sourcing of six Optane PMem (DCPMM) sticks, a discontinued memory format designed to bridge the DRAM-SSD divide. While the 768GB of Optane (6x 128GB) does offer far lower latency than the best NVMe SSDs, it is still two to three times slower than DRAM. These characteristics are still rather sweet for LLM inference frameworks, and the second-hand price was “much less than what the equivalent DRAM capacity would cost.” But, alas, Optane is dead, so this is an exotic solution.
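To put the 768GB figure in context, here is a rough back-of-envelope sketch of how much memory the weights of a 1-trillion-parameter model would need at different precisions. The bit widths below are illustrative assumptions, since the article does not say which quantization APFrisco used, and the estimate ignores the KV cache, activations, and whatever portion of the model sits in GPU memory.

```python
# Back-of-envelope weight footprint for a 1-trillion-parameter model.
# Assumptions (not from the article): the bit widths tried below, and
# decimal gigabytes (1 GB = 1e9 bytes) for both model size and capacity.

OPTANE_CAPACITY_GB = 6 * 128  # six 128GB modules, as described in the article
PARAMS = 1e12                 # "1-trillion-parameter" model


def weights_footprint_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


def fits(bits_per_weight: float, capacity_gb: float = OPTANE_CAPACITY_GB) -> bool:
    """True if the weights alone would fit in the given memory capacity."""
    return weights_footprint_gb(PARAMS, bits_per_weight) <= capacity_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):  # hypothetical precisions, for illustration only
        size = weights_footprint_gb(PARAMS, bits)
        print(f"{bits:>2}-bit weights: {size:,.0f} GB, "
              f"fits in {OPTANE_CAPACITY_GB} GB: {fits(bits)}")
```

Under these assumptions, only a heavily quantized copy of the weights (around 4 bits per weight, roughly 500 GB) would fit within the 768GB of Optane; 8-bit weights would already need about 1,000 GB.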


Read original on tomshardware.com → www.tomshardware.com/tech-industry/artificial-in…


