For everyone who wants to run big AI models at home but can't stomach a $2,000+ GPU rig, the GMKtec EVO-X2 is a genuinely new option. Built on AMD's "Strix Halo" Ryzen AI Max+ 395 with up to 128 GB of unified memory, it's arguably the first sub-$1,500 mini PC that can actually run 70-billion-parameter models locally — no discrete GPU required.
What it is
The EVO-X2 is a small-form-factor PC that pairs a 16-core Zen 5 CPU (boosting to 5.1 GHz), a 40-CU RDNA 3.5 integrated GPU, and a 50-TOPS XDNA 2 NPU with up to 128 GB of LPDDR5X on a 256-bit bus (~256 GB/s). It also has Wi-Fi 7 and USB4. Pricing runs from about $800 (64 GB) to roughly $1,100–$1,500 (128 GB).
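For readers who want to see where the ~256 GB/s figure comes from, here is a minimal sketch of the usual peak-bandwidth arithmetic. The 8000 MT/s transfer rate is an assumption (the article only gives the bus width and the result), and the function name is purely illustrative.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak memory bandwidth in decimal GB/s.

    Each transfer moves bus_width_bits / 8 bytes, and the memory performs
    transfer_rate_mts million transfers per second.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


# 256-bit bus from the spec list; 8000 MT/s is an assumed LPDDR5X speed.
print(peak_bandwidth_gb_s(256, 8000))
```

This is a theoretical ceiling; measured bandwidth on any real system is typically somewhat lower.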
Who should buy it
This is for the local-AI crowd: developers and hobbyists who want to load 35B–70B models (reviewers even ran Qwen3 235B on the 128 GB unit) without renting cloud GPUs or building a multi-card tower. If that's you, the 128 GB EVO-X2 is the value play.
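Whether a given model fits is mostly arithmetic: parameter count times bits per weight. The sketch below is a rough estimate only. The `headroom_gb` value is a hypothetical allowance for the OS, KV cache, and runtime buffers, and the bit widths are illustrative since the article does not say which quantization reviewers used.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal GB."""
    return params_billions * bits_per_weight / 8


def fits_in_memory(params_billions: float, bits_per_weight: float,
                   memory_gb: float, headroom_gb: float = 16.0) -> bool:
    """True if the weights plus a (hypothetical) headroom fit in memory_gb."""
    needed = weight_footprint_gb(params_billions, bits_per_weight) + headroom_gb
    return needed <= memory_gb


# Illustrative combinations checked against the 128 GB configuration.
for params_b, bits in [(70, 16), (70, 8), (70, 4), (235, 4), (235, 3)]:
    size = weight_footprint_gb(params_b, bits)
    verdict = "fits" if fits_in_memory(params_b, bits, 128) else "does not fit"
    print(f"{params_b}B at {bits}-bit: ~{size:.1f} GB of weights, {verdict}")
```

Under these assumptions a 70B model needs quantization to 8 bits or lower to fit, and a 235B model only fits at aggressive quantization levels.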
The honest tradeoff
It's the same story as every unified-memory box, the Mac Studio and DGX Spark included: the huge memory pool lets big models fit, but at ~256 GB/s generation speed is modest (think single-digit to low-double-digit tokens/sec on the largest models, depending on quantization and on how many parameters are active per token). It's an inference and development box, not a raw-throughput monster.
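The link between bandwidth and generation speed can be sketched with a common back-of-envelope bound: each generated token requires reading roughly every active weight once, so tokens/sec cannot exceed bandwidth divided by the bytes read per token. This ignores KV-cache traffic, compute limits, and real-world efficiency, so treat it as a ceiling rather than a prediction. The 4-bit quantization and the 22B active-parameter mixture-of-experts example are assumptions for illustration.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, active_params_billions: float,
                         bits_per_weight: float) -> float:
    """Upper bound on tokens/sec if every active weight is read once per token."""
    gb_read_per_token = active_params_billions * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


BANDWIDTH_GB_S = 256.0  # peak figure quoted in the article

examples = [
    ("35B dense, 4-bit", 35, 4),
    ("70B dense, 4-bit", 70, 4),
    ("MoE with 22B active, 4-bit (assumed)", 22, 4),
]
for label, active_b, bits in examples:
    ceiling = decode_ceiling_tok_s(BANDWIDTH_GB_S, active_b, bits)
    print(f"{label}: at most ~{ceiling:.1f} tok/s")
```

The mixture-of-experts row shows why a very large sparse model can still generate at a usable pace: only the active parameters are read for each token, so the ceiling depends on active size rather than total size.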
How it compares
Versus a DGX Spark you trade CUDA and NVIDIA's stack for a lower price; versus a Mac Studio you give up bandwidth and macOS polish but pay far less for the same memory capacity; versus an RTX 5090 you lose speed but smash through its 32 GB VRAM ceiling. On dollars-per-gigabyte-of-model, the EVO-X2 wins.
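The dollars-per-gigabyte claim is easy to check with the prices quoted in this article. The sketch below uses only the EVO-X2 range and the $4,000 DGX Spark figure mentioned in the owner comparison; the 128 GB capacity for the Spark is an assumption here, and other machines are left out because the article gives no prices for them.

```python
def dollars_per_gb(price_usd: float, memory_gb: float) -> float:
    """Price divided by model-addressable memory capacity."""
    return price_usd / memory_gb


# (price in USD, memory in GB); the Spark's 128 GB is an assumption.
machines = {
    "EVO-X2 128 GB (low end of range)": (1100, 128),
    "EVO-X2 128 GB (high end of range)": (1500, 128),
    "DGX Spark": (4000, 128),
}

for name, (price, mem) in machines.items():
    print(f"{name}: ${dollars_per_gb(price, mem):.2f} per GB")
```

Prices move quickly, so the ratio matters more than the exact figures.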
What owners on Reddit are saying
The EVO-X2 lives mostly in r/LocalLLaMA and r/MiniPCs, where the audience cares about exactly one thing: can it actually run big models? The most useful owner account is u/Eugr’s "Strix Halo vs DGX Spark — Initial Impressions", written by an AI developer who bought both a GMKtec EVO-X2 (128 GB) and NVIDIA’s $4,000 DGX Spark to compare head-to-head:
"Inference-wise, the token generation is nearly identical to Strix Halo… but prompt processing is 2–5x higher [on the Spark]. Strix Halo performance in prompt processing degrades much faster with context." — u/Eugr
The honest read from that thread: for pure token generation the EVO-X2 keeps pace with hardware costing far more, and its weakness is prompt processing on long contexts. Owners are also watching the price. As u/b0tbuilder noted, the 128 GB EVO-X2 saw a ~$200 price jump within weeks of purchase as memory prices climbed, so the "sub-$1,500" framing is increasingly a moving target. The community consensus matches ours: it’s the cheapest sane way to fit a 70B-class model entirely in fast unified memory, as long as you go in knowing that prompt processing, not raw token speed, is the compromise.
The bottom line
If your goal is running large local models affordably, the 128 GB GMKtec EVO-X2 is the standout value in 2025. Just go in knowing it prioritizes capacity over raw speed. Need CUDA or faster prompt processing? Look at the DGX Spark. Want more memory bandwidth? Look at a Mac Studio instead.