grep -l @gpu /news/*.json | wc -l → 87

GPU

mentions: 87 · type: Organization · page: 2/5

// recent coverage 87 mentions

22:50

2026-06-26

dev.to

artificial-intelligence

Why AI Clusters Fail Even When GPUs Are Idle

AI clusters often underperform despite powerful GPUs because the GPUs are idle due to bottlenecks in data loading, CPU preprocessing, network communication, or storage contention. A developer explains…

21:34

2026-06-26

news.ycombinator.com

ai-infrastructure

Ask HN: Can distributed data centers in individual households provide UBI?

A Hacker News user proposes that AI companies install GPU clusters in individual households and pay residents hundreds or thousands of dollars monthly, framing the idea as a potential source of univer…

00:18

2026-06-26

extropic.ai

ai-infrastructure

Thermodynamic Computing from Zero to One

Extropic unveiled thermodynamic computing hardware and algorithms that run generative AI workloads using radically less energy than GPUs. The company released its `thrml` library and plans to build a …

23:03

2026-06-25

devclubhouse.com

generative-ai

Physics as Code: Inside Un-0's Oscillator-Based Image Generation

Unconventional AI released Un-0, an image generator that uses simulated coupled oscillators instead of neural network layers, achieving an FID of 6.74 on ImageNet 64x64. The model validates that physi…

17:31

2026-06-25

pub.towardsai.net

artificial-intelligence

200x Faster RedTensor Engine: Red Alice Benchmarking #1

Red Alice AI released the first official benchmark of its Version 2 architecture, reporting a 200x performance gain in the RedTensor engine. The upgrade introduces a PyTorch-backed TorchTensor backend…

16:50

2026-06-25

cryptobriefing.com

ai-infrastructure

Micron Technology posts record $41B revenue as AI memory demand rewrites the semiconductor playbook

Micron Technology reported record revenue of $41.46 billion for the quarter ending June 24, 2026, a 346% year-over-year surge driven by AI demand for high-bandwidth memory, which is sold out through 2…

16:31

2026-06-25

pub.towardsai.net

large-language-models

Prefill/Decode Disaggregation: Why Your GPU Can’t Do Two Things at Once

Prefill/decode disaggregation separates the two phases of LLM inference—prefill (compute-bound) and decode (memory-bound)—onto different GPUs to avoid the performance compromise of running both on the…

16:00

2026-06-25

newsroom.arm.com

artificial-intelligence

From host node to heterogeneous rack: Rethinking the AI CPU

AI infrastructure is entering a new phase focused on rack-scale system composition for agentic AI workflows, where CPUs play critical orchestration roles alongside accelerators. The shift from single-…

07:35

2026-06-25

pub.towardsai.net

machine-learning

Deep Learning Inference: PyTorch, ONNX, and TensorRT Explained

A developer built a custom Inference Optimization Engine on an NVIDIA RTX 4050 GPU to analyze how PyTorch, ONNX, and TensorRT interact with hardware, revealing that model deployment and optimization c…

10:00

2026-06-24

mercurynews.com

artificial-intelligence

Opinion: America’s lead in AI is now at the mercy of local zoning boards

A Gallup poll shows that over 70% of Americans oppose AI data centers near their homes, threatening the U.S. lead in AI over China. Local zoning battles and legitimate concerns over utility costs, water use…

14:35

2026-06-23

fortran-lang.discourse.group

developer-tools

AI Coding Assistants vs. Codee — Insights on Fortran Correctness and Modernization

A developer refactored Fortran code for GPU acceleration by replacing inline PPM slope computations with pure subroutines callable from do concurrent loops, improving code clarity and enabling better …

15:00

2026-06-22

hiraditya.github.io

large-language-models

Crossing the Boundary: Custom Kernels and the C++/Python ABI in vLLM

vLLM, a large-model inference serving framework, uses Python for control flow but pushes arithmetic into compiled C++ and CUDA kernels to avoid interpreter overhead. The Python/C++ boundary crossing i…

03:28

2026-06-22

dev.to

developer-tools

Benchmarking Rust, Go, and TypeScript: 50 TOPS NPU or RTX 5060?

A developer benchmarked compile performance of Rust, Go, and TypeScript on a medium-sized project, finding Rust's cold build takes 3-5 minutes with high CPU usage, Go's build completes in under 10 sec…

19:04

2026-06-21

devclubhouse.com

ai-chips

TPU vs GPU: The Architecture and Software Trade-offs

Google's TPU uses a systolic array architecture optimized for tensor algebra, offering higher throughput and energy efficiency than GPUs for dense matrix operations, but requires XLA compilation and i…

16:44

2026-06-21

dev.to

artificial-intelligence

TPUs vs GPUs: How Google's Tensor Processing Units Actually Work

Google's Tensor Processing Units (TPUs) are specialized chips designed for neural network matrix multiplications, differing fundamentally from GPUs. Unlike GPUs, which evolved from graphics rendering,…

18:06

2026-06-20

dev.to

machine-learning

Neural Networks with PyTorch and Lightning AI Part 5: Final Results and GPU Acceleration

A developer using PyTorch and Lightning AI demonstrated automated neural network training with GPU acceleration. By setting the trainer's accelerator and devices to 'auto', Lightning automatically det…

14:39

2026-06-19

letsdatascience.com

large-language-models

DigitalOcean Demonstrates LLM Compression with SparseGPT

DigitalOcean published a tutorial on June 19 demonstrating how to compress large language models using SparseGPT and Wanda pruning methods for GPU cloud deployment, targeting reduced inference costs a…

00:18

2026-06-19

dev.to

ai-agents

How I Run a 50-Agent AI Workforce on a Single 6GB GPU

A developer describes running ~50 local AI agents on a single 6GB GPU by using a lock-based queue, an eviction monitor, a resource governor, and a model router. The system serializes GPU access so onl…

00:00

2026-06-19

fergusfinn.com

ai-infrastructure

InfiniBand, RoCE, and all that

InfiniBand, a high-performance interconnect technology designed for Remote Direct Memory Access (RDMA), has become critical for AI training and inference workloads that require direct data movement be…

14:34

2026-06-18

cryptobriefing.com

ai-chips

Micron reports earnings on June 24, expects record revenue growth driven by AI memory demand

Micron Technology is set to report fiscal Q3 2026 earnings on June 24, projecting record revenue of $33.5B and a non-GAAP gross margin of 81%, more than double from a year ago, driven by surging deman…

// co-occurs with top 8 entities

PyTorch (11) · LLM (9) · CPU (8) · CUDA (7) · vLLM (6) · NVIDIA (6) · HBM (6) · TPU (5)