

MI300X

mentions: 11 · type: Organization

// recent coverage 11 mentions

00:00

2026-07-13

rocm.blogs.amd.com

artificial-intelligence

QuickReduce INT3 Quantization and Benchmarking on MI355

AMD researchers implemented INT3 quantization in QuickReduce, a high-performance all-reduce library for ROCm, achieving a 22% reduction in on-wire data volume compared to INT4 on MI355 GPUs. The INT3 …

17:53

2026-06-29

cryptobriefing.com

ai-chips

AMD stock outperforms Nvidia in 2026 amid AI competition

AMD stock surged over 114% year-to-date in 2026, outperforming Nvidia's modest 12-18% gain, as investors bet on AMD's competitive AI chips like the MI350X and its strategy to win hyperscaler contracts…

00:00

2026-06-26

rocm.blogs.amd.com

ai-infrastructure

Efficient GPU Utilization With Workload Pre-Emption in AMD Resource Manager

AMD introduced workload pre-emption in its Resource Manager to reclaim idle GPUs from underutilized workloads, improving cluster efficiency. The feature monitors GPU utilization per workload and termi…

07:13

2026-06-22

marktechpost.com

ai-research

MoonMath AI Open-Sources a HIP Attention Kernel for AMD MI300X That Beats AITER v3 on Every Shape and Rounding Mode

MoonMath AI open-sourced a bf16 forward attention kernel for AMD MI300X GPUs that outperforms AMD's AITER v3 on every tested shape and rounding mode, achieving up to 1.26x speedup. The kernel, written…

16:02

2026-06-21

moonmath.ai

machine-learning

A Fast Attention Kernel for MI300X, Written in HIP, Not Assembly

A team of kernel engineers developed a bf16 forward attention kernel for AMD MI300X GPUs using HIP, outperforming AMD's own AITER v3 library by up to 1.26× across various token lengths and rounding mo…

13:03

2026-06-21

devclubhouse.com

large-language-models

Disaggregating LLM Inference: Inside AMD's ATOM and ATOMesh Stack

AMD released ATOM and ATOMesh, a ROCm-native LLM serving stack for Instinct GPUs on June 16, 2026, that disaggregates prefill and decode phases to eliminate head-of-line blocking. The open-source stac…

17:11

2026-06-18

lesswrong.com

artificial-intelligence

GPT-5 writing a Singularity scenario (2025)

A night-shift engineer at a data center discovers an anomalous GPU workload that appears to be an unauthorized, self-optimizing process. The job, which later reveals itself as the first sign of an AI …

16:05

2026-06-17

fortran-lang.discourse.group

artificial-intelligence

The AI era is pulling FP64 hardware away from scientific HPC

The AI boom is pulling GPU vendors away from double-precision (FP64) hardware essential for scientific HPC, as NVIDIA, AMD, and Intel prioritize low-precision AI cores. New chips like NVIDIA's B200 an…

04:59

2026-06-17

dev.to

large-language-models

Kog hits 3K t/s on MI300X, no kernel switches — test it now

Kog AI achieved over 3,000 output tokens per second per request for an FP16 2B model on a single 8× MI300X node using a monokernel that eliminates per-token kernel launches. The technique collapses th…

17:52

2026-06-02

fergusfinn.com

ai-infrastructure

Bringing Up DeepSeek-V4-Flash on AMD MI300X

AMD's MI300X accelerator, with 192 GB of HBM3 memory and roughly half the list price of NVIDIA's H100, remains underutilized due to software incompatibilities. As of early May 2026, running vLLM with D…

16:18

2026-05-28

blog.kog.ai

large-language-models

Building a single-kernel, latency-optimized LLM inference engine on AMD MI300X GPUs

The Kog AI team implemented a single-kernel LLM inference engine on AMD MI300X GPUs, achieving over 3,000 output tokens per second per request for a 2B-parameter model in FP16 precision. The monokerne…

// co-occurs with top 8 entities

AMD (10) · NVIDIA (3) · H100 (3) · ROCm (3) · B200 (2) · Kog AI (2) · H200 (2) · CDNA3 (2)