
CUDA

Mentions: 42 · Type: Organization · Page 2 of 3
12:51
2026-05-26
klongpy.org
machine-learning

KlongPy: PyTorch Back End and Autograd

KlongPy now supports a PyTorch backend that enables GPU acceleration and automatic differentiation for gradient-based computations. The torch backend outperforms NumPy by up to 8x on large arrays and …

16:17
2026-05-25
blog.mempko.com
artificial-intelligence

The Open/Closed Problem in AI

At the ninth MLSys conference in Seattle, researchers and industry leaders focused overwhelmingly on improving the efficiency of training and deploying large language models, with specialized hardware…

07:00
2026-05-22
leimao.github.io
machine-learning

PyTorch Triton Kernel Transparent Tracing and Compilation

PyTorch has introduced transparent tracing and compilation for Triton kernels, allowing custom operations to be visible to the compiler for optimization. The framework now supports compiling Triton ke…

20:22
2026-05-21
dev.to
artificial-intelligence

How to Fix CUDA Out of Memory Errors in Stable Diffusion WebUI

The "CUDA out of memory" error in Stable Diffusion WebUI is often caused by configuration issues rather than insufficient GPU hardware, particularly due to PyTorch's memory allocator failing to releas…

01:14
2026-05-20
dev.to
large-language-models

Ollama vs llama.cpp vs vLLM: Which Should You Use in 2026?

This article compares three dominant tools for local LLM inference in 2026: Ollama, llama.cpp, and vLLM. Ollama is recommended for personal, non-technical use due to its ease of setup, while llama.cpp…

18:36
2026-05-13
pytorch.org
machine-learning

PyTorch 2.12 Release Blog

PyTorch 2.12 introduces a new device-agnostic `torch.accelerator.Graph` API that unifies graph capture and replay across CUDA, XPU, and out-of-tree backends. The release delivers up to 100x faster bat…

00:00
2026-05-11
loopholelabs.io
ai-infrastructure

Ollama Doesn't Know Its GPU Is on Another Machine

Ollama, an AI model server, ran on a MacBook with no NVIDIA GPU by using GTAP software to intercept CUDA calls and forward them to a remote DGX Spark workstation with a 128 GB Blackwell GPU. The setup…

20:45
2026-03-14
chipsandcheese.com
artificial-intelligence

Analyzing Nvidia GB10's GPU

The article analyzes Nvidia's GB10 processor, highlighting its integrated GPU (iGPU) based on the Blackwell architecture with 48 Streaming Multiprocessors, which is comparable to an RTX 5070 but with …

16:11
2026-03-13
alexselimov.com
artificial-intelligence

Role of AI in my writing

The author confesses that some previous blog posts contained unedited AI-generated segments, which compromised authenticity and hindered the development of their personal writing style. After reflecti…

11:58
2025-11-06
gist.github.com
developer-tools

nvidia-smi cheat sheet

**nvidia-smi** (NVIDIA System Management Interface) is a command-line tool for monitoring, managing, and diagnosing NVIDIA GPU devices, providing data on performance, temperature, utilization, pow…

// co-occurs with top 8 entities