Speeding Up AI: Bringing Google Colossus to PyTorch via GCSFS and Rapid Bucket

Source: developers.googleblog.com (published 2026-05-20T03:12Z)

Google Cloud announced a performance boost for AI/ML workloads on PyTorch by integrating its Colossus-powered Rapid Storage directly via the fsspec interface and gcsfs. The new Rapid Bucket solution bypasses legacy REST APIs using persistent gRPC bidirectional streams, achieving up to 4.8x faster reads and 2.8x faster writes, resulting in a 23% reduction in total training time for a benchmark workload. Developers can leverage these gains without code changes by simply switching to a Rapid Bucket and updating the `gcsfs` library.

Published May 20, 2026 · 3 min read

Today, we are announcing a major performance boost for AI/ML workloads using the PyTorch ecosystem on Google Cloud. By integrating Rapid Storage, powered by Google’s Colossus storage architecture, directly with PyTorch via the industry-standard fsspec interface, we are enabling researchers and developers to keep their GPUs busier than ever before.

As model sizes grow, data loading and checkpointing often become the primary bottlenecks in training. Preparing data to train models involves fetching and processing terabytes to petabytes of data from remote storage such as object storage. Standard REST-based storage access can struggle to meet the extreme throughput and low-latency requirements of modern distributed training, wasting valuable GPU resources.

Our new Rapid Bucket solution provides high-performance object storage in dedicated zonal buckets. By bypassing legacy REST APIs and using persistent gRPC bidirectional streams, we’ve brought the power of Colossus, the filesystem stateful protocols that power YouTube and Google Search, directly to the PyTorch ecosystem.

fsspec is the pervasive Pythonic interface for file systems in the PyTorch ecosystem, where it is already widely used. Backend implementations of fsspec exist for many different storage systems, and they can all be integrated under a single layer, eliminating the need to write specific code for each backend. By integrating Rapid Storage with gcsfs (the Google Cloud Storage implementation of fsspec), developers can get the speed gains provided by Rapid with a simple fsspec.open() call, with no complex code rewrites required.

To achieve a performance boost with Rapid Buckets, we optimized the entire data path.

First, by placing the bucket in the same zone as the compute (for example, us-central1-a), we eliminate cross-zone latency. Before Rapid buckets, data in a regional bucket and the compute (accelerators) could be in different zones, and accessing the data across zones added latency.

Second, gcsfs keeps the fsspec API while entirely upgrading internal traffic from HTTP to BiDi-gRPC for Rapid buckets.

Third, by adding bucket-type auto-detection to gcsfs, PyTorch and other fsspec clients transparently use Rapid with zero manual configuration.

A dataset of 134M rows totaling around 451GB was loaded onto 16 GKE nodes, each containing eight A4 GPUs. Training was conducted in 100 steps, with a checkpoint after every 25 steps using PyTorch Lightning. We benchmarked total training time, including data load times, and observed a performance gain of 23% using Rapid Bucket compared with a Standard regional bucket.

Microbenchmarking (that is, measuring the performance of a building block such as I/O or resource usage) confirms these gains. Throughput improved by 4.8x for reads (both sequential and random) and 2.8x for writes. These tests used 16MB I/O sizes across 48 processes. You can find more details at GCSFS-performance-benchmarks.

Getting started with GCSFS on Rapid Bucket is easy. Your existing code and scripts remain the same. You just need to change the bucket to a Rapid Bucket to take advantage of the performance boost.

To install (Rapid Bucket integration is available from version 2026.3.0):

    pip install gcsfs

Code sample to read/write from GCS Rapid:

import gcsfs

fs = gcsfs.GCSFileSystem()

# Write a new object to the Rapid Bucket.
with fs.open('my-zonal-rapid-bucket/data/checkpoint.pt', 'wb') as f:
    f.write(b"model data...")

# Append to the same object.
with fs.open('my-zonal-rapid-bucket/data/checkpoint.pt', 'ab') as f:
    f.write(b"appended data...")
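
The same read/write pattern can also go through the generic fsspec.open() call mentioned earlier, which picks the backend from the URL scheme. The following is a minimal sketch, not code from the original article: the helper name write_then_read is hypothetical, and the memory:// URL is an assumption chosen only so the example runs without credentials or network access.

```python
import fsspec


def write_then_read(url: str, payload: bytes) -> bytes:
    """Write payload to url, then read it back, using only fsspec's generic API."""
    # fsspec selects the backend from the URL scheme: "gs://" routes to gcsfs,
    # "memory://" to an in-process filesystem.
    with fsspec.open(url, "wb") as f:
        f.write(payload)
    with fsspec.open(url, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # In-memory backend (assumption for illustration): no GCS access needed.
    print(write_then_read("memory://demo/checkpoint.pt", b"model data..."))
    # Against GCS only the URL would change, for example:
    # write_then_read("gs://my-zonal-rapid-bucket/data/checkpoint.pt", b"model data...")
```

Because the function body never names a backend, pointing it at a Rapid Bucket should only require changing the URL, which is the property the article relies on when it says existing scripts remain the same.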
