Problem Statement
Down large AI models (10–100 GB) from Hugging Face Hub is:
Time-consuming for users on slower connections Expensive in terms of bandwidth costs for both HF and users Storage-intensive for Hugging Face infrastructure
Current compression options (gzip, zstd) are not optimized for neural network weights (IEEE-754 float tensors).
Proposed Solution
Integrate bounce compression natively into Hugging Face Hub.
Key Benefits
- 25% average compression on model weights (
.safetensors
, .pt
, .gguf
)
- 1069 MB/s decompression speed — faster than most network connections
- Specialized for ML: byte-shuffle transform optimized for IEEE-754 tensors
- CRC-32 integrity verification built-in
- Zero dependencies: pure Rust, Apache-2.0 license
Benchmark: Safetensors Model Weights (255.5 MB)
| Tool | Compressed Size | Ratio | Decompress Speed | bounce -2 | 218.1 MB | 85.3% | 1069.0 MB/s | | zstd -3 | 235.3 MB | 92.1% | 1121.8 MB/s | | gzip -9 | 235.6 MB | 92.2% | 492.9 MB/s | | brotli -q 5 | 235.1 MB | 92.0% | 212.6 MB/s |
bounce saves 17.2 MB (7% better) than the next best tool while maintaining 5x faster decompression than gzip.
Proposed Integration
CLI
huggingface-cli download model/name --compress bounce
Python SDK
from huggingface_hub import hf_hub_download
path = hf_hub_download(
repo_id="model/name",
filename="model.safetensors",
compression="bounce" # auto-decompress .bnc files
)
ROI for Hugging Face
Storage Savings: 25% reduction across millions of models (1 PB → 250 TB saved) Bandwidth Savings: 25% less egress traffic, significant CDN cost reduction User Experience: Faster downloads worldwide, lower data costs for metered connections
Resources
Open Questions
- Should this be opt-in or automatic for large files?
- Backward compatibility strategy for existing downloads?
- Integration timeline with
huggingface_hub
Python package?
I am happy to collaborate on implementation — bounce is production-ready, well-tested, and designed specifically for this use case.