{"slug": "erofs-with-linux-7-2-better-handles-large-sparse-ai-datasets-more-efficient-i-o", "title": "EROFS With Linux 7.2 Better Handles Large Sparse AI Datasets, More Efficient I/O", "summary": "EROFS, the open-source read-only file-system, has been updated in Linux 7.2 with optimized chunk mapping for more efficient I/O and added sparse support for pcluster layout to better handle large sparse AI datasets. The FSCACHE back-end has been removed, with similar functionality now provided via file-backed mounts and fanotify pre-content hooks.", "body_md": "# EROFS With Linux 7.2 Better Handles Large Sparse AI Datasets, More Efficient I/O\n\nThe EROFS open-source read-only file-system has some nice enhancements in place for the [Linux 7.2](https://www.phoronix.com/search/Linux+7.2) kernel.\n\nFirst up, EROFS now has optimized mapping of requests for chunk-based inodes. The new chunk mapping code is reported to deliver more efficient I/O, though the erofs_map_chunks() patch does not include any performance numbers to quantify the gain.\n\nThe other big EROFS change is that sparse support has been added to the pcluster layout code. The motivation here is helping large, sparse AI datasets. Alibaba engineer and EROFS maintainer Gao Xiang explained the sparse support for pcluster layout in [the patches](https://lore.kernel.org/all/20260621194414.489939-1-hsiangkao@linux.alibaba.com/):\n\n\"Although zeros can be compressed transparently on EROFS using fixed-size output compression so that it is never prioritized in the Android use cases, indicating entire pclusters as holes is still useful to preserve holes in the sparse datasets; otherwise overlayfs will allocate more space when copying up, and SEEK_HOLE won't report any hole.\n\nThis patch introduces two ways to mark a pcluster as a hole.\"\n\nMeanwhile, EROFS previously marked its FSCACHE back-end as deprecated, and it has been removed with Linux 7.2. EROFS with FSCACHE was originally intended to provide image lazy pulling functionality. But FSCACHE later made NETFS a hard dependency, which led EROFS to deprecate the feature and now remove it. Similar functionality has since been implemented with file-backed mounts and fanotify pre-content hooks.\n\nMore details on these now-merged EROFS file-system updates for Linux 7.2 via", "url": "https://wpnews.pro/news/erofs-with-linux-7-2-better-handles-large-sparse-ai-datasets-more-efficient-i-o", "canonical_source": "https://www.phoronix.com/news/EROFS-Sparse-AI-Datasets", "published_at": "2026-06-23 13:42:22+00:00", "updated_at": "2026-06-24 00:02:02.919335+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-infrastructure"], "entities": ["EROFS", "Linux 7.2", "Alibaba", "Gao Xiang", "FSCACHE", "NETFS"], "alternates": {"html": "https://wpnews.pro/news/erofs-with-linux-7-2-better-handles-large-sparse-ai-datasets-more-efficient-i-o", "markdown": "https://wpnews.pro/news/erofs-with-linux-7-2-better-handles-large-sparse-ai-datasets-more-efficient-i-o.md", "text": "https://wpnews.pro/news/erofs-with-linux-7-2-better-handles-large-sparse-ai-datasets-more-efficient-i-o.txt", "jsonld": "https://wpnews.pro/news/erofs-with-linux-7-2-better-handles-large-sparse-ai-datasets-more-efficient-i-o.jsonld"}}