{"slug": "storage-insights-datasets-enabling-org-wide-operational-discovery-with-activity", "title": "Storage Insights datasets: Enabling org-wide operational discovery with activity insights", "summary": "Google Cloud announced the general availability of activity insights within Storage Insights datasets, providing administrators with operational visibility into how storage objects are accessed, moved, and modified. The new views capture object-level activity, bucket-level regional traffic, and project-level aggregates, enabling data-driven cost optimization and faster troubleshooting across Google Cloud Storage estates.", "body_md": "As enterprise storage footprints scale to billions of objects, AI applications and agentic workloads are fundamentally shifting the role of storage from a passive repository to the foundation of the data platform. This is driven by a surge in unstructured model data and the billions of actions performed on those objects, including session logs and audit trails. To manage this and answer questions about cost, operations, and security, storage and platform admins need to go beyond knowing what data they have, to understanding exactly how it is being accessed, moved, and modified.\n\nTo help, we're excited to announce activity insights within [Storage Insights datasets](https://cloud.google.com/storage/docs/insights/datasets). Now generally available, these new views provide visibility into the operational details of your Google Cloud Storage assets, enabling data-driven cost optimization and faster troubleshooting. 
For example, with activity insights, you can answer questions like:\n\nAre my objects located in the right storage classes within my buckets?\n\nWhat regions is my bucket interacting with the most so I can assess if it is optimally located?\n\nWhere are there errors across operations on my storage estate and why?\n\nAnswering these questions confidently is the key to unlocking cost optimizations and reclaiming engineering time. Storage Insights datasets, a feature of [Storage Intelligence](https://cloud.google.com/storage/docs/storage-intelligence/overview) for Cloud Storage, provides daily metadata and frequent activity insights (typically within four hours of the activity) so you have better visibility into your storage estate. While Storage Intelligence is a unified management product with capabilities like [Bucket relocation](https://cloud.google.com/storage/docs/bucket-relocation/overview), [Batch operations](https://cloud.google.com/storage/docs/batch-operations/overview) and [Gemini Cloud Assist](https://cloud.google.com/storage/docs/analyze-data-gemini-cloud-assist), this blog focuses on how you can leverage Storage Insights datasets for operational optimization.\n\nStorage Insights datasets deliver an automated, query-ready BigQuery index of your entire storage estate, complete with raw metadata and activity insights, replacing manual, error-prone data collection. Storage Insights datasets can be customized in scope: create a dataset for your entire org, a specific folder, a project, or a set of projects, or even specific buckets. The dataset then refreshes with regular updates, giving you a comprehensive view of your storage.\n\nStorage Insights datasets are your go-to tool for understanding your storage metadata, acting as an inventory management tool, scanning object metadata (storage class, location, age, custom metadata) and organizing it into a powerful, queryable BigQuery-linked dataset. 
This is crucial for knowing **what** data you have (learn more about how to optimize storage spend with Storage Insights datasets [here](https://cloud.google.com/blog/products/storage-data-transfer/storage-insights-datasets-optimizes-storage-footprint?e=48754805)).\n\nBut what if you also knew **how and when** that data is being used?\n\nStorage Insights datasets now offer a set of new views that capture:\n\n**Object-level activity,** including writes, updates, deletes, and errors\n\n**Bucket-level aggregate activity,** including total object operations, a breakdown by type of operations, total errors and most active prefixes\n\n**Bucket-level regional traffic activity,** including ingress and egress bytes for each region that interacts with your bucket\n\n**Project-level aggregate activity,** including total object operations, a breakdown by type of operations and total errors\n\nThis data flows directly into new BigQuery views within your dataset so you can run analytics queries for specific insights, interact with the data via [Gemini](https://docs.cloud.google.com/bigquery/docs/gemini-overview) or simply connect it to powerful [Looker dashboards](https://bit.ly/si-template) for visualization.\n\nThis moves you from a static snapshot to a dynamic, queryable analysis of your data's entire lifecycle. It's the difference between knowing what's in your warehouse and knowing what is used and when.\n\nHere’s what you can do, starting today, with activity insights in Storage Insights datasets.\n\n**The challenge:** You have terabytes of data in Standard or Nearline class storage that you believe is cold. But without proof, moving it to Coldline or Archive class is risky. 
What if a critical process still needs to read it once per quarter?\n\n**The solution:** With the new Storage Insights datasets views that surface activity insights, you can now identify buckets that have had minimal read/write activity over the last 30, 60, or 90 days.\n\n**The outcome:** Apply or fine-tune lifecycle policies to transition this data to more cost-effective storage classes.\n\nFor example, here’s a SQL query to order all the buckets in your estate with little to no activity in the last six months:\n\n**The challenge:** Your team set up a multi-region bucket to serve a global application. But a year later, is that still the right architecture? What if 99% of your traffic is now coming from a single region?\n\n**The solution:** Analyze the access patterns in the new bucket_region_activity_view. You can easily pinpoint which regions are driving read and write activity for the bucket.\n\n**The outcome:** Make data-driven decisions to co-locate your bucket with your compute. You might find that changing a multi-region bucket to a single-region one (or vice versa) can lead to significant cost savings and even improve performance.\n\nFor example, here’s a SQL query to break down the egress and ingress traffic pattern for a bucket across regions:\n\n**Shipt**, a retail technology platform and same-day delivery service, has been using Storage Intelligence capabilities to inform their data location decisions:\n\n“Storage Intelligence enables us to efficiently manage over 2 billion objects, delivering cost and performance optimization. With Insights datasets, we detected and analyzed egress charges from multi-region buckets, identifying opportunities to improve efficiency by co-locating compute and storage. 
By leveraging the Bucket Relocate capability, we seamlessly moved 1.3 Petabytes of data from multi-region to regional storage, achieving substantial cost savings while maintaining uninterrupted application performance and data pipeline continuity.” **-** Ron Cuirle, Director of Engineering - Cloud Platforms, Shipt\n\n**The challenge:** Your team sees a spike in 429 (Too Many Requests) errors. In a massive environment, this is rarely just a performance hiccup; it’s expensive! These errors trigger automatic retries, which often lead to a cycle of high-frequency, billable operations that drive up your Class A costs. Pinpointing exactly which object or prefix is causing this can be a time-consuming troubleshooting nightmare.\n\n**The solution:** The new Storage Insights datasets views provide granular details on these errors, right in BigQuery. You can query for 429 errors and see exactly which objects and prefixes are under pressure.\n\n**The outcome:** You can pinpoint the cause of your 429 errors, moving your team from troubleshooting to resolution.\n\nFor example, here’s a SQL query to analyze 429s occurring across your estate, where they are happening and why:\n\nAs your organization grows with Google Cloud, the scale of your data will only increase. Stop relying on archival data and start optimizing your organization’s storage estate. 
Cloud Storage Storage Insights datasets with activity insights turn massive data estates from complex operational challenges into clearly understood, highly optimized assets.\n\nTo get started, check out our pre-configured Looker Studio template [here](https://lookerstudio.google.com/c/u/0/reporting/670eee3f-ad6d-45ea-a169-853ab023dc84/page/p_k94oydxikd) to connect to your dataset for quick analysis and value:\n\nFor example: View the trend for Total Reads on your bucket over time\n\nOr, analyze the ingress and egress traffic patterns for your bucket:\n\nReady to turn insight into action?\n\nEnable [Storage Intelligence](https://docs.cloud.google.com/storage/docs/storage-intelligence/overview) today in the Google Cloud console.\n\n[Configure your dataset today](https://docs.cloud.google.com/storage/docs/insights/datasets) and query your data directly in BigQuery or [connect to our Looker template](https://bit.ly/si-template) for quick and easy visualization.\n\nReference our videos for more information on [Storage Intelligence](https://youtu.be/3makK6m8sIw?si=-BjdpU2ErtZGXwSA) and [How to Get Started](https://youtu.be/r5Z_z1bgcw0?si=mkFxaY939Tkq9p6A).", "url": "https://wpnews.pro/news/storage-insights-datasets-enabling-org-wide-operational-discovery-with-activity", "canonical_source": "https://cloud.google.com/blog/products/storage-data-transfer/analyze-cloud-storage-estates-with-storage-insights-datasets/", "published_at": "2026-06-09 16:00:00+00:00", "updated_at": "2026-06-11 17:18:01.339590+00:00", "lang": "en", "topics": ["ai-infrastructure", "ai-products", "ai-tools", "artificial-intelligence", "machine-learning"], "entities": ["Google Cloud Storage", "Storage Insights datasets", "Storage Intelligence"], "alternates": {"html": "https://wpnews.pro/news/storage-insights-datasets-enabling-org-wide-operational-discovery-with-activity", "markdown": "https://wpnews.pro/news/storage-insights-datasets-enabling-org-wide-operational-discovery-with-activity.md", 
"text": "https://wpnews.pro/news/storage-insights-datasets-enabling-org-wide-operational-discovery-with-activity.txt", "jsonld": "https://wpnews.pro/news/storage-insights-datasets-enabling-org-wide-operational-discovery-with-activity.jsonld"}}