[ARTICLE · art-23886] src=blocksandfiles.com pub= topic=ai-infrastructure verified=true sentiment=↓ negative

Continually increasing data storage capacity is unsustainable

Data storage costs are growing unsustainably as enterprises face a ceaseless increase in data, according to Datadobi co-founder Michael Jack. He argues that organizations cannot keep buying more capacity because storage expenses consume an ever-larger share of budgets, starving other critical operations. Jack calls for a paradigm shift where storage suppliers focus on data lifecycle management rather than simply selling more capacity.

read 5 min · published Jun 8, 2026

Never-ending data growth falls on enterprises like rain but they can’t keep building reservoirs.

That’s the message from Michael Jack, co-founder and CRO of data classifier and manager Datadobi. He sees enterprises and other organizations facing a ceaseless increase in the data they store. Conceptually there is no storage problem. It’s all just digital bits and you buy more capacity. That is the problem, a massive and growing data storage expense problem, with other ancillary problems - governance, risk, AI access - coming along behind.

Jack tells me that all organizations have budgets to fund and support their operations. Storage is an expense item on that budget, a slice of the budget pie chart. You cannot keep growing that slice, buying 5 petabytes here and another 5 petabytes there, because the storage slice will grow like a tumour, my term, and prevent budget money being spent on other things equally or more important to the organization.

Nvidia founder and CEO Jensen Huang, Jack tells me, says 90 percent of data is unstructured, meaning file or object. We don’t have an unstructured data storage problem, Jack says, we have an unstructured data expense problem.

If more and more money is spent on basic bits and bytes storage, house-keeping, then there is less money for the organization to invest in its higher priority needs, its raison d’etre. Storage expense becomes a funding bottleneck cramping and restricting other parts of the business or organization. Envisage data as rain and data growth as a continuing deluge of rain. Water, rain water, has value. We want to keep it, but we can’t keep building reservoirs. There isn’t enough space. The fresh rain is the most valuable. So we have to freeze the rainwater we no longer need, and stack the ice in deep freezes, digital north poles as it were, or deep Glacier archives.

This analogy weakens when we realize that data, unlike water, is not fungible. A drop of water is worth the same as any other drop of water. A byte of data has a different value from another byte of data because, for example, it’s older, or it’s about a sub-item in a cleaning contract versus a headline item in a major product sale, and so forth.

The need to tier data becomes clearer when we understand that every extra petabyte of data stored represents not just a storage capacity expense increase, but a cyber threat surface increase, an enlarged governance issue, a compliance complexity increase, a data protection burden increase, and an AI access filtering and selection increase. Simply buying more and more expensive on-premises storage capacity is unsustainable.

In order to move data off fast access, expensive storage, where it most probably initially lands, you have to understand what it is, classify it, measure its access rate and, when the time is right, progressively move it down tiers of less and less expensive storage so that the cost of storing it relates appropriately to its value.

You can only do this effectively if existing data classification is known and incoming data is classified as it is ingested or generated. You need this metadata so that you can make a judgement, probably by using a policy-driven, automated procedure, to continually scan your data estate and move older, less valuable data to cheaper storage, or to faster storage if its value changes and it’s needed afresh.
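The policy-driven scan described above might be sketched as follows. This is a minimal illustration, not Datadobi's method or any product's API: the tier names, idle-age thresholds, `TIER_FLOOR` rule, and the `FileRecord`, `target_tier`, and `plan_moves` names are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical tier names, ordered from most to least expensive.
TIERS = ["hot", "warm", "cold", "archive"]

# Hypothetical age policy: (maximum idle time, tier), checked in order.
# The thresholds are illustrative assumptions, not recommendations.
AGE_POLICY = [
    (timedelta(days=30), "hot"),
    (timedelta(days=180), "warm"),
    (timedelta(days=730), "cold"),
]

# Hypothetical rule: some classifications may never sink below a given tier.
TIER_FLOOR = {"legal-hold": "warm"}


@dataclass
class FileRecord:
    path: str
    classification: str   # metadata assigned at ingest or by a later scan
    last_access: datetime
    tier: str = "hot"     # new data is assumed to land on the fastest tier


def target_tier(record: FileRecord, now: datetime) -> str:
    """Pick the tier whose cost matches the record's age and classification."""
    idle = now - record.last_access
    wanted = "archive"
    for max_idle, tier in AGE_POLICY:
        if idle <= max_idle:
            wanted = tier
            break
    floor = TIER_FLOOR.get(record.classification)
    if floor is not None and TIERS.index(wanted) > TIERS.index(floor):
        wanted = floor
    return wanted


def plan_moves(records, now):
    """Return (path, from_tier, to_tier) for each record that should move."""
    moves = []
    for record in records:
        wanted = target_tier(record, now)
        if wanted != record.tier:
            moves.append((record.path, record.tier, wanted))
    return moves
```

Because the target tier is recomputed from current metadata on every scan, the same function demotes stale data and promotes archived data that has been accessed again, which matches the two-way movement described above.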

Every storage supplier knows this, but they are focussed on the next transaction. They want to sell you more storage. It’s what they do. Jack says this has to change. There has to be a paradigm shift, where they become data storage lifecycle suppliers, and not just focussed on selling you the next 5 petabytes.

He likens this to the early days of virtualization, when VMware was looked on with suspicion by server vendors. A customer who bought VMware to run ten virtual machines on a single server did not buy ten servers. A VMware sales win was a server vendor sales loss. Eventually though, the ease of running VMs meant that they became so popular server sales increased over and over as server vendors took the long view. VMware and its competitors ended up helping server sales. It was an example of Jevons paradox.

So too, Jack thinks, will dynamic data classification, tiering, risk analysis, sovereignty understanding, governance and resilience. Storage suppliers will have a stronger and better relationship with their customers if they enable these things rather than having a shorter term, next transaction view of their customers.

He doesn’t think that the large consultancies, like Gartner, Arthur Andersen or Price Waterhouse, yet understand how things are changing, how the endless buy-more-capacity mindset must end.

Datadobi’s experience at Manchester University, where a 10 PB PowerScale purchase was avoided by moving stale data to the public cloud, made a profound impression on Jack. But he disagrees with our conclusion that Dell's loss was Datadobi's gain.

This was true in an immediate sales transaction sense but, he says, it’s not really a zero-sum, win-loss game. Taking a longer view, storage capacity suppliers will have a stronger, more enduring relationship with customers if they embrace a data lifecycle viewpoint, one that recognizes data’s changing value, and help customers get their data classified so that storage cost can reflect data’s changing value to the organisation.

A new category of software is needed; software which is the intelligence and orchestration layer for unstructured data helping organizations discover, align and operationalize data across fragmented enterprise environments. That’s how Datadobi now sees its role in our storage universe.
