The assumption that underlies much of the interest in cloud computing is that there are economies of scale. If there are not, the extra costs of bandwidth and latency will make cloud computing too costly.

Ever since Google demonstrated that massive infrastructures could be built from commodity hardware and open source software, system architects have sought similar advantages at lesser scale. People tend to ignore the fact that Google’s infrastructure is optimized for a few very specific applications.

The Google filesystem and the Google storage system, BigTable, are designed to handle the massive amounts of data that Google acquires and searches every day. Each Google rack contains only 120 disk drives, which is low density compared to most commodity servers.

Google has shown us a way to build massive infrastructures, but not the way. They have built a warehouse-sized search appliance.

What makes storage cheaper?
Here is a list:

  • Commodity drives. Cheap drives make for cheap storage.
  • Wide fan out. Amortizing interconnect costs across more drives will further lower costs. Performance may suffer depending on workload.
  • Free software. Linux, OpenSolaris, Hadoop and other products are among the candidates.
  • Low cost networking. Unmanaged Ethernet switches.
  • Self management. When the rest of the infrastructure is either cheap or free, people costs will rapidly become the dominant factor.
  • Low entry cost. Cloud storage has a definite advantage. Faster setup and lower capital costs are tangible benefits.

Other than fan out, none of these factors is very sensitive to scale. Of course there are other issues: network costs, data center costs and power costs.
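
To make the fan out point concrete, here is a rough sketch of the arithmetic. The dollar figures and the function name are made-up placeholders, not real prices; only the shape of the curve matters.

```python
def cost_per_drive(drive_cost, interconnect_cost, fan_out):
    """Cost of one drive plus its share of the interconnect it hangs off.

    fan_out is the number of drives sharing one interconnect
    (controller, cabling, switch port, and so on).
    """
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")
    return drive_cost + interconnect_cost / fan_out


# Hypothetical prices, chosen only for illustration.
DRIVE_COST = 100.0
INTERCONNECT_COST = 400.0

for fan_out in (4, 16, 64):
    total = cost_per_drive(DRIVE_COST, INTERCONNECT_COST, fan_out)
    print(f"fan out {fan_out:>2}: ${total:.2f} per drive")
```

With these placeholder numbers the per-drive cost falls from $200.00 at a fan out of 4 to $106.25 at 64, and the savings flatten quickly after that. The sketch says nothing about performance, which may suffer as more drives share one link.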

Where are the economies?
But once you get above a dozen or so racks, what other economies of storage scale are there? I’m asking the question, so feel free to provide answers.

The StorageMojo take
People may be the most important economy of scale in storage. If one infrastructure requires 1 admin for 100 TB and another only 1 for 500 TB, it is obvious who will win, at least in the United States.
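
A back-of-the-envelope sketch of that comparison, using the 100 TB and 500 TB ratios above. The annual admin cost is an assumed placeholder, not a real salary figure.

```python
def admin_cost_per_tb(annual_admin_cost, tb_per_admin):
    """Yearly people cost attributable to each terabyte under management."""
    if tb_per_admin <= 0:
        raise ValueError("tb_per_admin must be positive")
    return annual_admin_cost / tb_per_admin


# Assumed fully loaded yearly cost of one admin; purely illustrative.
ANNUAL_ADMIN_COST = 100_000.0

small = admin_cost_per_tb(ANNUAL_ADMIN_COST, 100)  # 1 admin per 100 TB
large = admin_cost_per_tb(ANNUAL_ADMIN_COST, 500)  # 1 admin per 500 TB

print(f"1 admin per 100 TB: ${small:.2f} per TB per year")
print(f"1 admin per 500 TB: ${large:.2f} per TB per year")
print(f"ratio: {small / large:.1f}x")
```

Whatever figure you plug in for the admin, the per-terabyte gap stays at 5x.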

This suggests that cloud storage will need unique services to win. Online backup is an example of a service where users are buying more than capacity.

Then the problem becomes, at least for consumer services, designing offers that are attractive enough to get consumers to sign up and are profitable for the provider. And that means a marriage of marketing, finance and technology. Competing purely on price will be a fool’s game.

Courteous comments welcome, of course.