In a comment to the previous post, Pete Steege asked for some expansion on Alyssa’s comments about belt & suspenders vs catastrophe avoidance. He thought the concept interesting and I agree: it’s central to the scalability of storage.

Alyssa isn’t here to provide that expansion, but I am. So here goes.

Concept
For non-American readers, “belt and suspenders” is an idiomatic expression for a risk-averse mentality. Small-town bankers and accountants were popularly thought to be so risk-averse that they’d wear both a belt and suspenders to avoid a socially catastrophic Pants_Fall_Down error condition.

Then our bankers and accountants decided they could be fun too. Sigh.

But there’s a cost to this strategy: when you want your pants down – much more common than belt failure – you have to manage both systems. In a time-critical scenario the delay may also lead to catastrophe.

Each solution carries its own overhead. One storage array or NAS box is wonderful; 10 are a pain. They don’t scale well.

Practice
I’ve had risk-averse clients say “I want to mirror my data across 2 RAID 6 arrays.” Well, that will preserve your data – but is it the best use of your money?

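What about a fire in your data center? Operator error? Silent data corruption from the server? Mirroring across two arrays in the same data center does little for any of those. RAID 6 is already solid against disk failure – the money for the second array could be used elsewhere.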

Amazon is using the money elsewhere. They copy data across data centers.

Result
Like Google, Amazon doesn’t use RAID to protect data. Instead Amazon makes several copies of each object and spreads them across 2 or more data centers.

That makes life much simpler – after you’ve paid for the data centers, bandwidth and storage – because your unit of management is not an array but a data center. You’ve scaled your management by scaling what you manage.
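To make the idea concrete, here is a minimal sketch of "several copies across 2 or more data centers." It is not Amazon's actual placement logic, which isn't public. The function names, the round-robin scheme, the data center labels and the defaults (3 copies, 2 sites minimum) are all my own assumptions for illustration.

```python
from itertools import cycle


def place_replicas(object_key, data_centers, copies=3, min_sites=2):
    """Pick one data center per replica (hypothetical round-robin scheme).

    Returns a list of data center names, one entry per copy.
    """
    if len(data_centers) < min_sites:
        raise ValueError("not enough data centers to spread the copies")
    if copies < min_sites:
        raise ValueError("need at least one copy per required site")

    # Derive a deterministic starting point from the key so that
    # different objects do not all start at the same site.
    start = sum(object_key.encode("utf-8")) % len(data_centers)
    ordered = data_centers[start:] + data_centers[:start]

    # Walk the sites round-robin; with copies >= min_sites this always
    # lands copies in at least min_sites distinct data centers.
    sites = cycle(ordered)
    return [next(sites) for _ in range(copies)]


def survives_site_loss(placement, lost_site):
    """True if at least one copy lives outside the lost data center."""
    return any(site != lost_site for site in placement)


if __name__ == "__main__":
    centers = ["dc-east", "dc-west"]  # hypothetical site names
    placement = place_replicas("photo.jpg", centers)
    print(placement)
    for dc in centers:
        print(dc, "lost, data survives:", survives_site_loss(placement, dc))
```

Note what is missing: no parity, no rebuild logic, no array. The only thing the placement has to know about is the list of data centers, which is the sense in which the data center becomes the unit of management.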

Going back to the belt and suspenders analogy: if you move to zero gravity, you don’t need a belt and suspenders because pants no longer fall down.

The StorageMojo take
This issue is central to the question of the scalability of storage and computing. Storage arrays are small scale solutions. Redundant data centers are large scale solutions.

In today’s storage arrays, 90-95% of the cost goes to “protection” and performance, while only 5-10% goes to capacity. Data center redundancy and simple, software-based replication strategies reverse that ratio.

If we accept James Hamilton’s calculation that Internet-scale storage costs 1/5th that of enterprise storage, then it follows that enterprises without multiple data centers won’t be able to achieve high scalability and disaster tolerance. The implications for private clouds are obvious.

More on this later, no doubt.

Courteous comments welcome, of course. The more opinions I see about cloud storage the more I believe that “economies of scale” is an alien concept to many observers – perhaps because there haven’t been many in enterprise data centers.