In a comment to the previous post, Pete Steege asked for some expansion on Alyssa’s comments about belt & suspenders vs catastrophe avoidance. He thought the concept interesting and I agree: it’s central to the scalability of storage.
Alyssa isn’t here to provide that expansion, but I am. So here goes.
Concept
For non-American readers, "belt and suspenders" is an idiomatic expression for a risk-averse mentality. Small town bankers and accountants were popularly thought to be so risk averse that they'd wear both a belt and suspenders to avoid a socially catastrophic Pants_Fall_Down error condition.
Then our bankers and accountants decided they could be fun too. Sigh.
But there’s a cost to this strategy: when you want your pants down – much more common than belt failure – you have to manage both systems. In a time-critical scenario the delay may also lead to catastrophe.
Each solution carries its own overhead. One storage array or NAS box is wonderful; 10 are a pain. They don’t scale well.
Practice
I’ve had risk-averse clients say “I want to mirror my data across 2 RAID 6 arrays.” Well, that will preserve your data – but is it the best use of your money?
What about a fire in your data center? Operator error? Silent data corruption from the server? RAID 6 by itself is solid – the money spent mirroring it could be used elsewhere.
Amazon is using the money elsewhere. They copy data across data centers.
Result
Like Google, Amazon doesn’t use RAID to protect data. Instead Amazon makes several copies of each object and spreads them across two or more data centers.
That makes life much simpler – after you’ve paid for the data centers, bandwidth and storage – because your unit of management is not an array but a data center. You’ve scaled your management by scaling what you manage.
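To make that concrete, here's a minimal sketch of the write path such a scheme implies – not Amazon's actual code or policy, just an illustration that the unit of redundancy is a whole-object copy in another data center rather than a parity stripe inside an array. The data center names and the "2 of 3 copies" durability rule are assumptions for the example.

```python
# Sketch: protect data with whole-object copies across data centers instead of RAID.
# The data center list and the "2 of 3 copies" rule are illustrative assumptions.

DATA_CENTERS = {"us-east": {}, "us-west": {}, "eu-west": {}}  # name -> object store
REQUIRED_COPIES = 2  # consider the write durable once this many copies land

def put_object(key: str, data: bytes) -> bool:
    """Copy the object to every data center; succeed if enough copies stick."""
    copies = 0
    for name, store in DATA_CENTERS.items():
        try:
            store[key] = data          # in real life: a PUT to that site
            copies += 1
        except Exception:
            continue                   # a whole site can fail; the others carry on
    return copies >= REQUIRED_COPIES

def get_object(key: str) -> bytes:
    """Read from the first data center that still has a copy."""
    for store in DATA_CENTERS.values():
        if key in store:
            return store[key]
    raise KeyError(key)

if __name__ == "__main__":
    assert put_object("photo-001.jpg", b"...")
    print(len(get_object("photo-001.jpg")), "bytes read back")
```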
Going back to the belt and suspenders analogy: if you move to zero gravity you don’t need a belt or suspenders because pants no longer fall down.
The StorageMojo take
This issue is central to the question of the scalability of storage and computing. Storage arrays are small scale solutions. Redundant data centers are large scale solutions.
Today’s storage array costs are 90-95% for “protection” and performance, while only 5-10% goes to capacity. Data center redundancy and simple, software-based replication strategies reverse that ratio.
If we accept James Hamilton’s calculation that Internet scale storage costs 1/5th that of enterprise storage, then it also follows that enterprises without multiple data centers won’t be able to achieve high scalability and disaster tolerance. The implications for private clouds are obvious.
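For readers who want to see the arithmetic, here's a back-of-the-envelope comparison. The 5-10% capacity share and the roughly 5x gap come from the figures above; the raw disk price and the 3-way replication factor are assumptions for illustration.

```python
# Back-of-the-envelope cost comparison using the figures cited in the post.
# RAW_COST_PER_TB and REPLICA_COUNT are illustrative assumptions.

RAW_COST_PER_TB = 100.0      # assumed street price of raw disk, $/TB
CAPACITY_SHARE = 0.075       # 5-10% of array cost is capacity; use the midpoint
REPLICA_COUNT = 3            # simple software replication: three whole copies

array_cost_per_tb = RAW_COST_PER_TB / CAPACITY_SHARE    # the rest is "protection" + performance
replica_cost_per_tb = RAW_COST_PER_TB * REPLICA_COUNT   # pay for capacity, three times over

print(f"Enterprise array:  ${array_cost_per_tb:,.0f}/TB usable")
print(f"3-way replication: ${replica_cost_per_tb:,.0f}/TB usable")
print(f"Ratio:             {array_cost_per_tb / replica_cost_per_tb:.1f}x")
# With these assumptions the array comes out roughly 4-5x more expensive per
# usable TB, in the same ballpark as Hamilton's 1/5th estimate. Data center
# and bandwidth costs are left out, as the post notes you must pay for those too.
```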
More on this later, no doubt.
Courteous comments welcome, of course. The more opinions I see about cloud storage, the more I believe that “economies of scale” is an alien concept to many observers – perhaps because there haven’t been many in enterprise data centers.
Thanks Robin. The unit of measurement for storage is clearly expanding. It will more and more be about scale.
Just like your gravity analogy though, it’s about more than removing the belt and suspenders. There will be a lot of repercussions! Disruptive change for sure.
I disagree that replication is simple, especially if you care about consistency. Let’s see what consistent (e.g. NFS or iSCSI) cloud storage costs.
Wes,
As Alyssa noted in her talk:
Amazon sacrifices some consistency for availability. And sacrifices some availability for durability.
Cloud storage isn’t a replacement for high-performance local storage. But given what we know about access patterns, consistency is not a big issue for most applications.
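Here's a toy illustration of what "sacrificing some consistency" can look like in practice – purely illustrative, not S3's actual behavior or API: a read issued right after a write may land on a replica that hasn't caught up yet.

```python
# Toy illustration of eventual consistency: a write is acknowledged after one
# replica is updated and propagates to the other later, so a read that hits the
# lagging replica returns stale data. Illustrative only.

import random

replicas = [{"key": "old"}, {"key": "old"}]    # two copies of the same object
pending = []                                   # replication queue

def write(key, value):
    replicas[0][key] = value                   # acknowledged after one copy lands
    pending.append((1, key, value))            # second copy gets updated "later"

def read(key):
    return random.choice(replicas)[key]        # reads are load-balanced across replicas

def propagate():
    while pending:
        idx, key, value = pending.pop(0)
        replicas[idx][key] = value

write("key", "new")
print(read("key"))   # may print "old" or "new" before propagation
propagate()
print(read("key"))   # always "new" once the replicas converge
```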
Robin
This is where we have a terminology problem; we’ve defined “cloud storage” to mean “crap blob storage in the cloud”. I think everyone, even clouds, needs some high-performance local storage (like EBS for example). So is this stuff not cloud storage? Is it merely a SAN inside the cloud?
Wes, do you really think a storage system that automatically replicates your data to multiple data centers for pennies a month is “crap blob storage?”
Granted, the latency and bandwidth aren’t what we’d see locally – but think of cloud storage as a component or subsystem rather than a finished product. It’s what we wrap around it that makes it a product.
Robin
Totally agree – as you succinctly put it, only 5-10% of the solution cost is the hardware – so utilising an extra level of RAID/mirroring should not be a stopper for enterprise-level protection of valuable data.
But why stop there? Take this stat: http://gbr.pepperdine.edu/033/dataloss.html – only 40% of data losses are put down to hardware failure (I doubt the number is higher in the enterprise market; if anything it’s significantly lower). The second most common cause of data loss is human error. I’ve often seen mid-sized companies relying on the technical skills/caffeine level of individuals, rather than maintaining data in ways that avoid single points of failure at the hardware OR the human level.
Yes, belt and brace your storage – that’s the inexpensive part – but also belt and brace your procedures/the human element if your data is valuable.
The idea of using multiple, distributed copies to avoid RAID overheads sounds fine, but it won’t work with a typical database running transactional workloads. Generally speaking, if you lose one unprotected disk out of your RDBMS, you often lose the whole thing until it is recovered. If you are unwise enough not to be replicating your logs, you lose data. The sort of software packages that many enterprises use are very intolerant of data inconsistencies and storage service disruptions. With many integrated systems, the inter-dependencies are such that a service disruption in one component has immediate effects on the others. Dealing with the myriad of failure conditions that can occur between the hundreds of applications that make up the enterprise is an almost impossible task. For this reason, many complex enterprises do spend a lot of effort trying to make sure that failures in the basic infrastructure are as invisible as possible. Hence the emphasis on gold-plated systems: the cost of dealing with the consequences of a failure, and the subsequent service loss, can be higher than the cost of the infrastructure.
It could be argued that the problem here is that the enterprise systems architecture is designed wrongly – too much emphasis on multiple single points of failure (where the single points of failure are application instances as well as infrastructure hardware items). However, designing resilience and scalability into the application architecture in the first place is a very skilled job, and beyond the resource capabilities of many even quite large enterprises.
If this type of storage (and processing) in the cloud is really to win out for major customers in the enterprise space, then it is going to have to deal with the issues of service maintenance in a transparent manner. It’s going to have to do it in a way that enterprises can come to grips with – that is, running the sort of applications that are of use to the business. That could well mean that the “processing in the cloud” model really needs to move up the system value chain.
None of this is to deny that there are some sorts of workloads with more relaxed requirements. For instance, many data archival, document management or read-intensive content delivery systems might well fit this model. But some of this can be done (and is being done) with low-cost storage already. It’s for this reason that I rather doubt some of the cost comparisons; I’m not wholly sure that like-for-like comparisons are being made. If I make a simple comparison with my extensive photographic collection, I rather suspect that my primitive technique of synchronising a USB external drive and keeping it at the office provides a far more cost-effective data protection solution than storage-in-the-cloud. However, I wouldn’t seek to use that as proof that the future of storage is in the form of external USB drives. It’s merely a capital-cost-effective solution for one sort of application.
Steve Jones sort of summed it up when he wrote:
“The idea of using multiple, distributed copies to avoid RAID overheads sounds fine, but it won’t work with a typical databases running transactional workloads.”
I consider this to be the modern analog of “Horseless carriages sound fine, but they just won’t work if they keep running over the horse you have pulling them.” Perhaps that doesn’t work as well as I would like, but basically thinking about things differently at scale is key, and frankly if you’re considering cloud storage and trying to hook it to a “typical database” you’re doing it wrong.
Now it is perfectly understandable to say “I’ve got all my data locked up inside Oracle databases, and they don’t support either cloud storage or a way to migrate out of their databases, so I’m stuck with a huge problem.” A lot of CIOs are looking down the barrel of that gun and wondering what to do.
If they get past that particular problem, they realize that an architecture like S3 or most of the other “big” clouds has this issue: they can be flinging around more storage than you can reasonably visualize. How do you back up a data center? With another data center – but where does that stop? Or does it? What is the exabyte backup strategy that gives you an RTO of less than a month, much less an hour or so?
What happens when your storage provider becomes insolvent? What happens when the software vendor who maintains the code that reads your data goes away? We have the ‘bit rot’ problem today, where data stored on 7-track tapes can’t be read by any piece of equipment outside of a museum. What happens when your data, living in the cloud, becomes inaccessible because the cloud provider goes IPv6-only or moves their APIs and some middleware you depend on is no longer actively maintained?
I don’t believe the “cloud computing” world is being held back by technology; there are lots of creative ideas out there for both in-house and extranet-type clouds. The challenges are more and more about data longevity and recovery.
–Chuck
For someone who grew up on ACID, the idea of ‘eventually consistent’ takes some getting used to.
Thanks for the very interesting discussion.