I moderated a panel on cloud storage at Tom Coughlin’s Storage Visions 2010 conference. Some good stuff came out of it.
Four companies presented: IBM, Bycast, Cleversafe and Asankya.
IBM, now a services company, talked about the service needs of cloud providers or cloud customers.
Bycast, which may have the largest installed base of any cloud software provider, presented on the process that they typically see for private cloud implementation. My interpretation of the process:
- Edge sites install a gateway node to the central private cloud repository
- The edge site learns what its local data needs are
- A local disk cache is added to the gateway node to improve performance
- A workable balance between local wants and economics is achieved.
It took 3 years for the enterprise to go from pilot to the start of full deployment. Data storage rose from 36 TB at the end of year 1 to 750 TB at the end of year 6.
Cleversafe may be the leader in implementing advanced erasure codes in storage software. RAID 5 & 6 are both forms of erasure codes, but the math has been refined in the last 20 years. Much higher levels of data availability with lower overhead are now possible.
As disk capacities climb and disk error rates remain constant, the expected annual data loss rises. By 2020 you can expect that a 1,000 disk storage farm will lose over 200 GB of data annually – even with mirrored RAID 6. (RAID 16? The mind boggles).
Advanced erasure codes combined with physically dispersed storage make all that go away. Cleversafe estimates that a dispersed storage infrastructure requiring 10 of 16 nodes to reconstruct the data is 1,000,000 times more reliable than RAID 16 – reducing expected data loss from 200 GB to 200 KB.
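To make the 10-of-16 numbers concrete, here is a rough Python sketch of the arithmetic behind a k-of-n dispersal. It is not Cleversafe's model: the independent per-node failure probability `p`, the 6+2 RAID 6 group size and the function names are all assumptions made for illustration. Only the 10-of-16 layout, the 1,000,000x factor and the 200 GB figure come from the talk.

```python
from math import comb

def storage_overhead(k: int, n: int) -> float:
    """Raw capacity consumed per unit of user data when any k of n slices suffice."""
    return n / k

def loss_probability(k: int, n: int, p: float) -> float:
    """Chance that more than n - k of the n nodes are unavailable at once.

    Assumption: each node fails independently with probability p.
    """
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(n - k + 1, n + 1)
    )

K, N = 10, 16                        # layout quoted in the talk
overhead = storage_overhead(K, N)    # raw bytes stored per user byte
tolerated = N - K                    # node losses survivable without data loss

# Assumption: mirrored RAID 6 built from 6 data + 2 parity disk groups.
mirrored_raid6_overhead = 2 * (6 + 2) / 6

# Scaling the quoted 200 GB annual loss by the claimed 1,000,000x improvement.
scaled_loss_bytes = 200e9 / 1_000_000

# Hypothetical 1% chance that any single node is unavailable.
example_loss = loss_probability(K, N, 0.01)

print(overhead, tolerated, round(mirrored_raid6_overhead, 2), scaled_loss_bytes)
```

Under these assumptions the dispersed layout stores 1.6 raw bytes per user byte and survives six node losses, while the mirrored 6+2 RAID 6 example stores about 2.67. That is the "higher availability with lower overhead" point in miniature.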
If Bycast has proven private cloud software and Cleversafe has disaster-proof storage, then we’re done, right? Except for the freakin’ network latency that makes “cloud” storage synonymous with “slow” storage. That’s where Asankya comes in.
Their basic insight is this: TCP/IP was built when a 200 nanosecond CPU and a couple of meg of RAM was a Hot Box. What if we were to change the protocol to take advantage of modern resources – could we do better? Well, duh!
They’ve developed the RAPID protocol and an overlay network called RAPIDnet that they claim dramatically improves network performance. How?
- Multipathing. Instead of tying a session to a single network path, RAPID decides on a per-packet basis the fastest route for that packet.
- Maximum bandwidth utilization. Multiple paths means more available bandwidth – and RAPID loads each path as full as it can.
- Network deduplication. Originating nodes keep track of all packets that pass through, so when a duplicate packet shows up it doesn’t resend it.
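Two of the ideas above, per-packet path choice and sender-side deduplication, can be sketched in a few lines of Python. This is a toy illustration, not the RAPID protocol: the `Path` and `Sender` classes, the latency estimates, and the choice to send a short reference in place of a duplicate payload are all assumptions of mine.

```python
import hashlib

class Path:
    """One network route with a hypothetical current latency estimate."""
    def __init__(self, name: str, latency_ms: float):
        self.name = name
        self.latency_ms = latency_ms
        self.sent = []  # record of what was put on this path

class Sender:
    """Picks a path per packet and skips payloads it has already sent."""
    def __init__(self, paths):
        self.paths = paths
        self.seen = set()  # digests of payloads already transmitted

    def send(self, payload: bytes):
        # Per-packet routing: choose whichever path looks fastest right now.
        path = min(self.paths, key=lambda p: p.latency_ms)
        digest = hashlib.sha256(payload).digest()
        if digest in self.seen:
            # Duplicate: send only a reference (assumed behavior).
            path.sent.append(("ref", digest))
            return path.name, "ref"
        self.seen.add(digest)
        path.sent.append(("data", payload))
        return path.name, "data"

a, b = Path("a", 40.0), Path("b", 25.0)
sender = Sender([a, b])
first = sender.send(b"block-1")    # goes out on the currently faster path
a.latency_ms = 10.0                # conditions change between packets
second = sender.send(b"block-1")   # same payload again, new best path
third = sender.send(b"block-2")    # new payload
print(first, second, third)
```

The session is never pinned to one route, and the repeated payload costs a 32-byte digest rather than a full resend.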
Net net: by increasing bandwidth and reducing delays, Asankya cuts latency, making cloud storage much more feasible for interactive apps. Cool!
Of course, this all has to work in the Real World. Evidently it does, as they have customers. And the technology came out of Georgia Tech.
The StorageMojo take
The latter 3 companies make an important point about cloud storage and computing: we can do much more to make it economical, safe and fast. That’s a Very Good Thing.
Asankya asks whether network intelligence should be in the core or on the edge. Cisco, of course, prefers a smart core, so Asankya is a clear threat to them. The rest of us might disagree.
Courteous comments welcome, of course. I’m doing some work for Bycast, but, alas, not for the other companies. Thanks to Tom Coughlin for assembling a good group for the panel. I’m hoping I can post links to more info on all of them.
“By 2020 you can expect that a 1,000 disk storage farm will lose over 200 GB of data annually – even with mirrored RAID 6. (RAID 16? The mind boggles). ”
Today some of us know large DMX3/DMX4 consumers with more than 1000 disks in their frames and they certainly aren’t losing any data annually. But there is this “future” scare.
I suppose it would have been impolite to shoot a hand up, ask the speaker what he/she was talking about, and challenge it?
For instance, what about adoption of 4K sectors and improved ECC?
From a numbers perspective, Western Digital estimates that the use of 4K sectors will give them an immediate 7%-11% increase in format efficiency. ECC burst error correction stands to improve by 50%, and the overall error rate capability improves by 2 orders of magnitude.
“even with mirrored RAID6”
Doubt it. Think about how many unreadable AND uncorrectable blocks that would mean. Maybe if you couldn’t even write to them in the first place! Ha.
The phone companies have smart cores, the internets don’t. They put the smarts on the edge. Seems to me the internets do pretty well. And no mumbo jumbo “fixes” the speed of light in a fiber (2.1 x 10^8 m/s). Doesn’t matter if the MJ comes from Georgia Tech, UCSD, Stanford or Berkeley. Still enjoying your posts Robin.
Regarding Cleversafe... I was excited until I visited their webpage. It’s high on fluff and low on details, which isn’t too surprising I guess. But it looks like you need one “Slicestor” per storage node. How much do they charge for said object? How much power does it use? In one case study they have “3TB raw” per Slicestor. Seriously? 3TB per appliance? How can that possibly be economical?
Elsewhere they make a statement that “$2.45/GB” is a “competitive” price for storage. In what decade? If their products are priced with that in mind, I have zero interest in them.
Finally – performance. There’s zero information on what you can expect from each Slicestor, or a single Accesser or Manager. Is either the Manager or the Accesser a bottleneck? How does the data flow? How does each Slicestor connect to the storage?
If you look around the net, you will find articles that tested Cleversafe. The reviews said: *slow*.
In other news, I do strategic storage assessment for my company. When deploying file piles (NAS on Tier 2-) at a large scale (1PB increment), you have to be in the $1/GB *usable* range to even interest me in reading the material.
Cloud storage will become more and more widely adopted. At this time, there are still a few things that limit its adoption:
(1) Speed. The cloud storage is still much slower;
(2) Security and ownership: A lot of people are concerned about security and control when the data is stored in another company’s facility;
(3) Change to existing applications. Moving to cloud storage also means moving applications to the cloud computing, which is a big change;
(4) Inflexibility in new or customized features / services. Cloud storage is offered by a service provider. If you need to customize the service, it is up to the service provider.
http://www.DriveHQ.com offers a better cloud service that can solve the above problems. You can backup your files to the cloud with data encryption; sync the cloud storage with your local storage; work on your local storage or cached storage without any performance impact; and you have full flexibility in features and you can continue using your existing software. In short, DriveHQ offers more than just storage; it also offers cloud services that can replace your local file server, ftp server, email server and backup system. For more info, please visit: http://www.drivehq.com/