I moderated a panel on cloud storage at Tom Coughlin’s Storage Visions 2010 conference. Some good stuff came out of it.

Four companies presented: IBM, Bycast, Cleversafe and Asankya.

IBM, now a services company, talked about the service needs of cloud providers or cloud customers.

Bycast, which may have the largest installed base of any cloud software provider, presented on the process that they typically see for private cloud implementation. My interpretation of the process:

  • Edge sites install a gateway node to the central private cloud repository
  • The edge site learns what its local data needs are
  • A local disk cache is added to the gateway node to improve performance
  • A workable balance between local wants and economics is achieved.
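
To make the caching step concrete, here is a minimal Python sketch of a gateway that reads through to the central repository and keeps recently used objects locally. This is my illustration, not Bycast's software: the `GatewayCache` class, the least-recently-used eviction policy and the `fetch_from_central` callback are all assumptions.

```python
from collections import OrderedDict


class GatewayCache:
    """Hypothetical edge gateway with a small read-through LRU cache."""

    def __init__(self, capacity, fetch_from_central):
        self.capacity = capacity          # max objects held locally (assumed unit)
        self.fetch = fetch_from_central   # stand-in for the slow trip to the central repository
        self.cache = OrderedDict()        # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            # Local hit: mark as most recently used and skip the network.
            self.cache.move_to_end(key)
            self.hits += 1
            return self.cache[key]
        # Miss: go to the central repository, then keep a local copy.
        self.misses += 1
        value = self.fetch(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            # Evict the least recently used object.
            self.cache.popitem(last=False)
        return value
```

Watching `hits` and `misses` while varying `capacity` is one rough way to picture the trade-off in the last bullet: a bigger local cache means fewer trips to the center, at the price of more disk at the edge.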

It took 3 years for the enterprise to go from pilot to the start of full deployment. Stored data grew from 36 TB at the end of year 1 to 750 TB at the end of year 6.

Cleversafe may be the leader in implementing advanced erasure codes in storage software. RAID 5 & 6 are both forms of erasure codes, but the math has been refined in the last 20 years. Much higher levels of data availability with lower overhead are now possible.

As disk capacities climb and disk error rates remain constant, the expected annual data loss rises. By 2020 you can expect that a 1,000 disk storage farm will lose over 200 GB of data annually – even with mirrored RAID 6. (RAID 16? The mind boggles).

Advanced erasure codes combined with physically dispersed storage make all that go away. Cleversafe estimates that a dispersed storage infrastructure requiring 10 of 16 nodes to reconstruct the data is 1,000,000 times more reliable than RAID 16 – reducing expected data loss from 200 GB to 200 KB.
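
To see why a 10-of-16 scheme can be so much more robust than simple redundancy, here is a back-of-the-envelope sketch in Python. It is not Cleversafe's model and will not reproduce their numbers: it assumes nodes fail independently with a made-up probability `p_fail`, and the function names are mine.

```python
from math import comb


def loss_probability(n, k, p_fail):
    """Chance that fewer than k of n nodes survive.

    Assumes independent node failures, each with probability p_fail
    (a hypothetical parameter, not a vendor figure).
    """
    max_tolerated = n - k  # failures the scheme can absorb
    # Data is lost only when more than n - k nodes fail at once.
    return sum(
        comb(n, f) * p_fail**f * (1 - p_fail) ** (n - f)
        for f in range(max_tolerated + 1, n + 1)
    )


def storage_overhead(n, k):
    """Raw bytes stored per byte of user data for a k-of-n scheme."""
    return n / k
```

Pick any small `p_fail` and compare `loss_probability(16, 10, p_fail)` with `loss_probability(2, 1, p_fail)`, which models a plain two-way mirror. The 10-of-16 layout survives six simultaneous node losses while storing 1.6 bytes per byte of data; the mirror survives one loss and stores 2.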

If Bycast has proven private cloud software and Cleversafe has disaster-proof storage, then we’re done, right? Except for the freakin’ network latency that makes “cloud” storage synonymous with “slow” storage. That’s where Asankya comes in.

Their basic insight is this: TCP/IP was built when a 200 nanosecond CPU and a couple of meg of RAM was a Hot Box. What if we were to change the protocol to take advantage of modern resources – could we do better? Well, duh!

They’ve developed the RAPID protocol and an overlay network called RAPIDnet that they claim dramatically improves network performance. How?

  • Multipathing. Instead of tying a session to a single network path, RAPID decides on a per-packet basis the fastest route for that packet.
  • Maximum bandwidth utilization. Multiple paths means more available bandwidth – and RAPID loads each path as full as it can.
  • Network deduplication. Originating nodes keep track of all packets that pass through, so when a duplicate packet shows up they don’t resend it.
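
The ideas in the list above can be sketched in a few lines of Python. To be clear, this is not the RAPID protocol, whose internals weren't presented in this kind of detail. The path-scoring formula, the SHA-256 fingerprint and every name below are my assumptions, chosen only to illustrate per-packet path choice and duplicate suppression.

```python
import hashlib


def pick_path(paths, packet_size):
    """Choose a path for one packet by lowest estimated delivery time.

    paths maps a name to (latency_seconds, bandwidth_bytes_per_s, queued_bytes).
    The estimate (latency plus time to drain the queue) is an assumed model.
    """
    def eta(item):
        _name, (latency, bandwidth, queued) = item
        return latency + (queued + packet_size) / bandwidth

    return min(paths.items(), key=eta)[0]


class DedupSender:
    """Remembers payloads already sent and replaces repeats with a short reference."""

    def __init__(self):
        self.seen = set()

    def encode(self, payload):
        digest = hashlib.sha256(payload).digest()
        if digest in self.seen:
            return ("ref", digest)    # 32 bytes instead of the full payload
        self.seen.add(digest)
        return ("data", payload)


class DedupReceiver:
    """Stores payloads by fingerprint so references can be expanded."""

    def __init__(self):
        self.store = {}

    def decode(self, message):
        kind, body = message
        if kind == "data":
            self.store[hashlib.sha256(body).digest()] = body
            return body
        return self.store[body]       # look up the payload a "ref" points to
```

Because `pick_path` is called per packet, a path that is fast on paper stops winning once its queue fills up, which is one simple way to get both multipathing and fuller use of every link.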

Net net: by increasing bandwidth and reducing delays, Asankya cuts latency, making cloud storage much more feasible for interactive apps. Cool!

Of course, this all has to work in the Real World. Evidently it does, as they have customers. And the technology came out of Georgia Tech.

The StorageMojo take
The latter 3 companies make an important point about cloud storage and computing: we can do much more to make it economical, safe and fast. That’s a Very Good Thing.

Asankya asks whether network intelligence should be in the core or on the edge. Cisco, of course, prefers a smart core, so Asankya is a clear threat to them. The rest of us might disagree.

Courteous comments welcome, of course. I’m doing some work for Bycast, but, alas, not for the other companies. Thanks to Tom Coughlin for assembling a good group for the panel. I’m hoping I can post links to more info on all of them.