Or a reasonable facsimile thereof
If you are interested in disaster recovery, check out Axxana. They solve the limited synchronous-replication distance problem with a black box designed for data. The concept is simple, but getting the details right is hard.
The problem
Synchronous replication requires that apps wait until the remote site completes the write. Given the speed of light, that means that synch sites can’t be very far away. Certainly not the 300 miles the SEC would like to see for financial institutions – we still have a few of those, don’t we?
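For the skeptics, the arithmetic is brutal. Here’s a rough sketch – my numbers, not anyone’s product spec: light in fiber covers roughly 200 km per millisecond, and a synchronous write can’t complete faster than the round trip.

```python
# Back-of-the-envelope: why 300 miles breaks synchronous replication.
# Assumes light in fiber travels at roughly 2/3 of c; real networks add
# switch, array and protocol overhead on top of this floor.

C_FIBER_KM_PER_MS = 200.0   # ~2/3 of 299,792 km/s, expressed per millisecond
MILES_TO_KM = 1.609

def min_write_latency_ms(distance_miles: float) -> float:
    """Propagation delay alone for one synchronous write (round trip)."""
    round_trip_km = 2 * distance_miles * MILES_TO_KM
    return round_trip_km / C_FIBER_KM_PER_MS

print(min_write_latency_ms(300))   # ~4.8 ms added to every single write
```

Nearly 5 milliseconds of dead time per write, before the remote array even touches the data – which is why the SEC’s 300 miles and synchronous replication don’t mix.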
Axxana’s answer
No matter what happens in a plane crash, investigators always seem to be able to recover the “black box” that tells them what the plane was doing shortly before the crash. Axxana has developed a black box for data centers.
Here’s how they describe it:
The Phoenix Black Box is located near the storage system at the primary data center and records a synchronous data stream from the storage. At the same time, an asynchronous data replication system is moving data to a secondary data center (the remote recovery site). The Phoenix Black Box has to protect only the Gigabytes of data that would have been lost in a typical asynchronous replication scenario. Data is protected inside the Black Box during the course of the disaster and can be immediately extracted.
Data extraction is achieved either by:
- Physically locating the system by tracking the homing signal and connecting a laptop with an Axxana software component to the Phoenix System™ at the disaster site, or
- The self-sufficient and well-protected system transferring the data to the secondary site using highly resilient cellular broadband technology.
Your data phones home after a disaster.
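Here’s a back-of-the-napkin model of how I read the architecture. All the names and details below are my own illustration, not Axxana’s: every write lands synchronously in a local, disaster-hardened log while the remote copy lags behind asynchronously, so recovery is the remote copy plus a replay of the few records the async stream never delivered.

```python
# A minimal sketch of the black-box idea (class and method names are mine).

class BlackBoxLog:
    def __init__(self):
        self.records = []            # (sequence_number, data) pairs

    def append(self, seq, data):
        self.records.append((seq, data))   # synchronous and local: fast

    def tail_after(self, high_water_mark):
        """The writes that would have been lost to async lag."""
        return [(s, d) for s, d in self.records if s > high_water_mark]

def recover(remote_copy, remote_high_water_mark, black_box):
    # Replay only the un-replicated tail – gigabytes, not the whole dataset.
    for seq, data in black_box.tail_after(remote_high_water_mark):
        remote_copy[seq] = data
    return remote_copy

# Example: the remote site saw writes through #97; the box holds 98-100.
box = BlackBoxLog()
for seq in (98, 99, 100):
    box.append(seq, f"write {seq}")
recovered = recover({}, 97, box)   # replays just the three missing writes
```

The point of the hardened box is that tail_after() still answers after the building doesn’t.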
Compelling economics
It will take a while to suss out all the implications, but here’s one simple scenario: a company with 3 data centers around the world could in-source its DR strategy with the equivalent of synchronous data recovery. How much would that save?
Distribution
They are working with as many of the major vendors as they can to get the product to you through people you already deal with. Expect to see some announcements.
The StorageMojo take
They are in contention for StorageMojo’s “coolest new product at SNW” award. It looks like they can handle anything up to an A-bomb blast – and if that happens, even synchronous data replication may not work. Besides, a dirty bomb is much more likely. Happy thoughts, eh?
Comments welcome, of course. Guys, sorry if I jumped the gun. But when I saw the web site was up . . . .
Robin – Nice piece on Axxana. Rumor has it that Andy Grove came out of retirement to solve that speed-of-light problem, but even he couldn’t crack it! I’d add to your scenarios a midrange shop (say, a mid-sized bank) that can’t afford to go high-end synchronous, so they settle for asynch and risk some exposure. This lets them turn asynch into near-zero data loss for a fraction of the cost of synch… I’m sure there are others. Cheers – Dave from Wikibon.
Robin, thanks for this; it certainly looks very cool, it’d fix some major headaches with having a big expanse of sea between our primary data-centres.
This sort of staging has been possible before, but it’s been very expensive. Generally you needed a full copy outside of the immediate “blast radius” (charming term) to which you synchronised data, and then replicated asynchronously from there. It could be done with EMC, Veritas Volume Replicator and no doubt with others too. It was costly because it required full copies of data, environmental support and so on.
Using what amounts to an I/O logging system to which you write synchronously vastly reduces the cost, but that log still has to be placed somewhere, with appropriate network communications, such that it and the onward asynchronous copying are not affected by the disaster. A wide enough problem could affect all terrestrial communications, so I guess that satellite might be the ultimate fallback.
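To make the cost contrast concrete, here’s a toy sketch (the function names are mine, purely illustrative): in the old cascade every synchronous write lands on a full bunker copy sized like the primary, while with an I/O log the synchronous target holds only the un-replicated tail, which can be trimmed as the async stream catches up.

```python
# Old cascade: app -> full synchronous copy at a bunker site -> async onward.
def cascade_write(primary, bunker_full_copy, key, data):
    primary[key] = data
    bunker_full_copy[key] = data      # bunker must hold ALL the data

# Log approach: app -> small synchronous I/O log -> async direct to remote.
def logged_write(primary, io_log, seq, key, data):
    primary[key] = data
    io_log.append((seq, key, data))   # log holds only writes the async
                                      # stream hasn't yet acknowledged
```

The bunker needs capacity (plus power, cooling and floor space) for the whole dataset; the log needs it for minutes of writes.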
In the case of the vast majority of our systems we elect for asynchronous replication on cost and performance grounds, and work on the basis that the resulting inconsistencies can be sorted out by manual and exception processes (and this is truly for disaster scenarios – not an HA approach). However, that’s not an option with many safety or regulatory related systems. For instance, losing a few bank transactions isn’t an option, nor is losing critical health care data.
If this sort of approach works, and it gets rid of the logical issues of recovering from inconsistencies caused by asynchronous replication, then it could be very valuable indeed. But DR is an incredibly difficult thing to get right.
Editors at Network World think Axxana is worth keeping a close eye on, too. We’ve named Axxana one of our 10 start-ups to watch for 2009 (http://www.networkworld.com/supp/2009/outlook/010509-startups-to-watch.html) and feature the company’s black box in a piece on cool flash-based storage products for the enterprise (see http://www.networkworld.com/supp/2009/ndc1/012609-flash-storage.html).