More is coming on SSDs RSN, but in the meantime there is the following piece from Virsto’s Eric Burgener on HA considerations for SSDs. Virsto is a software company focused on making VIRtual STOrage for VMware and Hyper-V much more functional than the physical kind.

Thus Eric’s response has a very particular POV: what is needed to use SSDs in a virtual environment to ensure high availability – from a company that DOESN’T sell HA hardware. One key point: virtual server environments are much more write-intensive than most enterprise apps, so using SSDs as a cache is a losing strategy.

If that is intriguing, read on!

In Thinking About SSD, You Can’t Leave HA Considerations Out

In the “host- vs array-based SSD” discussion as it pertains to enterprise accounts, the need for HA must play a critical role. This is true whether you’re working with physical or virtual environments. Any committed data that is not sitting on a shared, non-volatile, external storage device accessible by at least one other node cannot be recovered until the failed node on which it resides locally is brought back up.

There are technical ways to solve this using synchronous replication, but that’s an extra-credit project you do yourself for now – so far it hasn’t been built into any host-based SSD products. The reality today is that using host-based SSD precludes the use of HA (but not necessarily things like vMotion, which is NOT HA).

This was touched on in some other posts, but I think it’s an increasingly critical issue in virtual computing environments that may have been a bit downplayed in other comments. If you’re either thinking about moving production server workloads to VMs or have already got them there, HA is critical for a high percentage of workloads.

I can’t imagine an enterprise customer spec’ing out a production virtual server environment without asking about HA. True, there are workloads that don’t require it, but most do.

And it’s not just virtual server environments. We’re running into an increasing number of VDI environments where customers want to enable HA for at least a small percentage of the desktops – usually executive desktops. HA isn’t a deal breaker for VDI the way it is for “VSI” (virtual server infrastructure), but there are clearly use cases in VDI where you want and/or need it.

Today SSD is pretty much only used as a cache, regardless of where it’s deployed. And to provide a given level of performance speedup, caches have generally been sized at somewhere around 2%-4% of the primary data store (it varies by application and exactly what you’re trying to speed up).

In virtual environments, write performance is much more critical because writes make up a much higher percentage of the read/write workload – in VSI environments it’s not uncommon to see 50% reads/50% writes, and in VDI environments we’ve seen workloads that are 70% writes. Unless you’re using a write-back cache (with all the attendant expense), you’re not going to get any write speedup from conventional cache architectures, just read speedup.
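To make that distinction concrete, here’s a minimal sketch of why a write-through cache accelerates reads but not writes. The class, latency figures, and dict-as-disk backing store are illustrative assumptions, not any vendor’s design:

```python
import time

class WriteThroughCache:
    """Toy write-through cache: reads can be served from flash,
    but every write is acknowledged only after it reaches the
    slow backing store."""

    def __init__(self, backing_store, fast=0.0001, slow=0.005):
        self.cache = {}               # block -> data held in flash
        self.backing = backing_store  # dict standing in for spinning disk
        self.fast = fast              # simulated SSD latency (seconds)
        self.slow = slow              # simulated HDD latency (seconds)

    def read(self, block):
        if block in self.cache:       # cache hit: flash speed
            time.sleep(self.fast)
            return self.cache[block]
        time.sleep(self.slow)         # miss: pay disk latency, then populate
        data = self.backing.get(block)
        self.cache[block] = data
        return data

    def write(self, block, data):
        self.cache[block] = data
        time.sleep(self.slow)         # write-through: no write speedup at all
        self.backing[block] = data

store = {}
c = WriteThroughCache(store)
c.write(1, "hello")   # pays the full backing-store latency
c.read(1)             # fast: served from flash
```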

But now think about what a log architecture, applied at the storage layer, could add. Circular logs that are continuously draining (asynchronously) as they are filling need very little storage capacity to speed up ALL writes for ALL VMs ALL the time. In our experience, you need a log of about 10GB for each heavily loaded physical host.
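Here’s a toy sketch of the circular-log idea, under the assumption that writes are acknowledged as soon as they land in a small, fast log while a background thread drains them to the primary store; the names and sizes are illustrative, not Virsto’s implementation:

```python
import threading, queue, time

class CircularWriteLog:
    """Toy circular log: writes are acknowledged once they land in the
    small, fast log; a background drainer continuously flushes them to
    the primary store, so the log itself can stay tiny."""

    def __init__(self, primary_store, capacity=1024):
        self.primary = primary_store
        self.log = queue.Queue(maxsize=capacity)  # bounded = "circular"
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, block, data):
        # Returns at log (flash) speed; blocks only if the drainer
        # can't keep up and the log fills.
        self.log.put((block, data))

    def _drain(self):
        while True:
            block, data = self.log.get()  # drain asynchronously...
            time.sleep(0.005)             # ...at slow primary-store speed
            self.primary[block] = data
            self.log.task_done()

primary = {}
wlog = CircularWriteLog(primary)
for i in range(50):
    wlog.write(i, f"data-{i}")  # all 50 writes ack at log speed
wlog.log.join()                 # wait for the asynchronous drain
```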

Think about what that could mean for a 16-host environment with 20TB of primary storage. You could get away with two to four 200GB enterprise flash drives instead of the 10-12 you might otherwise deploy in a 20TB environment. If you have a “linked clone” type snapshot technology combined with storage tiering, you could take the extra SSD capacity and create a tier 0 for critical VMs that need very high read performance, such as the golden masters in a VDI environment or the common templates you use to create your server VMs.
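The back-of-the-envelope math, using the 2%-4% cache sizing rule and the roughly 10GB-per-host log figure from above (the drive counts are the article’s; the rest is simple arithmetic):

```python
# Sizing the 16-host, 20TB example above.
primary_gb = 20 * 1000          # 20TB primary data store
hosts = 16
drive_gb = 200                  # enterprise flash drive capacity

# Conventional cache: 2%-4% of the primary store.
cache_low, cache_high = primary_gb * 0.02, primary_gb * 0.04

# Log architecture: ~10GB per heavily loaded host.
log_gb = hosts * 10

print(f"cache sizing: {cache_low:.0f}-{cache_high:.0f} GB")  # 400-800 GB
print(f"log sizing:   {log_gb} GB")                          # 160 GB
# 160GB of logs fits easily on 2-4 x 200GB drives, leaving several
# hundred GB of spare flash for a read-optimized tier 0.
```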

This covers both needs, read and write performance, using a lot less storage. That means pretty much the same performance as the more expensive configuration with more SSDs, for a lot less money. If you want to use SSD efficiently, a log architecture is a great idea. And if the logs are placed in shared, non-volatile, external storage (like SSDs hosted in a SAN array or a SAN-based SSD appliance), you can fully support HA.

Host-based SSD cards sit closer to the physical host, so theoretically they’ll provide more of a performance speedup, but given Amdahl’s law, how much of that can you really use? Array-based SSD will still get you past storage latency as your critical bottleneck, and if it’s implemented using a log-based architecture you’ll get HA and large write speedups as well, using a lot less flash.
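A quick Amdahl’s-law check on that question; the 60% storage-time fraction and the speedup factors are illustrative assumptions, not measurements:

```python
# Amdahl's law: overall speedup = 1 / ((1 - p) + p / s), where p is the
# fraction of time spent on the part you accelerate (storage I/O) and
# s is how much faster you make that part.
def amdahl(p, s):
    return 1.0 / ((1.0 - p) + p / s)

p = 0.60  # assume storage latency is 60% of request time (illustrative)

print(f"array SSD, s=10: {amdahl(p, 10):.2f}x overall")  # ~2.17x
print(f"host SSD,  s=50: {amdahl(p, 50):.2f}x overall")  # ~2.43x
print(f"limit,  s=inf:   {1 / (1 - p):.2f}x overall")    # 2.50x
# Once array-based SSD has removed most storage latency, the extra
# raw speed of host-based flash buys surprisingly little.
```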

The StorageMojo take
Write-through caches avoid a lot of sticky update synchronization problems, but as Eric notes they aren’t the best choice in write-intensive environments. And HA adds to the requirements: the cache must be network accessible.

But his larger point bears repeating: SSDs are wonderful for handling metadata. And as we move to object storage and ever more metadata, that capability will become even more valuable.

Courteous comments welcome, of course. I think Virsto’s architecture is smart and fixes some real problems with VMware and vMotion.