More is coming on SSDs RSN, but in the meantime there is the following piece from Virsto’s Eric Burgener on HA considerations for SSDs. Virsto is a software company focused on making VIRtual STOrage for VMware and Hyper-V much more functional than the physical kind.

Thus Eric’s response has a very particular POV: what is needed to use SSDs in a virtual environment to ensure high availability – from a company that DOESN’T sell HA hardware. One key point: virtual server environments are much more write intensive than most enterprise apps, so using SSDs as a cache is a losing strategy.

If that is intriguing, read on!

In Thinking About SSD, You Can’t Leave HA Considerations Out

In the “host vs array-based SSD” discussion as it pertains to enterprise accounts, the need for HA must play a critical role. This is true whether you’re working with physical or virtual environments. If a node fails, any committed data that is not sitting on a shared, non-volatile, external storage device and accessible by at least one other node cannot be recovered until the failed node (on which it resides locally) is brought back up.

There are technical ways to solve this using synchronous replication technologies, but that’s an extra credit project you do yourself for now – as of yet, that hasn’t been built into any host-based SSD products. The reality today is that using host-based SSD precludes the use of HA (but not necessarily things like vMotion, which is NOT HA).

This was touched on in some other posts, but I think it’s an increasingly critical issue in virtual computing environments that may have been a bit downplayed in other comments. If you’re either thinking about moving production server workloads to VMs or have already got them there, HA is critical for a high percentage of workloads.

I can’t imagine an enterprise customer spec’ing out a production virtual server environment without asking about HA. True, there are workloads that don’t require it, but most do.

And it’s not just virtual server environments. We’re running into an increasing number of VDI environments where they want to enable HA for at least a small percentage of the desktops – usually executive desktops. HA isn’t a deal breaker for VDI like it is for “VSI” (virtual server infrastructure), but there are clearly use cases in VDI where you want and/or need it.

Today SSD is pretty much only used as a cache, regardless of where it’s deployed. And to provide a given level of performance speedup, caches generally have been sized at somewhere around 2% – 4% of the primary data store (it varies by application and exactly what you’re trying to speed up).

In virtual environments, write performance is much more critical because writes tend to comprise a much higher percentage of the read/write workload – in VSI environments it’s not uncommon to see 50% reads/50% writes, and in VDI environments we’ve seen 70% write environments. Unless you’re using a write-back cache (with all the attendant additional expense associated with that), you’re not going to get any write performance speedup from the conventional cache architectures, just read.
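To put rough numbers on that point, here is a back-of-the-envelope sketch of average I/O latency when only reads can hit the cache. The function name, the hit rate, and the latency figures are all made up for illustration; only the 50/50 and 30/70 read/write mixes come from the text above.

```python
def average_latency_ms(read_fraction, read_hit_rate, cache_ms, disk_ms):
    """Average latency when the cache serves reads only.

    Every write pays the full disk latency; a read pays the cache latency
    on a hit and the disk latency on a miss.
    """
    write_fraction = 1.0 - read_fraction
    read_ms = read_hit_rate * cache_ms + (1.0 - read_hit_rate) * disk_ms
    return read_fraction * read_ms + write_fraction * disk_ms


# Hypothetical figures: 90% read hit rate, 0.2ms flash, 10ms disk.
vsi_ms = average_latency_ms(0.5, 0.9, 0.2, 10.0)  # 50% reads / 50% writes
vdi_ms = average_latency_ms(0.3, 0.9, 0.2, 10.0)  # 30% reads / 70% writes
print(f"VSI mix: {vsi_ms:.2f}ms, VDI mix: {vdi_ms:.2f}ms, no cache: 10.00ms")
```

Under those assumed numbers, the heavier the write share, the closer the average sits to raw disk latency, no matter how good the read hit rate is.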

But now think about what a log architecture, applied at the storage layer, could add. Circular logs that are continuously draining (asynchronously) as they are filling need very little storage capacity to speed up ALL writes for ALL VMs ALL the time. In our experience, you need a log of about 10GB in size for each heavily loaded physical host.
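As a toy illustration of the idea (not Virsto’s implementation, whose internals are not described here), a bounded log that acknowledges a write as soon as it is appended and drains the oldest entries to the backing store later might look like the sketch below. The class name, method names, and the dict standing in for the backing store are all hypothetical.

```python
from collections import deque


class CircularWriteLog:
    """Toy bounded log: writes are acknowledged on append and drained later."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = deque()  # (block_address, payload), oldest first

    def append(self, block_address, payload):
        """Return True if the write fit in the log (the fast path)."""
        if self.used + len(payload) > self.capacity:
            return False  # log is full: the caller has to wait for a drain
        self.entries.append((block_address, payload))
        self.used += len(payload)
        return True

    def drain(self, backing_store, max_bytes):
        """Move the oldest entries to the backing store, freeing log space."""
        drained = 0
        while self.entries and drained + len(self.entries[0][1]) <= max_bytes:
            address, payload = self.entries.popleft()
            backing_store[address] = payload
            self.used -= len(payload)
            drained += len(payload)
        return drained


log = CircularWriteLog(capacity_bytes=8)
disk = {}                           # stand-in for the slower primary store
log.append(0, b"aaaa")
log.append(1, b"bbbb")
was_full = not log.append(2, b"cccc")   # no room until something drains
log.drain(disk, max_bytes=4)            # in practice this runs asynchronously
fits_now = log.append(2, b"cccc")
```

The point of the sketch is that the log only has to hold the writes that have not drained yet, which is why its capacity can be small relative to the primary store.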

Think about what that could mean for a 16 host environment with 20TB. You could get away with 2-4 200GB enterprise flash drives instead of the 10-12 that you might otherwise deploy in a 20TB environment. If you have a “linked clone” type snapshot technology combined with storage tiering, you could take the extra SSD capacity and create a tier 0 for critical VMs that need very high read performance, like for example the golden masters in a VDI environment or common templates you use to create your server VMs.
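The arithmetic behind that example can be checked with a few lines. This sketch uses only the figures quoted above (16 hosts, 10GB of log per host, 200GB drives, 20TB of primary storage); it assumes decimal units and ignores any mirroring or RAID overhead on the flash drives, which a real deployment would have to account for.

```python
HOSTS = 16
LOG_GB_PER_HOST = 10
DRIVE_GB = 200
PRIMARY_STORE_GB = 20 * 1000  # 20TB, decimal units assumed

# Total log capacity needed across all hosts.
log_total_gb = HOSTS * LOG_GB_PER_HOST


def raw_flash_gb(drive_count):
    """Raw flash capacity for a given number of drives (no redundancy)."""
    return drive_count * DRIVE_GB


def spare_after_logs_gb(drive_count):
    """Flash left over for a tier 0 once every host has its log."""
    return raw_flash_gb(drive_count) - log_total_gb


print(f"logs for {HOSTS} hosts: {log_total_gb}GB")
for drives in (2, 4):
    print(f"{drives} drives: {raw_flash_gb(drives)}GB raw, "
          f"{spare_after_logs_gb(drives)}GB left for tier 0")
for drives in (10, 12):
    share = raw_flash_gb(drives) / PRIMARY_STORE_GB
    print(f"{drives} drives: {raw_flash_gb(drives)}GB raw, "
          f"{share:.0%} of the primary store")
```

On these numbers the logs for all 16 hosts come to 160GB, which is less than a single 200GB drive, so most of a 2-4 drive configuration is available for the tier 0 use described above.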

This covers both needs – read and write performance – using a lot less storage. That means pretty much the same performance you’d get with the more expensive configuration with more SSDs for a lot less money. If you want to use SSD efficiently, a log architecture is a great idea. And if the logs are placed in shared, non-volatile, external storage (like the SSDs hosted in a SAN array or SAN-based SSD appliance), you can fully support HA.

Host-based SSD cards are closer to the physical host so theoretically they’ll provide more performance speedup, but given Amdahl’s law, how much of that can you really use? Array-based SSD will still get you past storage latencies as your critical bottleneck, and if it’s implemented using a log-based architecture you’ll get HA and large write performance speedups as well, using a lot less of it.
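For readers who want to see the Amdahl’s law argument in numbers: overall speedup is 1 / ((1 - p) + p / s), where p is the fraction of time spent waiting on storage and s is how much faster storage becomes. The values of p and s below are purely hypothetical, chosen only to show the shape of the curve, and are not measurements of any product.

```python
def amdahl_speedup(storage_fraction, storage_speedup):
    """Overall speedup when only the storage share of the work gets faster."""
    remaining = 1.0 - storage_fraction
    return 1.0 / (remaining + storage_fraction / storage_speedup)


# Hypothetical: 60% of request time is storage wait.
array_based = amdahl_speedup(0.6, 10.0)  # storage made 10x faster
host_based = amdahl_speedup(0.6, 20.0)   # storage made 20x faster
print(f"10x storage: {array_based:.2f}x overall")
print(f"20x storage: {host_based:.2f}x overall")
```

With those assumed inputs, doubling the storage speedup from 10x to 20x moves the overall result from roughly 2.17x to roughly 2.33x, a gain of about 7%, because the non-storage 40% of the work now dominates.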

The StorageMojo take
Write-through caches avoid a lot of sticky update synchronization problems, but as Eric notes they aren’t the best choice in write-intensive environments. And HA adds to the requirements: the cache must be network accessible.

But his larger point bears repeating: SSDs are wonderful for handling metadata. And as we move to object storage and more metadata that capability will become even more valuable.

Courteous comments welcome, of course. I think Virsto’s architecture is smart and fixes some real problems with VMware and vMotion.

3 Comments

nate on Friday, 13 April, 2012 at 2:34 pm

I was surprised, no shocked, when we moved to an entirely VMware environment from a public cloud provider. The public cloud gave us no real useful metrics for things like I/O. So I was shocked really when I saw our front end I/O ratio was 90% write (92% write avg over past 24h!). Back end after cache is about 59% write (past 24h). Really was not expecting that at all. So much stuff is kept in memory on the servers that reads are really low.

I guess you could say our environment is VSI (virtual servers – had not heard of that term VSI before). No VDI here.

The applications running in the environment are basically LAMP stacks running e-commerce. Combination of production and non production workloads. + all sorts of support/operations type applications monitoring etc. When we first moved over we found that our main monitoring tool was consuming about 10x the I/O of our production databases, so we had to modify its configuration to not eat so many valuable IOPS.

nate on Friday, 13 April, 2012 at 2:40 pm

This particular workload and company is small – so there’s not a lot of aggregate load – fortunately the array handles it well – front end service times average under two milliseconds (maybe 1.7) for writes – 2-5ms for reads.

Two storage controllers each with 6GB of data cache. Spikes to as high as 6-8 or even 10ms when big batch jobs kick off in the middle of the night. Average back end spindle response time for both reads and writes is right around 15ms, with spikes to around 20-30ms during those batch jobs. Most volumes are RAID 50 3+1.

Robin Harris on Friday, 13 April, 2012 at 2:56 pm

Nate,

Thanks for the stats on your implementation. Good to get real world validation.

Cheers,

Robin