The cost of flash versus even 15k FC drives has made it common practice to compare “usable gigabytes” to disk array capacity. Pure Storage and, lately, HP have invoked the idea that, with proper techniques, usable flash capacity can be competitive with the per-gigabyte cost of disk arrays.
I wrote about this on ZDNet, and Martin Glassborow of StorageBod.com tweeted that he’d taken a more pessimistic view in a post that concluded:
The problem is that many of us don’t have time to carry out proper engineering tests; so I find it best to be as pessimistic as possible…I’d rather be pleasantly surprised than have an horrible shock.
I’m with Martin. Pleasant surprises beat horrible ones every day.
Contingency
While I’d noted the contingent nature of these techniques – compression only works if your data is compressible, for instance – I’d concluded that, in the main, what flash vendors are asserting is legitimate. Yes, there’s a risk – a black swan – that your data’s entropy rises to 100%, making it incompressible; that every thin-provisioned app needs its full allotment; that massive updates force snapshots to copy everything while writing and, incidentally, render deduplication useless.
In other words, welcome to storage hell.
How much pessimism can you afford?
Yet that view at bottom says we should continue to overconfigure storage to cover any eventuality. Which is nice if you can afford it.
Enterprise disk arrays typically use only 30-40 percent of their expensive capacity. Disk systems could use some of these techniques – one wonders why they haven’t – and that failure has left a clear field for flash vendors.
The point flash vendors are making is that, given what flash costs AND its enormous benefits, we should relax our paranoia a couple of notches, use these techniques, and achieve cost parity with traditional, expensively overconfigured, high-end arrays. Think eventual consistency, which enables much of the goodness of the cloud while introducing some new problems to manage.
The end of theory and the beginning of wisdom
The techniques flash vendors employ include some old standbys as well as more modern technologies.
- Compression.
- De-duplication.
- Advanced erasure codes.
- Thin provisioning.
- Snapshots.
All make assumptions about data and/or usage that may not always apply. For example, LZW assumes that data is compressible – roughly 50 percent entropy – but feed it already-compressed data and it’s stuck, and your nominally “available” capacity suddenly drops. People have seen this problem for years with tape, whose capacity is routinely quoted assuming compression, and tape survives.
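To make that concrete, here is a minimal sketch (Python and zlib, with invented data) of how much the input matters: repetitive, text-like data shrinks dramatically, while random bytes, standing in for already-compressed data, barely shrink at all, so the capacity a controller reports after compression depends entirely on what you feed it.

```python
# Minimal sketch: compression ratios depend entirely on the data.
# The data sets below are invented purely for illustration.
import os
import zlib

text_like = b"timestamp=2014-05-01 level=INFO msg=request served in 12ms\n" * 20000
random_bytes = os.urandom(1024 * 1024)   # stands in for already-compressed data

for name, data in (("text-like", text_like), ("pre-compressed", random_bytes)):
    out = zlib.compress(data)
    print(f"{name}: {len(data):>8} -> {len(out):>8} bytes "
          f"({len(out) / len(data):.1%} of original)")
```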
De-duplication keeps one copy of your data, plus a list of pointers and changes. If that list is corrupted, so is your data, maybe lots of data. So those data structures need to be bulletproof. Yet this too seems manageable.
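As a rough illustration of what that list of pointers looks like (a toy sketch in Python with a hypothetical 4 KB block size, not any vendor’s implementation): block-level dedupe boils down to content addressing. Hash each block, store new blocks once, record a pointer for repeats; the savings and the fragility both live in the pointer table.

```python
# Toy content-addressed dedupe. Block size and data are hypothetical.
import hashlib

BLOCK = 4096

def dedupe(data: bytes):
    store = {}      # fingerprint -> the single stored copy of a block
    pointers = []   # logical layout: one fingerprint per logical block
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)   # keep only the first copy seen
        pointers.append(fp)           # corrupt this list and the data is gone
    return store, pointers

data = b"A" * BLOCK * 50 + b"B" * BLOCK * 50   # highly duplicated input
store, pointers = dedupe(data)
print(f"logical blocks: {len(pointers)}, unique blocks stored: {len(store)}")
```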
Thin provisioning assumes that all apps aren’t going to want all their provisioned capacity all at once. A pretty safe bet, but a bet nonetheless.
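In bookkeeping terms, thin provisioning hands out logical LUNs that together exceed the physical pool and only backs the blocks that actually get written; the failure mode is the moment writes catch up with physical capacity. A hypothetical sketch, not any array’s actual behavior:

```python
# Hypothetical thin-provisioning bookkeeping; numbers are invented.
class ThinPool:
    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.provisioned_gb = 0   # total logical capacity promised to hosts
        self.written_gb = 0       # capacity actually consumed by writes

    def provision(self, lun_gb):
        self.provisioned_gb += lun_gb   # oversubscription is allowed -- that's the point

    def write(self, gb):
        if self.written_gb + gb > self.physical_gb:
            raise RuntimeError("pool exhausted: every thin LUN just stopped accepting writes")
        self.written_gb += gb

pool = ThinPool(physical_gb=10_000)
for _ in range(5):
    pool.provision(4_000)               # 20 TB promised against 10 TB physical
print(f"oversubscription: {pool.provisioned_gb / pool.physical_gb:.1f}x")
```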
The StorageMojo questions for readers are:
Have any of these techniques bitten you?
What happened?
What did you do?
Under what circumstances would you assume you have only raw capacity available?
Please be as specific as time and memory allow.
The StorageMojo take
The goal here is to replace the “there be dragons” fear of the unknown with some guideposts. Likelihood, warning signs, preventive action.
Storage people are innately conservative – it’s what we do – but if we can triple the effective capacity of our data processing by using flash at the cost of losing 0.00001% of uptime, don’t we have an obligation to accept the risk?
Or should we ignore vendor positioning and insist on enough raw capacity to handle every contingency and damn the cost?
Courteous comments welcome, of course. What say you?
Disclaimer: I’m a NetApp employee; all comments are my own.
In the grand scheme of things, why would these technologies shift the balance in favor of flash over disk? Broadly speaking, compression and deduplication allow reads and writes from disk to require fewer operations, providing opportunities to increase performance and density as well. Thin-provisioning and snapshots are likewise great data-management features, and seem to benefit disk users as much as flash users.
The large erase-block size of flash memory and the comparatively high latency of the flash read-erase-write cycle are probably easy targets for optimization through the listed technologies. When it comes to spinning rust, the average time saved by requiring half the disk throughput through compression and dedupe is probably a larger number in absolute terms than the time reduction for flash, even though it might be a higher percentage for flash. Depending upon your particular needs, these technologies might tilt the value proposition further away from flash rather than towards it.
Ahem. Zero entropy means complete predictability. If your data’s entropy drops to zero, you don’t need any space to store it, because you can reconstruct it from nothing. 🙂
— Jerry
Since I have never implemented thin provisioning this may be a bit of a prejudice, but I tend to consider it a risky feature… what happens if applications end up claiming their entire LUN space? Not a good deal…
I think the first question that needs to be asked is how badly do I need flash level performance.
Flash and AFAs are all the rage right now. Yet I would contend that there are very few environments that truly need 450,000 IOPS.
Therefore, if spinning disk (or spinning disk plus a small flash tier or flash cache) will get you there, then I see no need to accept any risk. May as well go with what is proven.
If you do truly need an AFA, then I think one has almost no choice but to accept the risk. One needs to be smart though and realize that the results associated with compression and deduplication are unpredictable.
For instance, if you have 100 Windows 2012 VMs and you run Windows updates on half of them, what happens to your dedupe hits?
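To put rough numbers on that (the figures below are invented, and pessimistically assume each patched VM’s churn is unique rather than deduping against the other patched VMs), the ratio erodes quickly once clones start diverging:

```python
# Invented figures; pessimistically treats each patched VM's new blocks as unique.
vm_count = 100
os_gb_per_vm = 20        # logical OS image per VM
shared_fraction = 0.9    # portion of the image identical across clones before patching
update_churn_gb = 3      # new or rewritten blocks per patched VM

logical_gb = vm_count * os_gb_per_vm

def physical_gb(patched_vms):
    shared = os_gb_per_vm * shared_fraction                   # one copy of the common blocks
    unique = vm_count * os_gb_per_vm * (1 - shared_fraction)  # per-VM divergence before patching
    return shared + unique + patched_vms * update_churn_gb

for patched in (0, 50):
    print(f"{patched:>3} VMs patched: dedupe ratio {logical_gb / physical_gb(patched):.1f}:1")
```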
As far as the risk associated with deduplication goes, NetApp-style post-process dedupe has been around for quite a while and I’m not aware of any additional risk associated with it.
Inline dedupe is still pretty new, but I think the risk is minimal, if it exists at all.
The bigger question in my own mind is why am I even looking at an AFA?
In an identical configuration, usable capacity should be largely similar in flash and disk systems. However, flash will provide additional benefits if these features are indeed realizable in a given deployment. For example, deeply deduped disk systems can make some performance aspects pretty awful – reads of a deduped file, or even random writes – and snapshots can have a similar impact. Compression, meanwhile, can increase the durability of flash in addition to saving data-transfer bandwidth. Compression savings on small blocks have a far greater impact on overall performance on flash than on drives, because on drives the transfer time for a small block is just noise compared to seek and rotational latency. In short, if a deployment does realize these additional savings, flash is the superior medium for turning them into a net “practical” gain over disk drives.
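Rough arithmetic, using assumed ballpark latency and throughput figures rather than measurements, shows why shrinking a small block barely matters on a drive but is a visible fraction of a flash read:

```python
# Ballpark figures for a 4 KB read; all numbers are assumptions for illustration.
block_kb = 4

# 7.2k RPM drive: ~8 ms average seek, ~4.2 ms average rotational latency, ~150 MB/s media rate
disk_fixed_ms = 8.0 + 4.2
disk_xfer_ms = block_kb / (150 * 1024) * 1000

# Flash: no mechanical positioning, ~0.1 ms read latency, ~500 MB/s transfer
flash_fixed_ms = 0.1
flash_xfer_ms = block_kb / (500 * 1024) * 1000

for name, fixed, xfer in (("disk", disk_fixed_ms, disk_xfer_ms),
                          ("flash", flash_fixed_ms, flash_xfer_ms)):
    total = fixed + xfer
    saved = xfer / 2          # bytes halved by 2:1 compression
    print(f"{name}: {total:6.3f} ms per read, 2:1 compression saves {saved / total:.2%}")
```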
Will be brief, on phone.
By far the most common thing I have seen bite people is thin provisioning. Especially when they do it on the array AND on the VMDK. Unsurprisingly, all the instances have had insufficient monitoring. Nagios (or whatever) is your friend, people.
That shiny new platform is not finished until you have full visibility of what is going on at every level.
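If it helps anyone, a thin-pool check along these lines (hypothetical thresholds, a made-up get_pool_usage() standing in for whatever your array’s API reports, and the standard Nagios plugin exit codes) is only a handful of lines:

```python
#!/usr/bin/env python3
# Hypothetical thin-pool capacity check using standard Nagios exit codes
# (0 = OK, 1 = WARNING, 2 = CRITICAL). get_pool_usage() is a stand-in for
# whatever your array's management API or CLI actually reports.
import sys

WARN, CRIT = 0.70, 0.85   # made-up thresholds

def get_pool_usage():
    # Replace with a call to your array's management interface.
    return {"physical_gb": 10_000, "written_gb": 7_800}

usage = get_pool_usage()
pct = usage["written_gb"] / usage["physical_gb"]
msg = f"thin pool {pct:.0%} full ({usage['written_gb']}/{usage['physical_gb']} GB)"

if pct >= CRIT:
    print(f"CRITICAL - {msg}")
    sys.exit(2)
elif pct >= WARN:
    print(f"WARNING - {msg}")
    sys.exit(1)
print(f"OK - {msg}")
sys.exit(0)
```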
Jerry, thanks for the catch. I’ll update the post. That’s what I get for writing at 5am!
Robin
I believe the cost of SSD memory will drop to a level that is very competitive with standard disks, as SSD sizes grow while standard disk sizes stay the same. But until that time, marketing will do everything possible to present it as a viable alternative. All of the techniques used can reduce the amount of space used, but I look at them as a bonus on two fronts: to undo the overhead created by the storage system (a 4TB raw solution yielding 2TB usable), and to extend the life of the controller (make it last 5 years instead of 3). All of these techniques can be, and are, applied to standard disks too, so there is no inherent value; they are simply part of the storage evolution.
In my experience, as Chris said, thin provisioning seems to be the biggest issue: the other techniques reduce the space your data consumes, while thin provisioning promises more space than physically exists. When it comes to snapshots, deleting many of them to free up space has sometimes caused delays in serving data.
AFAs are not just about high IOPS; they also deliver low latency. For applications like VDI or large databases, that low latency is often very important. We run an AFA for our VDI deployment and are very happy. As the price of SSD drops, I believe it will replace most tier-1 spinning disk. Our array uses dedupe and compression and our overall ratio is 3:1. We use thin provisioning for all of our VMs, and 1.5 years into our deployment all is well.