Actually, StorageMojo is wrong, wrong, wrong, wrong and wrong. Umesh Maheshwari of Nimble Storage wrote a detailed and thoughtful response to the StorageMojo post Are SSD-based arrays a bad idea?
The StorageMojo take
Umesh makes good points, but some seem to miss the mark, perhaps due to Nimble’s hybrid disk/SSD architecture. What’s missing is ample consideration of what the alternative to an SSD might be, a question Nimble didn’t have to face because SSDs work fine, technically and economically, for their hybrid system.
For example, on the issue of reliability Umesh says:
Reliability. There are good reasons to be able to replace failed flash devices similar to how hard disks can be hot swapped. The raw bit error rate (RBER) of flash is actually worse than that of hard disks, and it gets worse as blocks are rewritten. It is also getting worse as manufacturers are moving to increase density. (See this paper from FAST 2012: The Bleak Future of NAND Flash and a related blog post.)
This is correct, but based on the Google/Bianca Schroeder research, the StorageMojo point is that the disk electronics, apart from the head/media pieces, are a major source (40%-50%) of HDD/SSD failures. The flash controller has to handle the RBER and declining flash performance, but why add the other HDD bits that account for a substantial percentage of drive failures?
I could niggle about Umesh’s other points, but what fun is that? StorageMojo readers are encouraged to check out Umesh’s post and make up their own minds.
Courteous comments welcome, of course. I recently did a nifty video white paper for Nimble, which is a great intro to their innovative architecture. Check it out.
The other difference between disk failures and solid state failures is that solid state failures are much more predictable. Because disk failures tend to be component related (as Robin points out), you never know when one is going to occur. Contrast that with bit errors due to wear in solid state. Not only can you monitor how many times cells are written to, but additional cells can be provisioned up front so that as cells wear out, new ones come online. The rate at which you’re consuming new cells can be monitored as well. That’s a LOT more predictable than a disk drive failure.
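To make the predictability point concrete, here is a minimal sketch of the arithmetic involved: given periodic readings of how many spare blocks a device has consumed, a simple average rate gives a rough estimate of when the spare pool runs out. The names `WearSample` and `days_until_spares_exhausted` are hypothetical, the counters are assumed to come from whatever wear reporting the device exposes, and real wear is not guaranteed to be linear, so treat the result as a planning hint rather than a prediction.

```python
from dataclasses import dataclass


@dataclass
class WearSample:
    """One hypothetical reading of a device's wear counters."""
    day: float        # time of the reading, in days
    spares_used: int  # spare blocks consumed so far


def days_until_spares_exhausted(samples, total_spares):
    """Estimate days left, counted from the last sample, until spares run out.

    Assumes (for illustration only) that spares are consumed at the average
    rate observed between the first and last samples.
    Returns None when no consumption trend can be computed.
    """
    if len(samples) < 2:
        return None
    first, last = samples[0], samples[-1]
    elapsed = last.day - first.day
    consumed = last.spares_used - first.spares_used
    if elapsed <= 0 or consumed <= 0:
        return None
    rate = consumed / elapsed  # spare blocks per day
    remaining = max(total_spares - last.spares_used, 0)
    return remaining / rate


# Example: 30 spares consumed over 30 days, out of a pool of 1000.
readings = [WearSample(day=0, spares_used=100), WearSample(day=30, spares_used=130)]
print(days_until_spares_exhausted(readings, total_spares=1000))
```

Nothing comparable exists for a failed motor or a blown component on a disk's circuit board, which is the contrast being drawn here.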