Even though disk storage gets about 5-10% cheaper every quarter, people still hate paying for it. A new CPU goes faster, a new display is brighter and/or bigger, but new storage just sits there until we fill it up.

For that reason, the idea of RAID 5 (see the World’s Shortest RAID Guide) seems to hold a hypnotic attraction for customers everywhere. While I understand that cheaper and almost as good is a win for most of us, RAID 5 is a mixed bag that may not do what you need, even if it does what you want.

Start RAID 5 Definition
A formal engineering definition of RAID would require using words that I think many people would need defined as well, so I’m not going there. Operationally, a RAID 5 controller calculates data recovery information (parity) and spreads your data and that recovery information across several disks, usually 4-10 of them. The big advantage of RAID 5 is that it protects your data while using only the capacity of one disk to do so. So if you have six 400GB disks in a RAID 5 configuration, you have 2000GB of usable data storage capacity (6 * 400GB = 2400GB, less the one 400GB disk of recovery info; a short code sketch of this arithmetic follows the definition).

If you mirrored (RAID 1) those six 400GB disks, you would only have 1200GB of usable capacity. Same disks, same power & space requirements, but 40% less capacity. For what?
End RAID 5 Definition
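
For readers who like to see the arithmetic, here is a minimal Python sketch of the capacity math. The usable_capacity helper is my own illustration, not any vendor’s API, and real arrays also reserve space for spares and metadata.

```python
# Usable capacity for the RAID levels discussed here, ignoring the spare
# drives and metadata overhead a real array would also reserve.

def usable_capacity(level, drives, drive_gb):
    """Usable GB for a group of identical drives at a given RAID level."""
    if level == "RAID 5":                 # one drive's worth of parity
        return (drives - 1) * drive_gb
    if level == "RAID 6":                 # two drives' worth of recovery data
        return (drives - 2) * drive_gb
    if level in ("RAID 1", "RAID 1+0"):   # everything is stored twice
        return drives * drive_gb // 2
    raise ValueError(f"unknown RAID level: {level}")

print(usable_capacity("RAID 5", 6, 400))   # 2000 GB
print(usable_capacity("RAID 1", 6, 400))   # 1200 GB
```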

The technical answer to that last question is complicated, because it depends on what you are doing and how the RAID 5 is engineered. The non-technical (i.e. not for gearheads) answer is that by maintaining two complete copies of your data, RAID 1 (and its sibling RAID 1+0) will often complete individual reads faster, usually complete writes faster, and when a disk fails will protect your data better.

If there is a second disk failure in a RAID 5 disk group, ALL the data is LOST. Gone. Pff-f-f-t. So the natural question has always been: “How likely is a second disk failure?” Take the disk vendor’s MTBF (mean time between failure) data and posit a random distribution of disk failures, and the non-tech answer is: “not very.”

To illustrate, take a modern 400GB SATA drive with an MTBF spec of 400,000 hours. In a six-drive RAID 5 group, like the one above, you would expect a drive failure about once every 67,000 hours (400,000/6). Since there are only 8,760 hours in a non-leap year, that is roughly every 7.6 years. So no worries, eh?
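
The back-of-the-envelope version of that calculation, using only the numbers from the paragraph above and the same random-failure assumption, looks like this:

```python
# Expected interval between drive failures in the six-drive group above,
# assuming failures are independent and random (exactly the assumption
# the next section questions).

MTBF_HOURS = 400_000      # vendor MTBF spec for the 400GB SATA drive
DRIVES = 6
HOURS_PER_YEAR = 8_760    # non-leap year

group_interval = MTBF_HOURS / DRIVES
print(f"{group_interval:,.0f} hours between failures")    # ~66,667 hours
print(f"{group_interval / HOURS_PER_YEAR:.1f} years")      # ~7.6 years
```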

Sorry, yes, there are worries, of two different types:

  • First, what if the drive failures are not random? In my experience they frequently are not. Bad power, poor cooling, heavy duty cycles, and shock and vibration problems all come together to produce unexpected failure clusters. Even with a good environment, there will be clusters of failures simply as a function of statistical variation. So the random failure assumption is not always valid.
  • Second, the problem of read failures. As this note on NetApp’s Dave’s Blog explains, complete disk failures are not the only issue. The other is when the drive is unable to read a chunk of data. The drive is working, but for some reason that chunk on the drive is unreadable (& yes, drives automatically try and try again). It may be an unimportant or even vacant chunk, but then again, it may not be. According to Dave’s calculations, if you have a four-drive RAID 5 group of 400GB drives, there is about a 10% chance that you will lose a chunk of data as the data is recovered onto the replacement drive (a back-of-the-envelope version of that math follows this list). As Dave notes, even a 1% chance seems high.
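
Dave’s post doesn’t spell out the arithmetic, so here is a rough version of how a figure like that typically comes about. The 1-in-10^14 unrecoverable-read-error rate is a common SATA spec-sheet number and is my assumption here, not a figure taken from his post.

```python
# Rough odds of hitting an unrecoverable read error (URE) while rebuilding
# a RAID 5 group: the rebuild must read every bit on every surviving drive.
# The 1-in-10^14 URE rate is a typical SATA spec-sheet figure (an assumption
# here, not a number taken from Dave's post).

URE_RATE = 1e-14                 # probability a given bit is unreadable
DRIVE_BYTES = 400 * 10**9        # 400GB drive
SURVIVING_DRIVES = 3             # four-drive RAID 5 group, one drive dead

bits_to_read = SURVIVING_DRIVES * DRIVE_BYTES * 8
p_clean_rebuild = (1.0 - URE_RATE) ** bits_to_read
print(f"Chance of losing data during rebuild: {1 - p_clean_rebuild:.1%}")  # roughly 9%
```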

Where Dave and I part company is in our response to this problem. Dave suggests insisting on something called RAID 6, which maintains TWO copies of the recovery data. Compared to our RAID 5 example above, this means that instead of having 2000GB of usable capacity, you would have 1600GB. And now RAID 1 would have only 25% less capacity than RAID 6. I say drop RAID 5 and 6 and go to RAID 1+0, which is both faster and more reliable.
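
Running the same six 400GB drives through the arithmetic makes the trade-off concrete (a self-contained snippet, same illustrative numbers as before):

```python
# RAID 6 keeps two drives' worth of recovery data instead of one.
raid6_usable = (6 - 2) * 400        # 1600 GB
raid1_usable = 6 * 400 // 2         # 1200 GB
print(f"RAID 6: {raid6_usable}GB usable, RAID 1: {raid1_usable}GB usable")
print(f"RAID 1 capacity penalty vs RAID 6: {1 - raid1_usable / raid6_usable:.0%}")  # 25%
```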

RAID 5 and 6 use much more complicated software to create the recovery data in the first place, and then after a disk fails they need to read each of the remaining disks along with the recovery data to re-create the lost data. For large disks in large RAID groups this can take many hours, if not days. And while the recovery is underway your storage performance is hosed.
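
To make “read each of the remaining disks” concrete, here is a toy sketch of single-parity reconstruction, RAID 5 style. Real controllers work on full stripes with rotating parity, checksums, and far more bookkeeping; this only shows why a rebuild has to touch every surviving drive.

```python
# Toy single-parity (RAID 5 style) reconstruction. The parity block is the
# XOR of the data blocks, so rebuilding a lost block means reading every
# surviving block in the stripe, which is why rebuilds hammer the whole
# group for hours.

def xor_blocks(blocks):
    """XOR equal-length blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

stripe = [b"AAAA", b"BBBB", b"CCCC"]        # data blocks on three drives
parity = xor_blocks(stripe)                 # parity block on a fourth drive

lost = stripe[1]                            # pretend drive 2 just died
survivors = [stripe[0], stripe[2], parity]  # everything else must be read
rebuilt = xor_blocks(survivors)             # XOR of survivors == lost block
assert rebuilt == lost
print("rebuilt:", rebuilt)                  # b'BBBB'
```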

My point is, why even go there? Why not just maintain two complete copies of your data, so when a failure occurs, as it inevitably will (and at the worst possible time, of course) your data is just copied from one disk to another at disk-to-disk speed?

Small and medium businesses face enough uncertainty as it is. Spending a few extra bucks for RAID 1 or 1+0 will make your local digital data storage as bulletproof as it can be. Isn’t that what you really want?