Companies and practitioners spend billions of dollars a year on RAID to protect against disk drive failure. Yet all the research I’ve seen shows that the most common causes of data loss are, and always have been, human: accidental file deletion and operator error. Why don’t we spend billions on those problems instead? We aren’t rational about risk.
Bruce Schneier is the founder and CTO of BT Counterpane security. He is a witty, smart writer on security and security technology, and I highly recommend him. While reading his recent post “Perceived Risk vs. Actual Risk” it struck me that much of what I find goofy about the storage industry might be explained by Schneier.
Now, I could have done the obvious and called him up and asked him to actually explain it, but what fun is that? Instead, I’m going to apply some of his ideas to storage practice and marketing. Just for the record, many of these are actually the ideas of Daniel Gilbert, a psych prof at Harvard (but I’m not holding that against him), whose book Stumbling on Happiness talks about why we are bad at predicting the future. A short intro to his work is the charming article “If only gay sex caused global warming.”
Schneier quotes himself from his book Beyond Fear on some of the common misperceptions:
People exaggerate spectacular but rare risks and downplay common risks. They worry more about earthquakes than they do about slipping on the bathroom floor, even though the latter kills far more people than the former. Similarly, terrorism causes far more anxiety than common street crime, even though the latter claims many more lives. Many people believe that their children are at risk of being given poisoned candy by strangers at Halloween, even though there has been no documented case of this ever happening.
File deletion is equivalent to slipping on the bathroom floor. Why not, for example, put deleted files into the trash for 10 days so you’ll have time to reconsider?
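To show how little machinery a 10-day trash needs, here is a minimal Python sketch. Everything in it is my own illustration rather than any real product's behavior: the function names `soft_delete`, `restore`, and `purge` are hypothetical, as is the idea of encoding the deletion time in the trashed file's name. It handles plain files only.

```python
import shutil
import time
import uuid
from pathlib import Path

# Ten days, per the suggestion above; the constant name is an assumption.
RETENTION_SECONDS = 10 * 24 * 60 * 60


def soft_delete(path, trash_dir, now=None):
    """Move a file into trash_dir instead of unlinking it; return its new path."""
    now = time.time() if now is None else now
    trash_dir = Path(trash_dir)
    trash_dir.mkdir(parents=True, exist_ok=True)
    src = Path(path)
    # Name layout: <deletion time>__<random id>__<original name>.
    # The random id keeps two same-named files from overwriting each other.
    dest = trash_dir / f"{int(now)}__{uuid.uuid4().hex}__{src.name}"
    shutil.move(str(src), str(dest))
    return dest


def restore(trashed, target_dir):
    """Move a trashed file back into target_dir under its original name."""
    trashed = Path(trashed)
    original_name = trashed.name.split("__", 2)[2]
    dest = Path(target_dir) / original_name
    shutil.move(str(trashed), str(dest))
    return dest


def purge(trash_dir, now=None, retention=RETENTION_SECONDS):
    """Permanently delete trashed files older than the retention window."""
    now = time.time() if now is None else now
    removed = []
    for entry in Path(trash_dir).iterdir():
        stamp = entry.name.partition("__")[0]
        # Skip anything that does not follow our naming layout.
        if entry.is_file() and stamp.isdigit() and now - int(stamp) > retention:
            entry.unlink()
            removed.append(entry.name)
    return removed
```

Run `purge` from a daily scheduled job and an accidental deletion becomes a ten-day inconvenience instead of a data loss event. A real implementation would also need to remember each file's original directory and cope with a trash that fills the disk.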
People have trouble estimating risks for anything not exactly like their normal situation. Americans worry more about the risk of mugging in a foreign city, no matter how much safer it might be than where they live back home. . . .
It is difficult to pick out the most likely occurrence from several unlikely choices, or even rank them. Perhaps this explains why so many firms have problems after an incident. They prepared, but not for the incident that actually occurred.
People underestimate risks they willingly take and overestimate risks in situations they can’t control. When people voluntarily take a risk, they tend to underestimate it. When they have no choice but to take the risk, they tend to overestimate it. Terrorists are scary because they attack arbitrarily, and from nowhere. Commercial airplanes are perceived as riskier than automobiles, because the controls are in someone else’s hands — even though they’re much safer per passenger mile. . . .
Back up our precious data to good old tape, where the failure rates range as high as 40%? No problem. Outsource our data archive to Cleversafe or Amazon? A scary thought.
Last, people overestimate risks that are being talked about and remain an object of public scrutiny. News, by definition, is about anomalies. Endless numbers of automobile crashes hardly make news like one airplane crash does. . . . If a lunatic goes back to the office after being fired and kills his boss and two coworkers, it’s national news for days. If the same lunatic shoots his ex-wife and two kids instead, it’s local news…maybe not even the lead story.
Gosh, so what is being talked about these days? Hmm-m. Disk error rates: you need RAID 6! Power density: you need to buy low-power chips! Pick your favorite. It isn’t that these aren’t issues, but we all got along last year without knowing or worrying about them, and yet somehow now we are worried. Why?
Comments welcome as usual. Go ahead, take a chance!