
IDC-scale Storage Management
We’ve seen how IDC storage practices follow the dynamics outlined by Gray and Shenoy. Yet one of the most interesting questions for practitioners has to be storage management. Yes, huge disk farms can be built, but how are they managed? Rules of Thumb in Data Engineering makes many interesting statements with implications for management, yet gives few prescriptions. I’ll explore some of those statements and then offer my own conclusions. You may have different ones, which I’d like to hear.

RAM, Disk and Tape
The key rules of thumb affecting storage management are:

  • Storage capacities increase 100x per decade
  • Storage device throughput increases 10x per decade
  • Disk data cools 10x per decade
  • In ten years RAM will cost what disk costs today
  • NearlineTape:OnlineDisk:RAM storage cost ratios are approximately 1:3:300
  • A person can administer a million dollars of disk storage

The first four are exponential trends, and since humans are notoriously bad at estimating exponential effects, let’s explore them further.

Storage Capacities Increase 100x Per Decade
Today, in 2006, 750 GB is the highest-capacity disk drive available. This rule means that in 2016 the largest disk drive will be 75 TB. The largest laptop drive today is 160 GB; in ten years the largest laptop drive will be 16 TB.
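
If you want to check the compounding yourself, here’s a quick sketch. The 58%-per-year rate is simply the annual growth that compounds to 100x over a decade, and the starting capacities are the 2006 figures above.

```python
# Projecting capacity under the "100x per decade" rule of thumb.
# 100x over ten years compounds to roughly 58% growth per year.
annual_growth = 100 ** (1 / 10)   # ~1.585

def project(capacity_gb, years):
    """Capacity after compounding at the 100x-per-decade rate."""
    return capacity_gb * annual_growth ** years

for name, today_gb in [("3.5-inch drive", 750), ("laptop drive", 160)]:
    print(f"{name}: {today_gb} GB today -> {project(today_gb, 10) / 1000:.0f} TB in ten years")
```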

One implication of this is that capacity is cheap and getting cheaper faster than any other storage technology except RAM. Therefore, a storage solution engineered for the coming decade will spend capacity rather than accesses, just as several of the IDCs do today.

Storage device throughput increases 10x per decade
Disk data cools 10x per decade
These rules are two sides of the same arithmetic: if capacity increases 100x while throughput increases only 10x, then accesses per byte must drop 10x. Over time, as Gray and Shenoy note, disk becomes the new tape. The cost of an access keeps climbing while the cost of capacity keeps dropping, and eventually the storage is only good as an archive unless the usage model is adjusted to reflect the change.
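
One way to feel the “disk becomes tape” effect is to compute how long a full scan of a drive takes as capacity outruns throughput. Here is a minimal sketch, assuming a 2006 drive of 750 GB with roughly 75 MB/s of sequential throughput; those figures are my ballpark, not numbers from the paper.

```python
# Time to read an entire drive today vs. ten years out, if capacity
# grows 100x per decade while throughput grows only 10x.
capacity_gb, throughput_mb_s = 750, 75   # assumed 2006 figures

def full_scan_hours(capacity_gb, throughput_mb_s):
    return capacity_gb * 1000 / throughput_mb_s / 3600

print(f"2006: {full_scan_hours(capacity_gb, throughput_mb_s):.1f} hours")              # ~2.8
print(f"2016: {full_scan_hours(capacity_gb * 100, throughput_mb_s * 10):.1f} hours")    # ~27.8
```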

In ten years RAM will cost what disk costs today
What saves disk from tape’s fate is cheap RAM. Everything on disk today will fit in RAM ten years from now. Disk will be reserved for the stuff we hardly ever look at.

Ten years from now the average laptop will have a 6-8 TB disk drive and 60-80 GB of RAM. Loading the OS at 2-3 GB/sec will take about 10 seconds. Loading all your favorite tunes, apps, documents and movies will take another 30 seconds. You’ll be happy, even though disk accesses are more costly than ever.
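
Those load times are back-of-the-envelope; here is the arithmetic spelled out. The working-set sizes, a roughly 25 GB OS image and 75 GB of personal data, are my own guesses for illustration.

```python
# Back-of-the-envelope load times for a 2016 laptop that pulls its
# working set into RAM at disk speed. Sizes are assumptions.
disk_throughput_gb_s = 2.5   # midpoint of the 2-3 GB/sec figure above
os_image_gb = 25             # assumed OS + core apps
personal_data_gb = 75        # assumed tunes, apps, documents, movies

print(f"OS load:       {os_image_gb / disk_throughput_gb_s:.0f} seconds")       # ~10
print(f"Personal data: {personal_data_gb / disk_throughput_gb_s:.0f} seconds")  # ~30
```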

NearlineTape : OnlineDisk : RAM storage cost ratios are approximately 1:3:300
In the paper, Gray and Shenoy note:

Historically, tape, disk, and RAM have maintained price ratios of about 1:10:1000. That is, disk storage has been 10x more expensive than tape, and RAM has been 100x more expensive than disk. . . . But, when the offline tapes are put in a nearline tape robot, the price per tape rises to 10K$/TB while packaged disks are 30K$/TB. This brings the ratios back to 1:3:240. It is fair to say that the storage cost ratios are now about 1:3:300.

Disk and nearline tape costs will inevitably reach parity despite the hard work of tape engineers. Why? Because the cost of mechanical robots is declining only 2% (if that) annually while disk costs are declining at 40-50% annually. Even if tape were on the same price/capacity slope as disk – which it isn’t – the mechanicals’ increasing share of cost puts nearline at a permanent disadvantage. It isn’t a question of if the cost lines will cross, only when. Fortunately for StorageTek, Quantum, et al., storage consumers are a conservative bunch, and tape buyers are the most conservative of all. It will take 10-15 years for word of tape’s obsolescence to spread.
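
To see why the crossover is a matter of when rather than if, here is a toy model that simply compounds the two decline rates. The starting prices are the 10K$/TB and 30K$/TB figures from the quote; the 45% disk decline is the midpoint of the 40-50% range above.

```python
# Toy model: compound the decline rates and watch the disk/tape
# cost ratio collapse. Under these assumptions the lines cross
# within a few years.
tape_per_tb, disk_per_tb = 10_000.0, 30_000.0   # nearline tape, packaged disk
tape_decline, disk_decline = 0.02, 0.45         # assumed annual price declines

for year in range(6):
    print(f"year {year}: tape ${tape_per_tb:,.0f}/TB, disk ${disk_per_tb:,.0f}/TB, "
          f"ratio {disk_per_tb / tape_per_tb:.2f}")
    tape_per_tb *= 1 - tape_decline
    disk_per_tb *= 1 - disk_decline
```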

A person can administer a million dollars of disk storage
Of all of Gray and Shenoy’s rules, this is the one I find most interesting. In discussing it they note:

The storage management tools are struggling to keep up with the relentless growth of storage. If you are designing for the next decade, you need to build systems that allow one person to manage a 10 PB store.

Now, that number is from six years ago. I suspect that if they revisited that recommendation today, they would start from today’s big systems. 100x in 10 years is the same as 10x in each of two successive five-year spans. The biggest arrays today are about one PB and can presumably be managed by one person (feel free to correct me on that point). So in five years, 10 PB seems reasonable. Yet in 10 years, if the 100x rule holds, one person will manage 100 PB. So my question to readers is: are any of today’s storage management paradigms scalable to 100 PB in 10 years? To 10 PB in five years?
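
For a sense of where those numbers come from, here is the same 100x arithmetic applied to the “$1M of disk per administrator” rule; the $1,000/TB price for 2006 enterprise disk is my own rough assumption.

```python
# If "one admin per $1M of disk" holds while $/TB falls 100x per
# decade, how many petabytes does one admin end up managing?
budget = 1_000_000                 # dollars of disk per administrator
price_per_tb = 1_000.0             # assumed 2006 price, $/TB
annual_decline = 100 ** (1 / 10)   # prices fall ~1.585x per year

for years in (0, 5, 10):
    price = price_per_tb / annual_decline ** years
    print(f"+{years:2d} years: ~{budget / price / 1000:,.0f} PB per administrator")
```

Under those assumptions the burden grows from roughly 1 PB per admin today to 100 PB in a decade, which is exactly why the question matters.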

Conclusion
Storage’s secular trend is to move intelligence from people to the CPU and then to the storage itself. With early drum storage, programmers hand-optimized the data layout on the drum to reduce rotational latency. With the RAMAC, the CPU controlled the disk drive’s arm movement directly. Over time many functions, such as mirroring, moved from person to CPU to storage. Storage has already achieved a degree of automation and virtualization far beyond what practitioners imagined even 20 years ago.

Looking at the rules of thumb in Gray and Shenoy’s paper and comparing them to what we know about IDCs – mostly Google, since they’ve been the most open – suggests that the rules are essentially correct. More importantly, not following the rules as one scales up is costly in both equipment and labor.

Enterprise Data Centers (EDCs) don’t have the growth rate of a Google or Amazon, so they don’t face the stark choices that come with triple digit growth. Yet it is clear that today’s EDC architectures will not scale indefinitely, either technically or economically. Smart EDC architects and CIOs will start thinking sooner, rather than later, about how to start adapting IDC architectures to EDC problems.