Introduction: Internet Scale vs Enterprise Scale
The rise of the Internet Data Center (IDC), exemplified by Amazon, Google and Yahoo, over the last 10 years is perhaps the fastest and most radical transformation in the history of information technology. From humble beginnings using standard enterprise-style architectures and products, the IDC has rapidly evolved into a form whose implications are only slowly being understood by the rest of the IT sector, vendors and customers alike. Soon IDCs will dwarf the Enterprise Data Center (EDC) not by 10x, but by 100x or even 1,000x in any metric that matters.
Nor is the IDC a muscle-bound, one-trick giant. Though huge, the IDC is lithe and flexible. While a “state of the art” EDC struggles to architect, implement, provision and manage new applications in three-year projects for a few thousand users, IDCs routinely roll out to millions and even tens of millions of users in months. An IDC is, first and foremost, an application delivery engine.
Of course, as the old IT joke goes, God created the universe in seven days, but He didn’t have an installed base. And neither do the IDCs, today. Yet even there the IDCs have important lessons to offer as we create the 21st century infosphere. These lessons just aren’t always obvious. If they were, everyone would have figured them out by now.
Not The Same As The Old Boss
IDCs differ from EDCs in many ways. The question is why. The answer can tell us much about how we can expect to see IDCs evolve as well as how to re-think the EDC. Is it the economics? The applications? The technology? The scale? Or is something else behind it?
Some of the hardware elements that differentiate IDCs from EDCs are:
- No big iron RAID arrays
- No Fibre Channel
- Direct-attached disk (with the exception of Yahoo Mail)
- Many small, low-cost servers
This architecture stands in startling contrast to the standard big-iron array-and-server architectures heavy on Fibre Channel, RAID, mainframes and servers in the $15k-and-up range. In this series of articles I’m going to look at a few of these differences to see what is behind them, using one of my favorite papers as a guide.
Finally, A Crystal Clear Crystal Ball
In an excellent six-year-old paper, Rules of Thumb in Data Engineering, Jim Gray and Prashant Shenoy offered a set of useful generalizations about data storage design. How useful? Let’s see what they can tell us about IDC architecture.
I/O Cost and System Architecture
Bits want to be free, but I/Os aren’t and never will be. An IDC may be thought of as a massive I/O processor: data streaming in from bots; requests from customers; search results, ads, email, maps, etc. flowing out. Sure, computation is needed to find and sort the content. Yet the raw material is the many petabytes each IDC stores and accesses.
So the cost of an I/O, in CPU cycles and overhead, is important. Gray and Shenoy derive some rules of thumb for I/O costs:
- A disk I/O costs 5,000 instructions and 0.1 instructions per byte
- The CPU cost of a System Area Network (SAN) network message is 3,000 clocks and 1 clock per byte
- A network message costs 10,000 instructions and 10 instructions per byte
[For some reason, the authors switched metrics from instructions to clocks. I’m assuming, conservatively, that 1 instruction = 1 clock. The authors note elsewhere that the clocks per instruction (CPI) actually ranges from 1-3.]
So for an 8KB I/O, a standard I/O size for Unix systems, the costs are:
- Disk: 5,000 + 800 = 5,800 instructions
- SAN: 3,000 + 8,000 = 11,000 clocks
- Network: 10,000 + 80,000 = 90,000 instructions
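The rules above amount to a tiny linear cost model. Here is a sketch of it (the fixed and per-byte costs come from the paper; following the article, I treat 1 instruction = 1 clock and use 8,000 bytes for “8KB”):

```python
# Gray & Shenoy's I/O cost rules of thumb, as a simple cost model.
# Costs are in instructions, assuming 1 clock = 1 instruction.
RULES = {
    "disk":    {"fixed": 5_000,  "per_byte": 0.1},
    "san":     {"fixed": 3_000,  "per_byte": 1},
    "network": {"fixed": 10_000, "per_byte": 10},
}

def io_cost(kind: str, size_bytes: int) -> float:
    """CPU cost (in instructions) of one I/O of the given size."""
    r = RULES[kind]
    return r["fixed"] + r["per_byte"] * size_bytes

# An "8KB" I/O, using 8,000 bytes as the article's arithmetic does:
for kind in ("disk", "san", "network"):
    print(f"{kind:8s} {io_cost(kind, 8_000):>8,.0f}")
# disk        5,800
# san        11,000
# network    90,000
```

Multiply it out and the scaling story is clear: at 10,000 such I/Os per second, local disk burns 58 million instructions per second of CPU while network I/O burns 900 million, a large fraction of a mid-2000s processor.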
Thus it is obvious why IDCs generally prefer local disks to SANs or networks. Not only is it cheaper economically, it is much cheaper in CPU resources. Looked at another way, this simply confirms what many practitioners already have ample experience with: the EDC architecture doesn’t scale easily or well.
Comments – moderated to eliminate spam – always welcome. BTW, the numbers in the paper are six years old. More current estimates much appreciated.
Next: Storage Cost and Implementation
The reason we switched from instructions to clocks was explained somewhere in there — Amdahl did it in terms of instructions, but with super-scalar pipelined machines the only thing you can easily measure is clocks (and CPI varies from 1-4 in modern processors for various workloads and clock speeds).
The complication is that FC and now iSCSI have TOEs (TCP offload engines) that move the instructions to the adapter (much as SCSI controllers do), so the real downsides of SAN are the cost of specialized hardware (low-volume hardware compared to direct-attached disk) and the management complexity of SAN.
Ah, the clock vs instruction set issue so simple when you explain it. Thanks.
Specialized hardware is, in my view, like aspirin: good for temporary relief. If it isn’t going to go into volume production, it will rarely get either the price/performance or the management support that makes it usable for the “many-small” environment I’ll be writing about next – despite what I said in Pt II.