I wrote this as a comment originally, and since I am sitting here at a bar that claims the World’s Largest Selection of Draft Beer I realized it would make a good post, where “good” is a measure of speed. I am in a race against time: my blood alcohol level; ability to write; battery life; and my desire to sample as many fine Belgian beers as possible. So pardon my recycling.

There Are Worse Things Than Statistics
To paraphrase, there are lies, damn lies and storage performance numbers – I hesitate to call them statistics for fear of giving statistics an even worse reputation than they’ve already got.

This came up when some devoted readers questioned the benchmarks used for the ZFS vs Hardware RAID numbers. I didn’t dig into the benchmarks (those Unix system results may mean something to you, but they give me a headache) Robert used for his test to see what the mix of I/O sizes are. The typical strategy for describing array performance is to run a test that will give the absolute best possible number for the attribute one is measuring.

Surprise: Vendors Use The Absolute Best Numbers They Can Somehow Justify
For bandwidth that means reading and writing really big files – which is fine if you are doing video production or 3D seismic analysis – and totally irrelevant for almost all common workloads. For IOPS numbers that means the smallest possible I/Os as fast as possible – which usually means everything is sitting in cache. While that is nice when it happens, that is also an unlikely event in the real world.

Welcome To The Unreal World
So other than storage marketing people being lying scum, what is the point of benchmarks that only reflect un-real-world performance? Consider all storage benchmarks as simply telling you what the absolute maximum you could ever see in that metric – the vendor’s guaranteed absolutely “will never exceed” number. If you have good reason to believe you’ll need more than that then be afraid – be very afraid.

So What’s Left?
In the real world, with a mix of I/O sizes and rates, you’d be shocked at what “performance” looks like. Running 2k I/Os on the biggest Sym or Tagma you can imagine – ’cause you certainly can’t afford it! – on dozens of servers across multi-dozen FC’s and I suspect you’d see, maybe, with luck, 100MB/sec of bandwidth. A 3ware controller would probably do single digits. No bad guys here, this is just the nature of the storage I/O problem.

The Industry’s Storage Performance Secret Decoder Ring
Like the all-too-obvious breast implants some women favor these days, storage performance numbers reflect what the industry thinks practitioners want, while practitioner’s admiring glances confirm how willing we are to be entertained by a polite fiction. Like a good action movie, we know it isn’t real, but it gets our heart pumping anyway. So what is really true about performance?

Roughly:

  • Latency is usually lower with a smaller array, since you don’t have millions of lines of code and multiple switch operations to traverse
  • IOPS scale for large arrays mostly as a function of parallelism – more I/O ports, more I/O processors, more cache, more interconnect bandwidth, more spindles – not because each individual I/O unit is blindingly fast
  • There are only a few vendors of most of these components, so the big arrays are built out of commodity parts. Architecture and firmware are the major differentiators. So, for example, cache access times are fairly constant unless using expensive static RAM. FC chips come from what, two vendors? Microcontrollers from four? Disks from three? What do you expect?

Of course the price-point engineered stuff will be slow. But I bet there is little difference in per-port performance between an enterprise modular array and the big iron Sym’s and Tagmas.

I’ve never seen a direct comparison of single FC port performance across big iron and modular arrays, which also suggests that it isn’t all that different. If you have data that suggests otherwise I encourage you to post it. I’d love to be proven wrong.

OK, The A. V. Brother David’s Triple Has Kicked In, So What’s The Point?
To me, the point of the ZFS benchmarks is not the absolute numbers, which are respectable for either case, but that the money spent for the RAID controllers bought nothing. I’d argue that even if the software were 20% slower, you’d still want to lose the hardware RAID and its associated bugs, power consumption, cost and maintenance.

Dateline: The Yard House, Long Beach, California. And yes, I love Belgian beers and ales. As well as their chocolate. Don’t get me started on how badly Belgians market their great little country. Did I tell you about the bar in Bruges that has over 400 Belgian beers and ales on tap? Maybe another time. . . .