I wrote this as a comment originally, and since I am sitting here at a bar that claims the World’s Largest Selection of Draft Beer I realized it would make a good post, where “good” is a measure of speed. I am in a race against time: my blood alcohol level; ability to write; battery life; and my desire to sample as many fine Belgian beers as possible. So pardon my recycling.

There Are Worse Things Than Statistics
To paraphrase, there are lies, damn lies and storage performance numbers – I hesitate to call them statistics for fear of giving statistics an even worse reputation than they’ve already got.

This came up when some devoted readers questioned the benchmarks used for the ZFS vs Hardware RAID numbers. I didn’t dig into the benchmarks Robert used for his test to see what the mix of I/O sizes is (those Unix system results may mean something to you, but they give me a headache). The typical strategy for describing array performance is to run a test that will give the absolute best possible number for the attribute one is measuring.

Surprise: Vendors Use The Absolute Best Numbers They Can Somehow Justify
For bandwidth that means reading and writing really big files – which is fine if you are doing video production or 3D seismic analysis – and totally irrelevant for almost all common workloads. For IOPS numbers that means the smallest possible I/Os as fast as possible – which usually means everything is sitting in cache. While that is nice when it happens, that is also an unlikely event in the real world.
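To put rough numbers on the cache point, here is a minimal sketch in Python. The latencies (0.1 ms for a cache hit, 8 ms for a trip to disk) are illustrative assumptions, not measurements from any array, and the function names are made up for this example. It computes the average service time for a given cache hit ratio and the IOPS a single stream would see if it issued one I/O at a time.

```python
# Assumed, illustrative latencies; real arrays will differ.
CACHE_HIT_MS = 0.1   # hypothetical time to serve an I/O from cache
DISK_MS = 8.0        # hypothetical time to serve an I/O from disk


def average_service_time_ms(hit_ratio: float,
                            cache_ms: float = CACHE_HIT_MS,
                            disk_ms: float = DISK_MS) -> float:
    """Weighted average of cache and disk service times."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


def serial_iops(service_time_ms: float) -> float:
    """IOPS for one stream issuing a single I/O at a time."""
    return 1000.0 / service_time_ms


if __name__ == "__main__":
    for hit_ratio in (1.0, 0.9, 0.5, 0.0):
        avg_ms = average_service_time_ms(hit_ratio)
        print(f"hit ratio {hit_ratio:.0%}: "
              f"{avg_ms:.2f} ms per I/O, {serial_iops(avg_ms):,.0f} IOPS")
```

With these assumed figures the everything-in-cache case is 80 times faster than the everything-on-disk case, which is roughly the gap between the brochure and a workload that misses cache.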

Welcome To The Unreal World
So other than storage marketing people being lying scum, what is the point of benchmarks that only reflect un-real-world performance? Treat every storage benchmark as simply telling you the absolute maximum you could ever see in that metric – the vendor’s guaranteed absolutely “will never exceed” number. If you have good reason to believe you’ll need more than that, then be afraid – be very afraid.

So What’s Left?
In the real world, with a mix of I/O sizes and rates, you’d be shocked at what “performance” looks like. Run 2k I/Os on the biggest Sym or Tagma you can imagine – ’cause you certainly can’t afford it! – on dozens of servers across multi-dozen FC’s and I suspect you’d see, maybe, with luck, 100MB/sec of bandwidth. A 3ware controller would probably do single digits. No bad guys here, this is just the nature of the storage I/O problem.
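The arithmetic behind that guess is simple: bandwidth is just I/Os per second times I/O size. Here is a back-of-the-envelope sketch, assuming binary units (2k = 2,048 bytes, 1MB = 1,048,576 bytes). The function names are hypothetical.

```python
KIB = 1024          # bytes in 1k, assuming binary units
MIB = 1024 * KIB    # bytes in 1MB, assuming binary units


def bandwidth_bytes_per_sec(iops: float, io_size_bytes: int) -> float:
    """Bandwidth delivered by a given I/O rate at a fixed I/O size."""
    return iops * io_size_bytes


def iops_for_bandwidth(bandwidth_bytes: float, io_size_bytes: int) -> float:
    """I/O rate needed to reach a given bandwidth at a fixed I/O size."""
    return bandwidth_bytes / io_size_bytes


if __name__ == "__main__":
    target = 100 * MIB  # the 100MB/sec figure from the text
    for label, size in (("2k", 2 * KIB), ("64k", 64 * KIB), ("1MB", MIB)):
        needed = iops_for_bandwidth(target, size)
        print(f"{label} I/Os: {needed:,.0f} IOPS to reach 100MB/sec")
```

Under those assumptions, 100MB/sec of 2k I/Os works out to 51,200 I/Os per second, while the same bandwidth takes only 100 I/Os per second at 1MB each. That is why big sequential transfers make such flattering bandwidth numbers.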

The Industry’s Storage Performance Secret Decoder Ring
Like the all-too-obvious breast implants some women favor these days, storage performance numbers reflect what the industry thinks practitioners want, while practitioners’ admiring glances confirm how willing we are to be entertained by a polite fiction. Like a good action movie, we know it isn’t real, but it gets our hearts pumping anyway. So what is really true about performance?

Roughly:

Latency is usually lower with a smaller array, since you don’t have millions of lines of code and multiple switch operations to traverse
IOPS scale for large arrays mostly as a function of parallelism – more I/O ports, more I/O processors, more cache, more interconnect bandwidth, more spindles – not because each individual I/O unit is blindingly fast
There are only a few vendors of most of these components, so the big arrays are built out of commodity parts. Architecture and firmware are the major differentiators. So, for example, cache access times are fairly constant unless using expensive static RAM. FC chips come from what, two vendors? Microcontrollers from four? Disks from three? What do you expect?
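The parallelism point can be sketched with Little’s Law, which says throughput equals the number of I/Os in flight divided by the time each one takes. The 5 ms latency and the queue depths below are assumptions picked for illustration, and the function name is hypothetical.

```python
def iops_from_littles_law(outstanding_ios: int, latency_ms: float) -> float:
    """Little's Law: throughput = concurrency / latency."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return outstanding_ios * 1000.0 / latency_ms


if __name__ == "__main__":
    latency_ms = 5.0  # assumed per-I/O latency, held constant
    for in_flight in (1, 8, 64, 512):
        iops = iops_from_littles_law(in_flight, latency_ms)
        print(f"{in_flight:>3} I/Os in flight at {latency_ms} ms each: "
              f"{iops:,.0f} IOPS")
```

The per-I/O latency never changes in this model; only the number of I/Os in flight does. More ports, processors and spindles let a big array keep more I/Os in flight, so the total goes up even though no single I/O gets any faster.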

Of course the price-point engineered stuff will be slow. But I bet there is little difference in per-port performance between an enterprise modular array and the big iron Syms and Tagmas.

I’ve never seen a direct comparison of single FC port performance across big iron and modular arrays, which also suggests that it isn’t all that different. If you have data that suggests otherwise I encourage you to post it. I’d love to be proven wrong.

OK, The A. V. Brother David’s Triple Has Kicked In, So What’s The Point?
To me, the point of the ZFS benchmarks is not the absolute numbers, which are respectable for either case, but that the money spent for the RAID controllers bought nothing. I’d argue that even if the software were 20% slower, you’d still want to lose the hardware RAID and its associated bugs, power consumption, cost and maintenance.

Dateline: The Yard House, Long Beach, California. And yes, I love Belgian beers and ales. As well as their chocolate. Don’t get me started on how badly Belgians market their great little country. Did I tell you about the bar in Bruges that has over 400 Belgian beers and ales on tap? Maybe another time. . . .

4 Comments

Josh Maher on Friday, 18 August, 2006 at 4:27 pm

Very well put……

Keep in mind that the differentiator is also in what components are accepted from the manufacturer, what custom components they’re mixed with, and what code is then placed on top of them.

The raw FC comparison most likely would turn up the same, as would a raw disk comparison, but if one vendor accepts lower quality batches to reduce cost, or otherwise doesn’t take full advantage of the potential raw performance, that platform will ultimately be slower or prone to more failures.

John on Friday, 18 August, 2006 at 7:55 pm

Hi, Robin, I am glad you had fun with beer. Just drive carefully and don’t say anything if caught in Malibu …

Anyway, back to RAID and ZFS and future GFS: from a business point of view, why are RAID type systems needed? For availability mainly, I guess. But then why, until now, is RAID-5 almost the only one in use? I know there are a couple of RAID-6 vendors, but not many are buying RAID-6. What is the reason? Is it because RAID-5 already provides enough availability so there is not much need for RAID-6 (or n>6), or for other reasons? Because of performance issues? Or return on investment? I know quite a lot about RAID technology, but just don’t know if there is (or will be) a big market for RAID-n(>5). ZFS/GFS type of things are going beyond RAID-5 mostly by mirroring. So there is a gap here. What do you say?

Josh Violette on Monday, 21 August, 2006 at 11:33 am

There’s a Yard House on Tatum off 101 and another being built near the Cardinals stadium. Always worth a visit. Try Tetley’s & Boddington’s for fine English ale.

Robin Harris on Tuesday, 22 August, 2006 at 2:24 pm

Josh,

Thanks for the tip about the new Yard Houses in Phoenix. A man can get mighty parched out here in the techno-blog desert. Yet if memory serves, the Long Beach Yard House claims some 250 taps, while their other stores are about 150. It will be interesting to see where the Phoenix stores come in.

I’ve drunk lots of Boddingtons and Newcastle Brown Ale and enjoy them very much. I’ve also spent many an enjoyable hour in Munich’s beer gardens sampling Bavaria’s finest. Yet to me Belgium has the richest beer-making tradition in the world and the products to back it up.