A couple of weeks ago StorageMojo learned that although a VMAX 20k can support up to 2,400 3TB drives, it can only address ≈2PB. Where did the remaining 5 petabytes go?

Some theories were advanced in the comments, and I spoke to other people about the mystery. No one would speak on the record, but here’s the gist of the received wisdom.

Different strokes for different folks
Short stroking is the short and best answer. Short stroking uses only the outermost tracks – the fastest, densest, and most capacious tracks – to punch up drive performance.

By reducing head seek time and maximizing data transfer rates, a short stroked drive gets more IOPS and faster transfers. Wonderful!

But at what cost? The 20k’s numbers serve as a first approximation.

Assuming a max’d out VMAX 20k, but using 80 SSDs, that leaves 2,320 3TB 3.5″ drives, for a raw capacity of 6,960TB. Assuming 8-drive RAID 6 LUNs (6 data plus 2 parity), we get a dual-parity protected capacity of 5,220TB. Taking EMC’s spec of an open system RAID 6 capacity of 2,067TB and dividing that by 5,220TB gives us 39.6% capacity efficiency, which would use roughly the outer 0.8″ of a 3.5″ platter. That would certainly improve IOPS and transfer rate.
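
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the capacity math. The numbers are the ones quoted above; the constant and function names are mine, and the 6+2 RAID 6 layout is this post's assumption, not an EMC statement.

```python
# Back-of-the-envelope capacity math using the figures quoted in the post.
TOTAL_DRIVE_SLOTS = 2400      # max drives in a VMAX 20k
SSD_COUNT = 80                # slots assumed to hold SSDs
DRIVE_TB = 3                  # capacity of each hard drive, in TB
RAID_GROUP_SIZE = 8           # assumption: 8-drive RAID 6 groups
PARITY_DRIVES = 2             # RAID 6 spends two drives per group on parity
EMC_RAID6_USABLE_TB = 2067    # EMC's open system RAID 6 capacity spec


def capacity_breakdown():
    """Return (hdd_count, raw_tb, protected_tb, efficiency)."""
    hdd_count = TOTAL_DRIVE_SLOTS - SSD_COUNT
    raw_tb = hdd_count * DRIVE_TB
    data_drives = RAID_GROUP_SIZE - PARITY_DRIVES
    # 6,960TB * 6 divides evenly by 8, so integer division loses nothing here.
    protected_tb = raw_tb * data_drives // RAID_GROUP_SIZE
    efficiency = EMC_RAID6_USABLE_TB / protected_tb
    return hdd_count, raw_tb, protected_tb, efficiency


hdd_count, raw_tb, protected_tb, efficiency = capacity_breakdown()
print(f"{hdd_count:,} HDDs, {raw_tb:,}TB raw, {protected_tb:,}TB after RAID 6")
print(f"Capacity efficiency: {efficiency:.1%}")
```

Change the RAID group size or the SSD count and the efficiency moves a little, but not enough to close a gap of this size.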

Research indicates that the 2,320 disks are roughly half of the total BOM cost. Thus, if you pay $1.4m (not including software) for a fully loaded 20k, $700k goes for the raw 7PB. Since you only get 2PB usable, you are paying ≈$350k – depending on your discount, of course – per short stroked PB of capacity.
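
The cost math can be sketched the same way. The $1.4m price and the 50% disk share are the rough figures given above, not a quote, and the names in this snippet are mine. Usable capacity is rounded to 2PB, as in the text.

```python
# Rough cost per petabyte, using the post's estimates (not EMC pricing).
SYSTEM_PRICE_USD = 1_400_000   # fully loaded 20k, excluding software
DISK_SHARE_OF_COST = 0.5       # disks assumed to be about half the BOM cost
RAW_PB = 6.96                  # 2,320 drives x 3TB
USABLE_PB = 2.0                # EMC's 2,067TB spec, rounded as in the post


def cost_per_pb(disk_spend_usd, capacity_pb):
    """Dollars spent on disks per petabyte of the given capacity."""
    return disk_spend_usd / capacity_pb


disk_spend = SYSTEM_PRICE_USD * DISK_SHARE_OF_COST
per_usable_pb = cost_per_pb(disk_spend, USABLE_PB)
per_raw_pb = cost_per_pb(disk_spend, RAW_PB)

print(f"Disk spend: ${disk_spend:,.0f}")
print(f"Per usable PB: ${per_usable_pb:,.0f}")
print(f"Per raw PB: ${per_raw_pb:,.0f}")
```

At these inputs a usable petabyte costs about 3.5 times what a raw petabyte does. Plug in your own discount to see where you land.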

The StorageMojo take
We already knew legacy arrays were expensive. What’s really interesting is that even with short stroking, 15k disks would be hard-pressed to do more than 600 IOPS each, or, generously, about 1.4m IOPS across all 2,320 drives. EMC promises “millions of IOPS” from the 20k, so even with short stroking, it’s likely that much of the system’s total performance comes from its caching and SSDs rather than the costly short stroked disks.
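
Here is a quick sketch of the IOPS ceiling. The 600 IOPS per drive and the 2,320 drive count come from the text above. EMC's "millions" is not a number, so the 2m target below is purely my assumption, chosen as the lowest reading of that claim.

```python
# Upper bound on disk-delivered IOPS, using the post's generous estimate.
HDD_COUNT = 2320
MAX_IOPS_PER_DRIVE = 600       # generous ceiling for a short stroked drive
ASSUMED_TARGET_IOPS = 2_000_000  # assumption: lowest reading of "millions"


def disk_iops_ceiling(drive_count, iops_per_drive):
    """Best-case aggregate IOPS if every drive hits its ceiling at once."""
    return drive_count * iops_per_drive


def shortfall(target_iops, disk_iops):
    """IOPS that must come from cache and SSDs to reach the target."""
    return max(0, target_iops - disk_iops)


disk_iops = disk_iops_ceiling(HDD_COUNT, MAX_IOPS_PER_DRIVE)
gap = shortfall(ASSUMED_TARGET_IOPS, disk_iops)

print(f"Disk ceiling: {disk_iops:,} IOPS")
print(f"Needed from cache and SSDs at a 2m target: {gap:,} IOPS")
```

Even in this best case, where every spindle runs flat out, the disks top out below 1.4m IOPS and something else has to supply the rest.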

Before you buy your next VMAX or other legacy architecture disk array, take a hard look at the cost of short stroked disks. You can do much better, with less complexity, with an all-flash solution at an equal or lower cost. Not to mention the lower OpEx from reduced floor space, power, cooling and maintenance.

Courteous comments welcome, of course. EMC’ers and others are welcome to offer their perspectives on this analysis. Update: Note that the missing petabytes come with using 7200RPM drives, not 15k drives. End update.