It didn’t win Best Paper honors at FAST 08 (IIRC that went to An Analysis of Latent Sector Errors in Disk Drives; the link is to the StorageMojo review of that excellent paper last month), but I really like the thinking behind Pergamum: Replacing Tape with Energy Efficient, Reliable, Disk-Based Archival Storage.
Written by Mark W. Storer, Kevin M. Greenan, Ethan L. Miller (UC Santa Cruz) and Kaladhar Voruganti (NetApp), the paper discusses a prototype that
. . . is a distributed network of intelligent, disk-based, storage appliances that stores data reliably and energy-efficiently. While existing MAID systems keep disks idle to save energy, Pergamum adds NVRAM at each node to store data signatures, metadata, and other small items, allowing deferred writes, metadata requests and inter-disk data verification to be performed while the disk is powered off.
They call the appliances tomes.
Tape: where data goes to die
One of tape’s big advantages is that it uses no power at rest. Any disk-based tape replacement will have to come as close as possible to that ideal.
Each tome uses a single hard drive and an ARM-based processor board with NIC and NVRAM. Total power use when powered up is about 11.5 watts, less than a 15k RPM FC drive. With tighter code, a slower drive and more integration, I’d bet they could cut that in half.
The single disk drive means that tomes must be used in groups to enable distributed RAID techniques and exchange of algebraic signatures to ensure inter-disk recovery. The paper goes into those techniques in detail.
The purpose of the NVRAM is to provide low-power, persistent storage; operations such as metadata searches and signature requests do not require the unit’s drive to be spun up.
. . . the NVRAM primarily holds metadata such as algebraic signatures and index information, flash writes are relatively rare; flash writes coincide with disk writes.
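To make the signature idea concrete, here is a minimal Python sketch of why small signatures kept in NVRAM are enough to cross-check tomes without spinning up a drive. It is not the paper's implementation: the field (GF(2^8) with polynomial 0x11D), the element ALPHA, simple XOR parity, and every function name are my assumptions for illustration. The property it relies on is that algebraic signatures are linear, so the signature of an XOR parity block equals the XOR of the data blocks' signatures.

```python
from functools import reduce

# Assumptions for illustration only (not taken from the paper):
GF_POLY = 0x11D  # reduction polynomial for GF(2^8): x^8 + x^4 + x^3 + x^2 + 1
ALPHA = 0x02     # a primitive element of that field


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8): carry-less multiply with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return result


def signature(block: bytes) -> int:
    """One-byte algebraic signature: XOR over i of block[i] * ALPHA^i."""
    sig, power = 0, 1
    for byte in block:
        sig ^= gf_mul(byte, power)
        power = gf_mul(power, ALPHA)
    return sig


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR parity over equal-length blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity_consistent(data_sigs: list[int], parity_sig: int) -> bool:
    """Check parity using signatures alone; no block data is read."""
    return reduce(lambda x, y: x ^ y, data_sigs, 0) == parity_sig


# Hypothetical group of three tomes plus one parity tome.
data = [b"tome-one", b"tome-two", b"tome-3!!"]
parity = xor_blocks(data)

# Each tome would keep only its own signature in NVRAM.
stored_sigs = [signature(block) for block in data]
stored_parity_sig = signature(parity)
print(parity_consistent(stored_sigs, stored_parity_sig))

# A single corrupted byte on one tome changes that tome's signature.
corrupted = bytes([data[1][0] ^ 0x01]) + data[1][1:]
bad_sigs = [stored_sigs[0], signature(corrupted), stored_sigs[2]]
print(parity_consistent(bad_sigs, stored_parity_sig))
```

A one-byte signature is far too short for real use, since unrelated corruptions would collide often; it just keeps the arithmetic easy to follow. The point is that the consistency check touches only a few bytes per tome, which is the kind of request that can be answered from NVRAM while the disks stay powered off.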
The Ethernet interconnect is important – by using cheap unmanaged switches for fan out, high aggregate bandwidth, exceeding that of current tape libraries, is easily and inexpensively achieved. The use of power-over-Ethernet would further reduce costs, especially if the system used 4200 RPM drives.
The StorageMojo take
Most of the disk vs tape discussions look at the disk device vs tape cartridge cost issue – and they aren’t that different even today. But the tape library market is a $4-5 billion market. A disk-based alternative to slow tape libraries could take a big chunk of that.
Further, this design could be integrated into a single disk controller board, creating a disk with a single Ethernet port and incredible packaging and manufacturing economies.
If Seagate were smart they’d jump on this. This is a major opportunity to drive another significant consumer of disk drive units – without encroaching on existing OEM customer businesses. That doesn’t happen very often.
Comments welcome, as always. Pergamum was an ancient Greek city known for its sizable library, second only to the library of Alexandria.