StorageMojo is focussed on new storage technologies, products, companies and markets. And where do new technologies come from? From people researching at the limits of the known.

That’s why StorageMojo attends the USENIX Conference on File and Storage Technologies (FAST) every year. Top academics – many with corporate ties – and grad students present the latest research.

Many papers are submitted and reviewed before some are chosen for presentation at the conference over two and a half days. Here are StorageMojo’s favorites from FAST 13, in no particular order:

SD Codes: Erasure Codes Designed for How Storage Systems Really Fail by James S. Plank, U of Tennessee, and Mario Blaum and James L. Hafner of IBM Research. RAID systems are vulnerable to disk failures and unrecoverable read errors (UREs), but RAID 6 is overkill for UREs. The paper investigates lighter-weight erasure codes – disk plus sector, instead of 2 disks – to reclaim capacity for user data.
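To get a feel for the capacity at stake, here is a back-of-the-envelope sketch in Python. It assumes a stripe laid out as n disks by r sector rows, with RAID 6 giving up two full disks to coding and a disk-plus-sector code giving up one disk plus one extra sector per stripe. The function names and the 8-disk, 16-row example are illustrative assumptions, not figures from the paper.

```python
from fractions import Fraction


def raid6_data_fraction(n_disks):
    """Fraction of raw capacity left for user data when 2 disks hold parity."""
    return Fraction(n_disks - 2, n_disks)


def disk_plus_sector_data_fraction(n_disks, rows_per_stripe):
    """Fraction left for user data with 1 coding disk plus 1 coding sector.

    Assumes a stripe of n_disks * rows_per_stripe sectors (hypothetical layout
    parameters chosen for illustration).
    """
    total_sectors = n_disks * rows_per_stripe
    coding_sectors = rows_per_stripe + 1  # one full disk column plus one sector
    return Fraction(total_sectors - coding_sectors, total_sectors)


if __name__ == "__main__":
    n, r = 8, 16  # assumed example geometry
    raid6 = raid6_data_fraction(n)
    sd = disk_plus_sector_data_fraction(n, r)
    print(f"RAID 6 usable fraction:           {float(raid6):.4f}")
    print(f"Disk-plus-sector usable fraction: {float(sd):.4f}")
    print(f"Capacity reclaimed:               {float(sd - raid6):.4f}")
```

Under those assumptions the usable share rises from 75% to a bit under 87%, which is roughly the capacity of the second parity disk handed back to user data.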

The StorageMojo take: high update costs make this most attractive for active archives, not primary storage. The capacity savings could extend the economic life of current RAID strategies vs newer erasure codes.

Gecko: Contention-Oblivious Disk Arrays for Cloud Storage by Ji-Yong Shin and Hakim Weatherspoon of Cornell, Mahesh Balakrishnan of Microsoft Research and Tudor Marian of Google. The limited I/O performance of disks makes contention a persistent problem on shared systems. The authors propose a novel log-structured disk/SSD configuration and show that it virtually eliminates contention between writes, reads and garbage collection.

The StorageMojo take: SSDs help with contention, but they aren’t affordable for large-scale deployments. Gecko offers a way to leverage SSDs for log-structured block storage that significantly improves performance at a reasonable hardware cost.

Write Policies for Host-side Flash Caches by Leonardo Marmol, Raju Rangaswami and Ming Zhao of Florida International U., Swaminathan Sundararaman and Nisha Talagala of Fusion-io and Ricardo Koller of FIU and VMware. Write-through caching is safe but expensive. NAND’s non-volatile nature enables novel write-back cache strategies that preserve data integrity while improving performance. Thanks to large DRAM caches, read-only flash caches aren’t the performance booster they would have been even 5 years ago.
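For readers who want the write-through vs. write-back difference spelled out, here is a toy Python sketch. It is the generic textbook version of the two policies, not the specific policies the paper proposes, and every class and method name in it is made up for illustration. The backing store simply counts how many writes reach it.

```python
class BackingStore:
    """Stand-in for slow networked or disk storage; counts writes it receives."""

    def __init__(self):
        self.blocks = {}
        self.writes = 0

    def write(self, addr, data):
        self.blocks[addr] = data
        self.writes += 1


class WriteThroughCache:
    """Every write goes to the cache and straight through to the backing store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def write(self, addr, data):
        self.cache[addr] = data
        self.store.write(addr, data)


class WriteBackCache:
    """Writes land in the cache only; dirty blocks reach the store on flush."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()

    def write(self, addr, data):
        self.cache[addr] = data
        self.dirty.add(addr)

    def flush(self):
        for addr in sorted(self.dirty):
            self.store.write(addr, self.cache[addr])
        self.dirty.clear()


def run_workload(cache, n_writes=100, n_addrs=4):
    """Overwrite a small set of hot blocks many times."""
    for i in range(n_writes):
        cache.write(i % n_addrs, f"v{i}")


if __name__ == "__main__":
    wt_store, wb_store = BackingStore(), BackingStore()
    wt, wb = WriteThroughCache(wt_store), WriteBackCache(wb_store)
    run_workload(wt)
    run_workload(wb)
    wb.flush()
    print("write-through store writes:", wt_store.writes)
    print("write-back store writes:   ", wb_store.writes)
```

The toy write-back cache coalesces repeated overwrites of hot blocks, which is where the performance win comes from. The catch is that until flush() runs, the only copy of the new data is in the cache, which is why the cache being non-volatile matters so much.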

The StorageMojo take: Using flash only for reads means ignoring half – or more – of the I/O problem. This needs to be fixed, and this paper points the way.

Understanding the Robustness of SSDs under Power Fault by Mai Zheng and Feng Qin of Ohio State and Joseph Tucek and Mark Lillibridge of HP Labs. The authors tested 15 SSDs from 5 vendors by injecting power faults. 13 of the 15 lost data that should have been written and 2 of the 13 suffered massive corruption.

The StorageMojo take: We may be trusting SSDs more than they deserve. This research points out problems with still-immature SSD technology.

A Study of Linux File System Evolution by Lanyue Lu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau and Shan Lu of the University of Wisconsin. The authors analyzed 8 years of Linux file system patches – 5079 of them – and discovered, for instance, that

. . . semantic bugs, which require an understanding of file-system semantics to find or fix, are the dominant bug category (over 50% of all bugs). These types of bugs are vexing, as most of them are hard to detect via generic bug detection tools; more complex model checking or formal specification may be needed.

The StorageMojo take: Anyone building or maintaining a file system should read this paper to get a handle on how and why file systems fail. Tool builders will find some likely projects as well.

Courteous comments welcome, of course. There were some great posters and WIP presentations as well that I hope to write about soon.