Okay, we’ve figured out how to produce protected storage for $100 a terabyte. It has a wide fan-out, so the bandwidth is modest. It uses large SATA disks, so it isn’t great from an IOPS perspective either.

But it works.

What would it take to turn it into something the average enterprise could use? Would a scalable, high-performance, high-bandwidth, high-capacity intelligent cache that automatically moved cool data off to the low-cost backing store do the trick?

Several companies are betting it will.

The players
Gear6 has been around for a few years with a commodity-based clustered cache appliance that sits in front of existing filers.

F5 Networks also offers front-end “intelligent file virtualization” with its ARX device.

Now a couple of new players are going public:

Avere Systems
Avere Systems is announcing their FXT cluster, an appliance made of storage bricks that each include RAM, SSD or flash, and 15K RPM disks. The FXT cluster supports NFS and CIFS.

Tiering within the FXT cluster is automatic, based on access pattern, frequency and type of data. The data is tiered on the fly, with hot files striped across multiple FXT servers while cool data is pushed out the back end to filer storage.
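
To make that concrete, here’s a rough sketch of what a frequency-based promote/demote policy could look like. This is not Avere’s actual algorithm — the thresholds, the sliding window and the promote/demote logic are all assumptions for illustration.

```python
# Hypothetical sketch of frequency-based tiering -- not Avere's actual
# algorithm. Files with enough recent accesses get promoted to fast
# media; files that cool off get demoted to the backing filer.

import time
from collections import defaultdict

HOT_THRESHOLD = 10    # accesses within the window to count as hot (assumed)
WINDOW_SECS = 300     # sliding window for counting accesses (assumed)

class TieringPolicy:
    def __init__(self):
        self.accesses = defaultdict(list)  # path -> recent access timestamps
        self.fast_tier = set()             # paths cached on RAM/flash/15K disk
        self.backing = set()               # paths resident only on the filer

    def record_access(self, path):
        now = time.time()
        # Keep only accesses still inside the sliding window.
        recent = [t for t in self.accesses[path] if now - t <= WINDOW_SECS]
        recent.append(now)
        self.accesses[path] = recent
        self._retier(path, hot=len(recent) >= HOT_THRESHOLD)

    def _retier(self, path, hot):
        if hot:
            self.fast_tier.add(path)       # promote: serve from fast media
            self.backing.discard(path)
        elif path in self.fast_tier:
            self.fast_tier.discard(path)   # demote: push back to the filer
            self.backing.add(path)
```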

The FXT boxes support both GigE and 10GigE. In their testing, the Avere team and its beta sites have found that for every 50 I/Os to the FXT cluster there is one I/O to the back-end filers.
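
Taking that 50:1 claim at face value implies a 98% hit rate. Here’s the back-of-the-envelope math; the latency figures below are my own illustrative assumptions, not vendor numbers.

```python
# A 50:1 front-to-back I/O ratio means 49 of every 50 requests are
# served from the cache: a 98% hit rate.
hit_rate = 1 - 1 / 50                                  # 0.98

cache_ms = 0.5    # assumed RAM/flash-class service time
filer_ms = 10.0   # assumed big-SATA filer service time

effective_ms = hit_rate * cache_ms + (1 - hit_rate) * filer_ms
print(f"hit rate {hit_rate:.0%}, effective latency {effective_ms:.2f} ms")
# -> hit rate 98%, effective latency 0.69 ms
```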

Avere is announcing this week with two 2U rack-mount nodes. Performance is good: about 23K ops per second on a single node on the SPECsfs2008 benchmark, with bandwidth of 1 GB per second on reads and 325 MB per second on writes per node. They say they have achieved linear performance scaling to 25 nodes in their 1.0 release.
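
If scaling really is linear, the per-node figures multiply out straightforwardly. The per-node numbers below are from the announcement as I read it; the 25-node totals are just multiplication, assuming no scaling loss.

```python
nodes = 25
ops_per_node = 23_000      # SPECsfs2008 ops/sec per node (per the announcement)
read_gbs = 1.0             # GB/sec reads per node
write_mbs = 325            # MB/sec writes per node

print(f"{nodes * ops_per_node:,} ops/sec")              # 575,000 ops/sec
print(f"{nodes * read_gbs:.0f} GB/sec reads")           # 25 GB/sec reads
print(f"{nodes * write_mbs / 1000:.1f} GB/sec writes")  # 8.1 GB/sec writes
```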

StorSpeed
StorSpeed says it is delivering the world’s first application-aware caching solution. Like Avere, StorSpeed is offering a clustered front-end cache, but with extremely high performance: 1 million IOPS from a 3-node cluster and 10GigE wire-speed bandwidth.

They use deep packet inspection to understand and manage traffic and capacity tiering. Expect more data on their web site when they announce later this week.
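
For flavor, here’s a very loose sketch of what application-aware inspection of NFS traffic might involve: parsing the ONC RPC header of an NFSv3 call to see whether it’s a read, a write or a metadata op. This is generic RPC parsing per the standard (RFC 5531), not StorSpeed’s implementation, and it ignores TCP record fragmentation.

```python
# Minimal sketch: pull the procedure number out of an NFSv3 RPC call
# so reads, writes and metadata ops can be steered differently.
# Not StorSpeed's code -- just standard ONC RPC field offsets.

import struct

NFS3_PROCS = {1: "GETATTR", 3: "LOOKUP", 4: "ACCESS", 6: "READ", 7: "WRITE"}

def classify_nfs_call(payload: bytes) -> str:
    # RPC over TCP: 4-byte record mark, then xid, msg_type, RPC version,
    # program, program version, procedure -- all big-endian 32-bit words.
    _mark, _xid, msg_type, _rpcvers, prog, _vers, proc = \
        struct.unpack_from(">7I", payload)
    if msg_type != 0 or prog != 100003:    # 0 = CALL, 100003 = NFS
        return "not an NFS call"
    return NFS3_PROCS.get(proc, f"NFS proc {proc}")
```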

The StorageMojo take
These large-scale, high-performance caches are a logical extension of the disk controller model to network storage. Most data is rarely accessed but is too valuable to take offline; thus the rationale for tiered storage.

Where tiered storage fails in practice is in the intelligence required to put data in the right place: people just aren’t scalable enough to manage it. These caches bring extra intelligence to the problem of automated data movement without forcing a wholesale rip-and-replace of existing infrastructure.

Enterprises can save many millions of dollars by keeping the mass of cool-to-cold data on cheap storage while keeping the hot working set on a smart cache. This could be the dawn of a new tier of storage.
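
To put a rough number on it: suppose a petabyte of data with a 5% hot working set. The $100/TB figure is from the top of this post; the other prices and the capacity mix are assumptions purely for illustration.

```python
total_tb = 1000              # a petabyte of data (assumed)
hot_fraction = 0.05          # working set kept in the cache tier (assumed)

tier1_per_tb = 3000          # assumed conventional enterprise storage, $/TB
cheap_per_tb = 100           # protected commodity storage, per the intro
cache_per_tb = 10_000        # assumed RAM/flash cache tier, $/TB

all_tier1 = total_tb * tier1_per_tb
tiered = (total_tb * hot_fraction * cache_per_tb
          + total_tb * (1 - hot_fraction) * cheap_per_tb)

print(f"all tier-1:    ${all_tier1:,.0f}")   # $3,000,000
print(f"cache + cheap: ${tiered:,.0f}")      # $595,000
```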

Courteous comments welcome, of course. I did some work for Gear6 a couple of years ago but have no other business relationships with these firms. Rats!