“There is not a lot of added value in commodity ‘storage bricks’”
Commented one feisty StorageMojo.com reader last week. I didn’t agree, but I didn’t have a ready answer, either. But now I do: Rackable Systems.
You may have heard of Rackable for their innovations in packaging systems: efficient DC power; half-depth servers mounted back-to-back with a central “chimney” for cooling; remote management; and a rapidly growing storage business.
What you didn’t know
RACK is profitable. They’ve been growing like a weed, doubling in size in each of the last three years, and they are on track to do it again this year. They have sales of $1.9 million per employee, which is likely close to a Silicon Valley record. All this on gross margins in the low 20s, just a few points higher than Dell.
It isn’t all hardware either
One of the fastest growing parts of their business is storage and their Terrascale clusters. They claim:
The Terrascale architecture is free of any serialized function that would limit performance scalability. . . . Terrascale software uses a lightweight, linearly scalable, on-demand cache coherency algorithm that guarantees that servers access the correct representation of any data block at any point in time.
I’ve dug into their white paper – which is better than most – called The Terrascale™ Storage Cluster: A New Paradigm for Parallel I/O to Resilient Network Storage. It isn’t clear how they do all their magic from the paper. Nonetheless they are insistent in claiming that the system truly scales to hundreds of nodes. Here’s a precis of what I was able to glean.
Terrascale offers a:
- Global name space
- Global lock management service
- Local cache coherence mechanism
The global name space means all the servers in the cluster see all the same files. The lock management ensures that data is written only when safe. The local cache coherence means that all servers know immediately when data is written, thanks to a write-through cache.
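To make that last point concrete, here is a toy Python sketch of a write-through cache with invalidation. It is not Terrascale’s algorithm, which the white paper doesn’t spell out. The SharedStore and Server classes are made up for illustration, a single in-process lock stands in for the global lock service, and every peer is invalidated on each write, which is exactly the kind of simple approach that would not scale to hundreds of nodes.

```python
import threading


class SharedStore:
    """Hypothetical stand-in for the shared storage pool."""

    def __init__(self):
        self.blocks = {}               # authoritative copy of every block
        self.lock = threading.Lock()   # stand-in for a global lock service
        self.servers = []              # every server attached to the pool

    def register(self, server):
        self.servers.append(server)


class Server:
    """Hypothetical cluster node with its own local block cache."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.cache = {}
        store.register(self)

    def write(self, block_id, data):
        # Take the lock so that only one writer updates a block at a time.
        with self.store.lock:
            # Write-through: the shared copy is updated on every write.
            self.store.blocks[block_id] = data
            self.cache[block_id] = data
            # Invalidate stale copies held by the other servers.
            for peer in self.store.servers:
                if peer is not self:
                    peer.cache.pop(block_id, None)

    def read(self, block_id):
        # A miss (or an invalidated entry) is refilled from the shared copy.
        if block_id not in self.cache:
            self.cache[block_id] = self.store.blocks[block_id]
        return self.cache[block_id]


store = SharedStore()
node_a = Server("a", store)
node_b = Server("b", store)

node_a.write(7, b"v1")
first = node_b.read(7)     # node_b caches block 7
node_a.write(7, b"v2")     # node_b's cached copy is dropped
second = node_b.read(7)    # node_b refetches the new contents
```

The interesting engineering in a real cluster is in avoiding the broadcast loop inside write(), which is presumably where the “on-demand” part of Terrascale’s claim comes in.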
iSCSI to the rescue
Using open source software, Terrascale adds an iSCSI target kernel module that, in concert with a client iSCSI initiator, creates this highly parallel infrastructure. Above the Terrascale layers are standard Linux storage tools such as lvm, while below is the standard TCP/IP stack.
RAID upon RAID
The global name space means that all storage is part of a pool. RACK offers low-cost RAID 5 storage and then uses those arrays as virtual disks to create a second layer of RAID across them for greater speed and availability. They claim their storage scales to the limit of aggregate network or storage bandwidth. Need more of either? Buy more and plug it in.
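For readers who want to see what the two layers cost and buy, here is a small Python sketch of the RAID 5 parity arithmetic and the usable capacity of a layered setup. The function names and the example numbers (8 disks of 0.5 TB in each of 4 bricks) are my own assumptions. The source doesn’t say which RAID level the second layer uses, so the sketch covers both a plain stripe and a parity layer.

```python
from functools import reduce


def xor_parity(chunks):
    """Byte-wise XOR of equal-length chunks, the RAID 5 parity calculation."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild(surviving_chunks, parity):
    """Recover the one missing data chunk from the survivors and the parity."""
    return xor_parity(list(surviving_chunks) + [parity])


def raid5_usable(members, member_capacity):
    """RAID 5 gives up one member's worth of space to parity."""
    if members < 3:
        raise ValueError("RAID 5 needs at least three members")
    return (members - 1) * member_capacity


def two_layer_usable(disks_per_brick, disk_tb, bricks, outer_parity=False):
    """Usable TB when RAID 5 bricks are combined by a second RAID layer.

    outer_parity=False models a plain stripe across bricks (speed only);
    outer_parity=True models RAID 5 across bricks (survives a lost brick).
    """
    brick_tb = raid5_usable(disks_per_brick, disk_tb)
    if outer_parity:
        return raid5_usable(bricks, brick_tb)
    return bricks * brick_tb


data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor_parity(data)
recovered = rebuild([data[0], data[2]], parity)   # pretend chunk 1 was lost

striped_tb = two_layer_usable(8, 0.5, 4)                       # 14.0
protected_tb = two_layer_usable(8, 0.5, 4, outer_parity=True)  # 10.5
```

In this made-up configuration, 16 TB of raw disk yields 14 TB when the bricks are simply striped and 10.5 TB when the outer layer also carries parity.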
The StorageMojo.com take
RACK is currently focused on the high performance computing market where their pack ’em dense, stack ’em high and sell ’em cheap model is a hit. Yet it won’t be much longer before they have to look to commercial markets for high growth. And there they have an excellent opportunity to make waves for conservative storage vendors – unless EMC or IBM buys them first – with their low margins and aggressive price/performance. A company to watch and, if you’re in the market for more Mojo, a company to look at buying from.
Comments welcome, as always. Moderation turned on to keep the comment spam under control.
Robin,
The ‘added value’ comment was hardware-related i.e. stating the need for ‘well-designed’ standardized storage bricks, reflecting reduced cost, lower power and increased reliability.
I was hoping that you would ‘find’ Terrascale, and with some more research, you and your readers will find their technical background very interesting.
This is an excellent example of what can be done, in a short period of time, with brilliant technical strategy and knowledge, assisted by open source and with very little funding.
I think that their storage ‘brick’ will end up being an ‘improved’ version, to fit the overall packaging/power concept at Rackable.
As far as Rackable is concerned, this acquisition greatly reinforces their already successful sales model and the existing power & packaging related IP portfolio… which IMHO … can also be dramatically improved, with little effort.
With some additional work on the ‘transactional’ I/O front, they should also be able to deliver totally integrated hardware solutions to the DB segment, much as they do now in scientific clusters.
It may be interesting to consider how a stronger relationship with Red Hat could contribute to this.
Sure, this is a very good example of what you have been saying in some of the topics…. open source and COS included. It proves that you do not need to be EMC or IBM to do it … but don’t wake them up, just yet.
Hi Robin,
We’d written a paragraph on TerraScale last year on our Wiki and I thought I might share our interpretation of it. The storage brick would now be the Rackable storage server. But all else remains the same I think.
The big issue to keep in mind is that this solution requires a loadable kernel module on every client that uses the storage. This is probably OK for HPC applications but not really suited for the enterprise. Panasas, Lustre, and others are doing similar things and all require a custom client to achieve their scalability.
Best.
Cameron
—
What they do is sell a Storage Brick appliance that is an iSCSI target. They then give you a loadable kernel module for Linux that is an iSCSI initiator, so each host sees each Brick as a SCSI drive. You can then use standard LVM to make a RAID stripe across the Bricks. At that point you can use the RAIDed device for a database or format it for a filesystem.
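As a rough illustration of what striping across Bricks means for where data lands, here is a small Python sketch of round-robin stripe placement. The function name and parameters are hypothetical, and this is the generic textbook mapping rather than a statement of LVM’s or Terrascale’s exact on-disk layout.

```python
def locate(logical_block, num_bricks, stripe_blocks):
    """Map a logical block to (brick index, block offset on that brick).

    Assumes plain round-robin striping: stripe_blocks consecutive blocks
    go to one brick, then the next stripe unit goes to the next brick.
    """
    stripe_unit = logical_block // stripe_blocks   # which stripe unit overall
    brick = stripe_unit % num_bricks               # round-robin across bricks
    row = stripe_unit // num_bricks                # full passes over all bricks
    offset = row * stripe_blocks + logical_block % stripe_blocks
    return brick, offset


# Four bricks, 16 blocks per stripe unit: a sequential run spreads evenly.
placement = [locate(block, 4, 16) for block in (0, 16, 32, 48, 64, 70)]
```

Because consecutive stripe units land on different Bricks, a large sequential read or write keeps all of them busy at once, which is where the aggregate bandwidth comes from.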
To deliver a global file system, you can install a kernel module that provides their modified ext3 or XFS. They’ve tweaked it so that some cache invalidation occurs when other clients make modifications. Their goal was that you should be able to use standard tools for ext3 or XFS, so they don’t modify the on-disk layout of those filesystems.
I wouldn’t compare them with Isilon and Exanet – those are not HPC products in that they double hop data to the client. Single hop would be PolyServe, Ibrix, Redhat GFS, and Panasas. They argue that SAN filesystems don’t scale too well due to node cluster traffic (there is no intelligence in the SAN itself). Since they combine some intelligence into their storage Brick as well as the client, they have minimal cluster traffic and scale really well.
Well, at least I don’t see specific mention of Clustered Namespace technology (whew)…
Blog entry on Cluster Namespace (OnTap GX)
“Terrascale software uses a lightweight, linearly scalable, on-demand cache coherency algorithm”
Sounds like magic!
Guys,
Really appreciate the added technical background. As I said, their whitepaper leaves some details out.
It is the nature of disruptive technologies (Christensen’s term) or disruptive products (my term) that they don’t do what existing products do – they do something else, usually too trifling to be of concern to the big guys. With stepwise enhancement and a focus on what their customers want, they can gain a lead in just a couple of years that is hard to erase.
Maybe RACK is in that sweet spot.
Robin
Regarding the linear scale magic, Terrascale has 2 patents pending – 20050108300 and 20050108231. See:
http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PG01&s1=terrascale&OS=terrascale&RS=terrascale
http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=2&f=G&l=50&co1=AND&d=PG01&s1=terrascale&OS=terrascale&RS=terrascale