Already did the outside skinny
I talked to Sujal Patel, founder and CTO of Isilon, last week to learn more about Isilon. As I’d mentioned in my posts on the company, I was surprised that there wasn’t more technical information about the products on their website. On the other hand, I believe that may be a wise decision, since my experience is that if you talk too much about the “how”, prospects tend to forget about the “what”. And the “what” is what sells.
But I like the “how”!
So Sujal agreed to talk a bit about the how and to respond to some concerns about the Isilon architecture. I also got some business info. I’m supplementing Sujal’s comments with a bit of information from SEC filings and RBC Capital Markets research.
Isilon today has over 300 customers, 88 added just last quarter. They’ve been recruiting resellers heavily, which makes sense to me since with such an easy-to-manage product they don’t need the costly SE hand holding that most SANs require.
Rain city veterans
Sujal started Isilon in Seattle about the same time as YottaYotta, where I ran marketing for several years. 2000 was a tumultuous year: Gilderesque predictions of bandwidth nirvana; rampant easy money delirium; overheated visions of massive internet investment. That both are still around is a minor miracle. Sujal said that for Isilon, remarkably, the goals have stayed the same. The product was intended for media servers and other large-file, mostly sequential workloads. They kept at it and the business developed. They didn’t allow themselves to get distracted by the uproar of the dot-bomb crash, stayed focused, and the rest is history.
One measure of their success: 50% of their business is from repeat customers. A couple of very large customers – Kodak and Comcast – account for less than 20% of their sales, but as they bring on new customers their importance is declining.
Down is the new up
Their move downmarket with the new IQ200, a 3-node cluster listing for less than $40K, was, Sujal reports, driven by the channel focus. Not only does that not surprise me, but I’d guess there will be an under $20K IQ100 late this year. Three-node VAXclusters were the most popular in the 1980s due to the functional redundancy. Lose one, and you still have two-thirds of your processing power and some redundancy. I don’t think that dynamic has changed.
Isilon also has very good gross margins – over 50%. Not as good as the products I’ve marketed, but hey, they’re young. I love to see vendors making good margins while providing good value: they’ll be around. Buy with confidence.
Now for the technology
Isilon’s technology is more complicated than the Google File System, which is my favorite model of a bare bones, low-cost, scale-out cluster storage system. Isilon’s system is fully distributed, which is more elegant than Google’s master/server architecture, yet adds overhead. Isilon supports RAID5-like functionality, Google nothing but file replication.
With Isilon you can choose n+1, n+2, n+3, n+4 redundancy on a per-file basis at a cost of 20% increments of capacity usage. If a node fails, the system uses its Virtual Spares to maintain the requested redundancy.
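To make the capacity arithmetic concrete, here is a minimal sketch. It assumes, and this is my reading rather than anything Sujal spelled out, that each protection level adds 20% of the file's size to its on-disk footprint. The function name and the 100 GB example are hypothetical.

```python
def stored_footprint_gb(file_size_gb: float, protection_level: int) -> float:
    """On-disk footprint of one file, ASSUMING each level (n+1..n+4) adds 20% of its size."""
    if protection_level not in (1, 2, 3, 4):
        raise ValueError("protection level must be 1..4 (n+1 through n+4)")
    overhead = 0.20 * protection_level  # assumed: 20% per level, taken from the text at face value
    return file_size_gb * (1.0 + overhead)


# Compare the cost of each protection level for a hypothetical 100 GB file.
for level in range(1, 5):
    print(f"n+{level}: {stored_footprint_gb(100.0, level):.0f} GB on disk")
```

Because the choice is per file, a site could pay the higher overhead only for the files that warrant it.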
Linear, let’s get linear
A fully distributed system needs to do a lot of message passing to keep everyone coordinated. These messages are small, but their latency is a problem. That is why their no-cost adoption of Infiniband is smart: the under 100 ns latency is a real performance boost. They handle subnet management in their software, so customers don’t need to be Infiniband gurus to get it working.
Sujal says that messaging traffic grows linearly as you add nodes. There seem to be a couple of reasons for that. First, even in a large cluster there are typically no more than 16 nodes involved in any one I/O. Second, Isilon uses three metadata “authorities” for each file. So while the data may be spread from here to eternity, the metadata isn’t, which reduces the coordination traffic required to handle updates and cache invalidation.
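A back-of-the-envelope model shows why those two limits matter. This is a toy sketch, not Isilon's protocol: it assumes one message per participating data node plus one per metadata authority for each I/O, and that every node issues the same number of I/Os. The function names and the cost model are hypothetical; only the 16 and the 3 come from the conversation.

```python
MAX_NODES_PER_IO = 16      # from the text: typically no more than 16 nodes per I/O
METADATA_AUTHORITIES = 3   # from the text: three metadata authorities per file


def messages_per_io(cluster_size: int) -> int:
    """Assumed cost model: one message per data node touched plus one per authority."""
    data_nodes = min(cluster_size, MAX_NODES_PER_IO)
    authorities = min(cluster_size, METADATA_AUTHORITIES)
    return data_nodes + authorities


def cluster_messages(cluster_size: int, ios_per_node: int = 1) -> int:
    """Total messages when every node issues ios_per_node I/Os."""
    return cluster_size * ios_per_node * messages_per_io(cluster_size)


def all_to_all_messages(cluster_size: int) -> int:
    """Contrast case: every node must message every other node."""
    return cluster_size * (cluster_size - 1)


for n in (16, 32, 64, 96):
    print(n, cluster_messages(n), all_to_all_messages(n))
```

Once the cluster is bigger than 16 nodes, the per-I/O cost in this model is a constant, so total traffic grows in step with node count, while the all-to-all case grows with the square of it.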
Another strategy for reducing latency in a fully distributed system: NVRAM. Isilon uses battery-backed NVRAM cards to eliminate disk writing latency. Each node can safely acknowledge a write in microseconds instead of waiting milliseconds for a disk I/O to complete. Sweet.
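The idea is ordinary write-back buffering with a safe place to hold the data. A toy sketch, assuming nothing about Isilon's actual implementation: the class and method names are hypothetical, and a plain in-memory queue stands in for the battery-backed card.

```python
from collections import deque


class NvramWriteBuffer:
    """Toy write-back buffer: acknowledge once data is in (simulated) NVRAM, flush later."""

    def __init__(self):
        self.pending = deque()  # stands in for the battery-backed NVRAM log
        self.disk = {}          # stands in for the slow disk

    def write(self, block_id, data):
        # The caller gets its acknowledgement here, without waiting on the disk.
        self.pending.append((block_id, data))
        return "ack"

    def flush(self):
        # Drain the log to disk in arrival order; returns the number of blocks written.
        flushed = 0
        while self.pending:
            block_id, data = self.pending.popleft()
            self.disk[block_id] = data
            flushed += 1
        return flushed


buf = NvramWriteBuffer()
print(buf.write(1, b"frame-0001"))  # acknowledged before any disk I/O
print(buf.flush())                  # later, the data reaches the disk
```

The battery is what makes the early acknowledgement safe: if power fails between `write` and `flush`, the pending data is still there to replay.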
96 nodes to the tune of “96 Tears”
Isilon currently has a 96 node limitation, which I found interesting because VAXclusters also topped out at 96 nodes. Made me wonder if there was something mystical about the number 96. Sujal says no, it is a testing limitation. They started with 12 nodes and have gradually raised the number to 15, 32, 35, 88, and now 96 nodes. It takes a lot of space and energy to set up and test that many nodes, even if you figure out the financial implications – would you want to sell 96 nodes as used equipment?
The StorageMojo take
Isilon has cool technology and a lead in a new market segment. They’re focused, smart, funded and growing fast. Are they invulnerable? Not even close. But the big iron vendors need to think seriously about the meaning of disruptive technology.
Isilon might be an object lesson.
Comments welcome, as always. Moderation enabled because moderation is a virtue, except in the defense of liberty.