StorageMojo




Robin Harris    


Cleversafe’s dispersed storage network

I had a con call with Chris Gladwin and Russ Kennedy of Cleversafe a couple of weeks ago. They’ve come to market with a product line that seeks to deliver:

  • Massive scalability to meet growing digital content requirements
  • Unprecedented Security and Privacy for critical digital assets
  • Survivability against disasters, dishonesty and time
  • Extremely cost-effective infrastructure compared to traditional methods

That’s a quote from their pitch.

Cleversafe’s product line
Cleversafe, IIRC, started as a software company, but their announced products come in nice rack-mountable boxes. There are 3 of them:

  • CS Slicestor - Dispersed Storage server - $11.3k
  • CS Accesser - Dispersed Storage router - $12.3k
  • CS Manager - Dispersed Storage network manager - $12.3k

The Slicestor is a 1U storage server containing 4 disks. The Accesser slices up the data and distributes it - think slice router. The Manager works out of band to monitor and manage the storage network components.

I assume the pricing includes some room for volume discounts. There is an open-source version (c. 2006) of the software. The company intends to offer a software-only version as well.

Why hardware?
The Conventional Wisdom in VC circles is that tin-wrapped software ramps revenues faster - hey, you’re selling tin + bits - at the cost of lower margins and loss of focus.

Qualifying hardware is non-trivial; so you tend to stay on one platform longer than you should. At liquidity event time, software companies fetch higher multiples, so it may be a net loss. VCs live by the Golden Rule: he who has the gold makes the rules.

What it does
Cleversafe has an iSCSI or block storage interface. It takes the data, slices it into small pieces using Information Dispersal Algorithms and then ships the slices off to storage either locally or around the world.

In the latest version you can specify how many slices the system makes and how many slices are required to rebuild the data. If you have 11 data centers around the world, you can specify that, say, 6 are required to recreate the data.

You could lose access to 5 data centers and still recover. If the local controlling authority busts into 3 or 4 data centers, they get nothing. Pretty cool if you worry about corrupt government officials getting hold of your company secrets.
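The 6-of-11 arithmetic above is easy to check. Here is a minimal sketch; the function and field names are mine, not Cleversafe's, and two things are assumed: that each slice is 1/threshold the size of the original data (the usual property of information dispersal), and that fewer than the threshold number of slices reveals nothing, as the pitch claims.

```python
from math import comb


def dispersal_profile(total_slices: int, threshold: int) -> dict:
    """Summarize a threshold-of-total dispersal configuration (hypothetical helper)."""
    if not 1 <= threshold <= total_slices:
        raise ValueError("threshold must be between 1 and total_slices")
    return {
        # How many sites can vanish while the data stays recoverable
        "tolerated_losses": total_slices - threshold,
        # Largest haul of slices that still rebuilds nothing (assumed property)
        "max_useless_seizure": threshold - 1,
        # Raw capacity consumed per unit of usable data (assumes slice = 1/threshold)
        "raw_per_usable": total_slices / threshold,
        # Number of distinct site combinations that can rebuild the data
        "recovery_subsets": comb(total_slices, threshold),
    }


profile = dispersal_profile(total_slices=11, threshold=6)
for name, value in profile.items():
    print(f"{name}: {value}")
```

Under those assumptions, 11 sites with a threshold of 6 cost a bit less than 2x raw capacity, which is less than keeping two full copies, while surviving the loss of 5 sites.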

The company is planning on adding FTP, CIFS and NFS in the fullness of time.

How well it works
Cleversafe claims that given sufficient low-latency bandwidth the dispersed storage is as fast as a local disk. That’s a tall order, but for now I’ll take their word for it.

Who should buy it?
The company is aiming the Dispersed Storage Network at ISPs to offer as a service and multinationals with round the clock operations and critical data.

How it works
Cleversafe uses Cauchy Reed Solomon erasure codes to slice and dice the data. These codes have several advantages:

  • More capacity efficient and failure tolerant than parity codes
  • Doesn’t require a license
  • Code and decode are faster than other stack operations
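To get a feel for the "any k of n" property without installing anything, here is a toy sketch. To be clear about what it is not: it is not Cauchy Reed Solomon, not Cleversafe's code and not Dr. Plank's library. It is a plain Reed-Solomon-style code built on Lagrange interpolation over the integers modulo 257, one byte per slice, and all names in it are my own.

```python
P = 257  # smallest prime above 255, so any byte value is a valid field element


def interpolate_at(points, x):
    """Evaluate, at x, the unique polynomial of degree < len(points) through
    the given (x, y) points, with all arithmetic done modulo P (Lagrange form)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, n: int):
    """Turn k = len(data) bytes into n slices (x, y). The first k slices are
    the data itself; the remaining n - k are redundancy (values may be 256)."""
    k = len(data)
    if not 0 < k <= n < P:
        raise ValueError("need 0 < len(data) <= n < 257")
    points = [(x, byte) for x, byte in enumerate(data, start=1)]
    return [(x, interpolate_at(points, x)) for x in range(1, n + 1)]


def decode(slices, k: int) -> bytes:
    """Rebuild the k original bytes from any k distinct slices."""
    if len(slices) < k:
        raise ValueError("not enough slices to rebuild")
    points = slices[:k]
    return bytes(interpolate_at(points, x) for x in range(1, k + 1))


data = b"mojo!!"                 # k = 6 bytes
slices = encode(data, n=11)      # 11 slices, any 6 of which rebuild the data
survivors = slices[5:]           # pretend the first 5 sites are gone
assert decode(survivors, k=6) == data
```

A production code works on large blocks and uses Galois-field or bit-matrix arithmetic for speed. The toy only shows why losing 5 of 11 slices is survivable.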

If you’d like to play with Cauchy Reed Solomon, check out Dr. Jim Plank’s software page which includes

. . . Reed-Solomon coding, Cauchy Reed-Solomon coding, general bit-matrix coding, Reed-Solomon coding optimized for RAID-6, and Liberation coding. The documentation provides some tutorial material on matrix and bit-matrix based erasure coding.

I met the good doctor at FAST, where he was delighted to find that Cleversafe - also a FAST presenter - was using techniques he’d worked on a decade ago.

The StorageMojo take
I’m impressed with what Cleversafe has done. They will look even smarter after EMC’s Hulk/Maui announcement this spring. I suspect they’ll be bought by year’s end.

Kudos to the Cleversafe team.

Comments welcome, of course.

What’s with Isilon?

February 21st, 2008 by Robin Harris in Clusters, NAS, IP, iSCSI

They haven’t reported financials for almost 3 quarters. Their stock is trading at about 20% of its peak. They fired their CEO and put founder Sujal Patel in his place. And NetApp was trying to strangle baby Isilon (see NetApp filers for $1/GB?) in its crib.

Are they goners?

I don’t think so.
I’ve been trying to read the tea leaves on Peter van Oppen’s decision to join the board earlier this month.

Peter led the tape library company ADIC, also based in the Seattle area, for 12 years until its sale to Quantum. ADIC out-innovated Quantum - saddled with a cranky and slow DLT development group - in libraries and software as well.

If you think the folks who buy storage arrays are conservative, you haven’t sold any tape libraries. It is a tough market and ADIC did well.

So why would van Oppen join a sinking ship?
That’s why I don’t think Isilon is sinking. An external audit team is reviewing Isilon’s accounting to ensure that any financial dirty laundry - say, hypothetically, channel stuffing - gets cleaned up. They’ve been at it for months and must be about done.

The StorageMojo take
Based on the Isilon press release and pure speculation, here’s what I think is going down:

  • Peter exercised some due diligence before accepting the directorship and isn’t terribly worried about the basic health of the company
  • After he gets up to speed on company operations, he assumes the CEO role by July
  • Sujal happily goes back to one of the best jobs in any company: CTO and Founder while the stock climbs in value

However it goes down, getting Peter on board is a real plus. Storage experience is thin in Seattle. Isilon has lots of smart people, but the storage market has many unique wrinkles that networking or software folks take a long time to learn.

Comments welcome, as always. Disclosure: I met Sujal 7 years ago and I’ve done some work for Isilon. I hope they do well.

Isilon increases their IQ

January 28th, 2008 by Robin Harris in Clusters, NAS, IP, iSCSI

Despite being written off for dead . . .
Isilon’s been putting their IPO money to good use: engineering the next gen of their platform that they’ve named the X-series. In the meantime they’ve been adding customers - over 600 so far - and they have 60 customers running the new kit.

Moving from an aging single-core Xeon to a dual-core Xeon - the second core isn’t turned on yet - with faster busses and more cache speeds things up. They claim up to 60% faster performance, 20% less power and heat and 10 GigE readiness. Once they get their software dual-core aware they’ll have another nice boost to offer.

The StorageMojo take
Turning over the platform more rapidly than traditional array vendors do is a good strategy. It keeps the competition off-balance and gives you something new to tell customers. What good is commodity hardware if you don’t follow Moore’s law?

That said, Isilon’s scale out architecture is the real differentiator vs NetApp and other traditional filers. More bang for the buck just underscores the differences.

Comments welcome.

Sun counter-punches NetApp

October 29th, 2007 by Robin Harris in Enterprise, NAS, IP, iSCSI

Yippie-ki-yi-yay
As we say here in ranch country.

Sun sent out a press release on the NetApp fracas today. I didn’t have time to parse it, so here’s the raw intelligence:

Sun was legally obligated to respond in Texas to the initial suit brought on September 5, 2007 by Network Appliance to forestall competition from the free ZFS technology. Today we filed additional counterclaims in California, and specifically under the Lanham Act and California Business and Professions Code, based on Network Appliance’s false statements to the public about the alleged use of Network Appliance patents in ZFS. In parallel, we will be bringing a motion before the court in California asking that the case filed in Texas be consolidated with the case filed today for trial in the Bay Area, headquarters to both Sun and Network Appliance. Today’s filing includes counterclaims against the entirety of Network Appliance’s product line, including the entire NetApp Enterprise Fabric Attached Storage (FAS) products, V-series products using Data ONTAP software, and NearStore products, seeking both injunction and monetary damages.

Since Sun was forced to litigate, we feel California is a more appropriate venue to do so for several reasons. First, Sun and Network Appliance are both headquartered in Northern California, within 10 miles of each other. Second, most discovery will take place in California, as many of the key inventors on the patents and primary counsel for both parties are based in California. From both a judicial and economic standpoint, it makes much more sense for the case to be in California.

For more information about Sun’s counterclaims, visit our General Counsel’s latest blog posting: http://blogs.sun.com/dillon/entry/the_netapp_litigation_continued. You can view today’s filing on Sun’s website at www.sun.com/news, and you can check out our Open Source Community Support page at www.sun.com/lawsuit/zfs.

The StorageMojo take
Sun is pulling every string they can to win this in the court of public opinion. At least the public that buys storage.

What some commentators miss is that Sun only has to persuade a small percentage of NetApp buyers to reject or stall purchases to have a massive impact on NetApp’s share price. Here’s why.

Most tech companies - and I’ll assume this is true of NetApp - have back loaded quarters. A high percentage of the sales don’t come in until the last week of the quarter. By then, of course, all the expenses are fixed: components ordered; inventory built; 3 martini lunches expensed.

This means that the last few percent of sales make the quarter profitable - or not. If 90% of NetApp customers love them and continue to buy, but 10% decide they hate them and don’t buy, NetApp has a bad quarter.
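A back-of-the-envelope sketch makes the point. The numbers are hypothetical, chosen only for illustration, and have nothing to do with NetApp's actual financials; the function name is mine.

```python
def quarter_profit(planned_revenue, fixed_costs, pct_sales_lost):
    """Profit when costs are already committed and some sales never close."""
    actual_revenue = planned_revenue * (100 - pct_sales_lost) / 100
    return actual_revenue - fixed_costs


PLANNED = 1000   # hypothetical planned revenue, in $M
COSTS = 920      # hypothetical costs committed before quarter-end, in $M

for pct in (0, 5, 10):
    profit = quarter_profit(PLANNED, COSTS, pct)
    print(f"{pct:>2}% of sales lost -> profit {profit:+.0f}")
```

With an 8% planned margin, losing 5% of sales wipes out more than half the profit and losing 10% turns the quarter into a loss. Costs don't shrink with the shortfall because they were committed before the orders came in.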

Even a NetApp customer who loves them and hates Sun won’t increase their purchases to offset the lost sales. Why would they? The same kit will be cheaper next quarter, especially if the market stays soft.

Will Sun’s gambit work? Stay tuned.

Comments welcome, of course. Marketing is a contact sport.

Update: Since this suit is about ZFS, some of you may be interested in the article A look at MySQL on ZFS, which compares the performance and management of MySQL on UFS and ZFS.

StorageMojo NPI

October 29th, 2007 by Robin Harris in Clusters, Enterprise, NAS, IP, iSCSI

New Product Introduction
As part of my campaign to increase the world’s consumption of disk capacity - see yesterday’s post - I’ve developed a new capacity gobbling product. For lack of a better term I call it a video white paper.

The impetus? No one reads anymore. Especially white papers. We’d much rather watch videos.

Enter Gear6
When I looked around for a launch customer, Gear6 came to mind. Their marketing VP, Gary Orenstein, has one of the few marketing blogs, Thoughtput, with real content instead of “aren’t we wonderful” happy talk. He’s done a number of podcasts as well. He’s a new-media, large file size kind of guy.

Happily, he agreed to be the launch customer.

Here’s your part
Gear6 and Gary responded favorably to this new product and now I would like to hear from all of you. I am continuing to enhance the concept with the goal of bringing more value to everyone that views it.

What I’d like you to do is to watch the 4.5 minute video and tell me what you think. What works, what doesn’t work. What you’d like to see more of and what you’d like to see less of.

Meet Nisha Talagala, CTO

Nisha is not just really smart - smarter than the average Silicon Valley CTO - she is also a very nice person. I was impressed.

Yes, I was paid for this. And I’d like to be paid to do more of them! But only if they are worthwhile for you. So help me figure out how to make that happen.

Comments welcome - more than ever. My goal is to create something that is genuinely useful for information seekers in a 3-5 minute package. Tell me how well you think it works. How would *you* get more valuable content into 3-5 minutes in a way that people will watch?

Update: I tweaked the wording a bit. Same video.

NetApp filers for $1/GB?

October 22nd, 2007 by Robin Harris in Enterprise, NAS, IP, iSCSI, Price Lists

Get ‘em while they’re hot!
There is a rumor that NetApp, seeking to strangle baby Isilon in its crib, is giving away product to win deals.

At $1/GB I might buy one
If true, this could reflect continued weakness in NetApp’s results, as noted by analyst Tom Curlin at RBC Capital Markets in late July. They’d be plumping up the top line at the expense of the bottom line.

NetApp’s quarter closes Friday
If you are looking for a deal on a NetApp filer, this is the week to get one. Maybe if you call Isilon you can get an Isilon coffee cup overnighted to you to subtly make the point that you are looking at alternatives. At this late date though, just telling your NetApp rep that you are looking at Isilon and will delay the order for a week might get you the rumored break.

The StorageMojo take
Isilon is vulnerable right now. They’ve disappointed Wall Street for 3 quarters and that has hammered their stock. It is one thing to buy from a startup whose stock is trading at twice the IPO price and quite another to buy from one trading well below the offering price.

Taking advantage of a competitor’s weakness is smart business. And getting fabulous end-of-the-quarter deals is also smart business for storage buyers.

Update: Isilon’s VP of marketing, Brett Goodwin, wrote in to say:

Our core underlying business, technology, value prop—is unchanged. We also have $90M in cash and no debt. While we recognize that the stock price has taken a hit—it doesn’t reflect the market demand for clustered storage and Isilon’s leadership position in the category.

Fair enough. Disclosure: I have no financial relationship with Isilon. Darn!

Update I.V: Brett also said he has 125 nifty Isilon coffee cups in stock and ready to ship. Call for yours today!

Update II: RBC Capital Markets is also saying that EMC is having a tough quarter in the enterprise storage space. Flirting with Isilon will enhance your bargaining position with EMC come December.

Comments welcome, of course. Tell me about your NetApp deals, if any.

pNFS technical intro

October 15th, 2007 by Robin Harris in Architecture, Clusters, Future Tech, NAS, IP, iSCSI

I don’t normally link and run but this is a good article on the Next Big Thing in NFS v4.1.

Written by 3 NetApp engineers, Garth Goodson, Sai Susarla, and Rahul Iyer, Standardizing Storage Clusters offers a good overview of what’s new. It’s on the ACM Queue web site.

If paragraphs like

protocol operations

The pNFS protocol adds a total of six operations to NFSv4. Four of them support layouts (i.e., getting, returning, recalling, and committing changes to file metadata). The two other operations aid in the naming of data servers (i.e., translating a data server ID into an address and getting a list of data servers). All the new operations are designed to be independent of the type of layouts and data-server names used. This is key to pNFS’s ability to support diverse back-end storage architectures.

get you interested the article is well worth a read.
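For those who think better in code, here is a toy, in-memory sketch of the split that paragraph describes: a metadata server that hands out layouts and resolves device IDs, and data servers that the client then talks to directly. Every class and method name below is hypothetical. They loosely mirror the operations in the quote and are not the protocol's actual operation names or wire format.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    """What the metadata server hands out: where a file's stripes live."""
    length: int        # file size in bytes
    stripe_size: int   # bytes per stripe
    device_ids: list   # stripes rotate round-robin over these IDs


class DataServer:
    """Holds stripes; knows nothing about file names or directories."""
    def __init__(self):
        self.stripes = {}

    def write(self, path, index, chunk):
        self.stripes[(path, index)] = chunk

    def read(self, path, index):
        return self.stripes[(path, index)]


class MetadataServer:
    def __init__(self, devices):
        self.devices = devices   # device ID -> DataServer
        self.layouts = {}        # path -> Layout

    def device_list(self):
        return sorted(self.devices)

    def device_info(self, device_id):
        # A real server would return a network address; here, the object itself.
        return self.devices[device_id]

    def layout_get(self, path):
        return self.layouts[path]

    def layout_commit(self, path, layout):
        self.layouts[path] = layout


def stripe_plan(layout):
    """Yield (stripe index, device ID) pairs in file order."""
    count = -(-layout.length // layout.stripe_size)   # ceiling division
    for index in range(count):
        yield index, layout.device_ids[index % len(layout.device_ids)]


def write_file(mds, path, data, stripe_size=4):
    layout = Layout(len(data), stripe_size, mds.device_list())
    for index, device_id in stripe_plan(layout):
        start = index * stripe_size
        chunk = data[start:start + stripe_size]
        mds.device_info(device_id).write(path, index, chunk)
    mds.layout_commit(path, layout)   # publish metadata once the data is placed


def read_file(mds, path):
    layout = mds.layout_get(path)     # one trip to the metadata server
    return b"".join(mds.device_info(device_id).read(path, index)
                    for index, device_id in stripe_plan(layout))


mds = MetadataServer({1: DataServer(), 2: DataServer(), 3: DataServer()})
write_file(mds, "/video/whitepaper.mov", b"parallel NFS demo")
assert read_file(mds, "/video/whitepaper.mov") == b"parallel NFS demo"
```

The reads here run one after another. The point of the real thing is that a client can issue them to all the data servers at once, with the metadata server out of the data path.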

The StorageMojo take
pNFS is going to commoditize parallel data access. In 5 years we won’t know how we got along without it.

EMC’s coming strategic shift

September 10th, 2007 by Robin Harris in Clusters, Enterprise, NAS, IP, iSCSI

I’m always curious about the context of the communications as well as the content. The Bush administration, for example, has been very disciplined in releasing bad news late Friday in the reasonable expectation that most people won’t ever hear about it.

Why some enterprising editor doesn’t have a Monday morning front-page box: “What the White House doesn’t want you to know” is beyond me.

So I get an EMC press release on Friday . . .
Nothing nefarious, since the release actually went out Thursday and didn’t get reported until Friday. The two major bits are:

  • “. . . former Dell and Bain executive Louise O’Brien has joined the company as Executive Vice President, Corporate Strategy and Development. O’Brien will report to Joe Tucci, EMC Chairman, President and CEO, and be responsible for overseeing EMC’s corporate strategy, mergers and acquisitions, Office of the Chief Technology Officer (CTO), and New Ventures Group.”
  • “. . . EMC promoted three of its senior executives to President. Mark Lewis, 45, has been named President of EMC’s Content Management and Archiving (CMA) business, after having served most recently as Chief Development Officer (CDO); David Donatelli, 42, has been named President, EMC Storage Division; and Howard Elias, 50, has been named President, EMC Global Services.”

Let’s see, the CTO reports to a sales, marketing and strategy person
Him-m-m? Nothing new for EMC, whose CTO has typically been tasked with making a dog’s breakfast product portfolio look good to customers. EMC has always been a sales company where technology is secondary. Just an observation folks.

No, this is the interesting part
Putting Lewis in charge of Content Management and Archiving. He’s been an internal advocate for storage clusters and grids, which horrifies the Symm folks, but there is no doubt that clusters are coming and EMC has to do something.

So how to thread the needle, i.e. keep up the high-margin Symm sales while slowly introducing scalable storage clusters without inducing a mass migration? Simple. Sell storage clusters as the place where your data goes to die.

Massive, cheap - compared to Symms, but the gross margin remains sacred - capacity with some nifty lock-ins to keep customers coming back for more. Archive meta-data?

ILM rises from the dead
ESG, always a reliable indicator of and cheerleader for EMC thinking, is pushing the model of dynamic data and persistent data. There is a lot more persistent data so you’ll need a lot more capacity but without the performance requirements of database apps.

Enter the grid
EMC has been seeding money among innovative startups for years with a special emphasis on network-based storage. Now it is harvest time. I expect they’ll be buying, under Ms. O’Brien’s watchful eye, some of the grid/cluster/network companies they’ve invested in.

Persistent storage is actually much harder than dynamic storage because it is anti-entropic. And if you can get a customer to buy enough you will have an annuity business for many years.

The StorageMojo take
Yet the competition isn’t standing still. The movement of companies into cluster-based storage isn’t over by a long shot. The line between persistent and dynamic data is drawn by Moore’s Law and the system architects, not by the data itself.

Oracle’s adoption of direct NFS and the coming pNFS standard both point to a world where massive capacity clusters are also capable of massive IOPS with low latency. And archival storage based on open standards will have an intuitive appeal that even EMC’s high-commission sales force won’t have much luck fending off.

Nonetheless, expect EMC to introduce their first cluster product by year-end ‘08. They’re hoping to get it out before June but I don’t think they’ll make it. This stuff is always harder than it looks.

Also, I tap Ms. O’Brien as the next CEO of EMC. She may look like a dark horse, but those Bain alums are smooth operators.

Comments welcome, of course. Also, I am formally abandoning my promised Part III of EMC has Ph.D’s Pt I and Part II. Every time I looked at the list of EMC patents I’d have terminal brain cramp. Maybe an ardent EMC’er will do part III instead.

And it’s a bit pointless anyway. EMC buys its innovation if I’ve heard them correctly.

F5 acquires Acopia

August 6th, 2007 by Robin Harris in Enterprise, NAS, IP, iSCSI

Seattle-based F5 networks announced today their plan to acquire Massachusetts-based Acopia Networks. I think it is an interesting tie-up.

Is storage just another network service?
A trick question. Of course storage is, and of course, it isn’t.

Networks specialize in the transient. Storage specializes in the persistent. To the extent that networks enable storage, storage is a network service. To the extent that storage persists it is its own unique domain.

And please, don’t get started on the “network as giant delay line” argument.

Acopia sells a file server that front ends other NAS boxes
By virtualizing across multiple file servers, they are able to automate processes, such as tiering, that enable higher utilization of capacity. Which fits with F5’s focus on optimizing Ethernet networks for application delivery. How many applications rely on storage?

The financials
Acopia got over $80 million in funding, and F5 is paying $210 million in cash. It isn’t the 10-bagger of VC dreams, but 2.6x means everyone gets to play another day.

The StorageMojo take
A bunch of folks tried to build file virtualization boxes and most of them failed. F5 will get a nice boost by starting to tap the 45% of data center hardware spend that is storage.

The problem is that virtualizing NAS servers gets a lot less interesting in an NFS 4.1 world. Clustered NAS with scalable parallel data access eliminates many of the problems that Acopia set out to solve.

Widespread adoption of 4.1 is a couple of years off. In the meantime I expect F5 to do quite well with their new acquisition.

Comments welcome, as always.

IBM Fellow Steve Hetzler

April 4th, 2007 by Robin Harris in Clusters, Future Tech, NAS, IP, iSCSI

Venturing forth from the Shire
I went back to Silicon Valley last week to see if I could still deal with more than three people in a room. Living in a remote mountain valley and working at home, I don’t get out much. Visited Cisco and some clients. A good trip.

The best surprise was the charming and efficient Sara Delekta Galligan emailing me to say that Steve Hetzler had some free time and would I care to meet with him? Of course! I’d come across some of Steve’s iconoclastic thinking on the web and emailed him to see if we could talk.

Steve works in the gorgeous IBM Almaden research facility up in the hills south of San Jose. And his office has a way nicer view than yours.

Steve Hetzler, storage rock star
Steve’s write up on the IBM site:

Steven R. Hetzler is an IBM Fellow at IBM’s Almaden Research Center (San Jose, Calif.), where he manages the Storage Architecture Research group.

He is currently focusing on new architectures for creating highly fault tolerant storage systems, iSCSI data storage systems and markets and applications for nonvolatile memories. iSCSI is a protocol for managing storage over IP networks that he initiated within IBM Research and also named. His group wrote the first draft specification, developed the first working iSCSI demonstrations, including the first direct network-attached DVD movie multiplex, and was active in helping develop iSCSI into an industry standard.

A prolific inventor, Hetzler has been issued 35 patents for inventions in a wide range of topics — including data storage systems and architecture, optics, error correction coding and power management. His most notable patents include split-data field recording (issued in 1993) and the No-ID(TM) headerless sector format (issued in 1995), which have been used by nearly all magnetic hard-disk-drive manufacturers for a number of years. He also pioneered the first adaptive power technology for disk drives, which is also widely used in disk drives for mobile computers.

Hetzler has received numerous IBM awards for his work, including three Corporate Awards, and a Corporate Environmental Affairs Excellence Award. He is a member of the American Physical Society, a senior member of the Institute of Electrical and Electronics Engineers and a member of the IBM Academy of Technology.

A native of Red Wing, Minnesota, Hetzler was educated at Carleton College (Northfield, Minn.), where he received a Bachelor of Arts in Physics in 1980, and California Institute of Technology (Pasadena, Calif.), where he received his Masters and Ph.D. degrees in Applied Physics in 1982 and 1986, respectively. He joined IBM Research in November 1985 and was named an IBM Fellow in 1998.

He’s an enthusiastic guy who likes thinking about storage
My note taking skills are pretty bad, so I can’t even begin to transcribe an interview. And it wasn’t really like an interview anyway. Steve just started rolling on storage issues and I tried to hang on.

Here are a few of the topics Steve broached. He thinks and talks fast, and I listen and type slow, so please consider these my impressions rather than quotes. Also, I’ve organized my random notes into topics and inserted what I thought fit. Bottom line: if something sounds stupid it is my fault.

About the industry:

  • People frightened that Moore’s law won’t continue
  • Storage companies tend to ship not what customers want but what the company can deliver
  • Storage requirements:
    • Cheap
    • Reliable
    • Simple

That sounded about right to me.

IP-based storage

  • IP-based storage is about sharing the closet rather than sharing the clothes - the file sharing capability is secondary to having a big convenient place to put stuff
  • Is popular because it is on the IP network, i.e. cheap, reliable, simple

Steve used to work on disks and had some thoughts on the AFR controversy

  • Disk AFR measurements are flawed
  • Accelerated test methodologies are focused on a specific failure mechanism - temperature - which, as Google found, isn’t a critical issue
  • Weibull plots are simply descriptive - no underlying mechanism or ratio is assumed - so they have limited value as a tool for understanding disk behavior

On storage systems

  • Design challenge: highly unreliable components making a highly reliable system
  • RAID 5 & 6 don’t scale well to petabyte systems. One reason: rebuilds are, in effect, Denial-of-Service attacks: the rebuild typically cuts I/O performance by almost half for the affected RAID
  • Disk capacity up 4 orders of magnitude while hard error rate hasn’t changed
  • Logical incremental improvements of the RAID concept have gotten us to a point where we wouldn’t choose to be if we were designing today from a clean sheet of paper
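The capacity-versus-error-rate point can be put in numbers. This is my own back-of-the-envelope sketch, not anything Steve showed me. It assumes a hard error rate of 1 unrecoverable read error per 10^14 bits, a commonly quoted spec-sheet figure used here purely as an assumption. It also assumes independent bit errors and a RAID 5 rebuild that must read every bit of the 7 surviving disks in an 8-disk group.

```python
import math


def p_rebuild_hits_ure(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Probability of at least one unrecoverable read error while reading
    every bit of every surviving disk, assuming independent bit errors."""
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^bits_read, computed in a numerically stable way for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# Same error rate, capacities four orders of magnitude apart (hypothetical sizes)
old = p_rebuild_hits_ure(surviving_disks=7, disk_bytes=100e6)   # 100 MB drives
new = p_rebuild_hits_ure(surviving_disks=7, disk_bytes=1e12)    # 1 TB drives
print(f"100 MB drives: {old:.4%}   1 TB drives: {new:.1%}")
```

Under these assumptions the chance of a rebuild tripping over a hard error goes from a rounding error to somewhere north of 40%. The error rate didn't change; the number of bits you have to read did.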

But wait! There’s more!
Steve graciously slowed down long enough to try to educate me on some of the finer points of NAND flash problems as a solid state disk and some issues with 2.5″ drives. I didn’t get most of it but I’ve got a big yellow note on my monitor to “figure it out!” so maybe, one of these days, I will.

After 90 minutes of geek speak the patient Ms. Galligan suggested we might want to wrap it up. Good thing too: I made my flight with only five minutes to spare.

What are Steve and his team working on today?
Steve wouldn’t talk about his current project other than to say it is something cool and maybe, just maybe, he might have something to demo later this year. I’m pretty sure this will be a prototype rather than a product, so don’t start saving your pennies to buy one just yet.

Steve wouldn’t say what he is working on, but that doesn’t stop me from guessing. My SWAG: a cheap (by array standards), highly reliable, block-based storage system, running over IP and managed by a clustered software layer that protects data at the block and file level, rather than using RAID. I can hardly wait to see what it really is.

The StorageMojo take
Data storage and information management are, IMHO, the biggest problems in computer science today. The exponential growth in stored data creates huge challenges and opportunities for scale-out storage architectures. It is good to see smart folks like Steve working on these hard problems.

And kudos to IBM for supporting this kind of basic research. I think it beats the heck out of “checkbook innovation” as widely practiced in the industry. But that’s another post.

Comments welcome, of course. Steve, Sara, hope I didn’t get anything too wrong. If I did I’m happy to go back and update.

Update: Sara sent me a couple of modest suggestions, which I’ve updated, as well as a more current bio for Steve.

Files, Objects and Blocks, Oh My!

March 16th, 2007 by Robin Harris in Future Tech, NAS, IP, iSCSI

Virtualization is the answer. Now, what was the question?
The drumbeat for virtualization as the answer for the storage world’s ills continues unabated. Yet I wonder if we are virtualizing the right things and, if we are, doing it in the right way.

I got into the computer business in 1981, when virtual memory superminicomputers were still the coming thing. Folks had figured out that, due to the cost of memory *and* the cost of dealing with fixed memory capacities, memory was the right thing to virtualize.

Yet the early implementations were clunky and prone to non-productive behaviors like thrashing. It took a while to engineer a virtual memory system that provided a good illusion of physical memory. How many people even know they are using virtual memory today?

The Turing test for virtualization
You can’t tell whether it is virtual or real.

As scary as lions, tigers and bears?
Maybe blocks should be.

Grand virtualization architecture visions
The dotcom boom saw at least a couple of dozen storage virtualization startups funded, at least for a while. A surprising number still survive. Major storage companies launched storage virtualization programs.

  • Virtualization in HBAs
  • Virtualization in switches
  • Virtualization in appliances

And more.

Blocks are the problem. What is the answer?
My thought: maybe OSD has the right idea. Maybe by virtualizing a really basic and largely irrelevant resource - blocks - we can advance virtualization without a costly rejiggering of everything else in storage.

I pinged a very smart and highly experienced engineer I know to ask him what he thought about OSD. A member of the T10 committee on OSD, he asked that I not give his name. He considers his comments a SWAG, and he is not professionally given to making unresearched statements.

His response had a couple of threads
The first is what OSD could do.

One of the most interesting things about OSD was that you could provide a certain amount of information about an object in the SCSI command set defined for it. This could have been the basis for some security, access management, and information life cycle management solutions.

My take away is that OSD could offer new infrastructure for managing data, were we to adopt it.

His second point is where OSD fits.

I have always believed that people cared about files, and that blocks and objects were just interesting ways to construct files. Data base and transaction processing applications may be a significant exception to that because they attempt to optimize their behavior at a much lower level, though that may be a temporary expedient due to present performance limitations.

Just as people once programmed in ones and zeros before moving to assembler, perhaps blocks are the ones and zeros of the age of massive storage. We have to stop thinking about them to achieve useful virtualization, and let the machines handle blocks so we don’t have to.

Comments welcome - we got some good ones on the first OSD post - thank you all. Moderation is a virtue. Have a good weekend.

Objects To Desire

March 15th, 2007 by Robin Harris in Enterprise, Future Tech, NAS, IP, iSCSI

The Object of My Affection

Why do we manage blocks? That construct is getting old.

You might say we manage blocks because disks have blocks and we build storage out of disks. But what if disks didn’t have blocks? No more block management. We’d simply manage . . . . OK, what would we manage?

Enter the object
Bruce Lee could make an entrance. Objects, not so much.

Like files, objects contain data. But they lack several things that would make them files. They don’t have:

  • Hierarchy. Not only are all objects created equal, they all remain at the same level. So you can’t put one object inside another.
  • Names. At least, not human-type names like Pamela Anderson, Claudia Schiffer, 2006 Taxes or Brad Pitt.
  • User access control. Objects just lie there like a dollar on the street, waiting to be picked up. Objects don’t know who they belong to.

The missing synch
A file system’s user-facing component provides those missing elements. You decide which files belong in which folders. You give the files names. You decide which users have access to which files and what those users can do with those files.

So objects all look alike. Some are bigger and some are smaller, but until we get them dressed and named, they aren’t files. Yet they are a lot closer to files than blocks are. Which means that if you choose to manage objects you no longer have to worry about blocks.

Is this going anywhere?
Patience, padawan. Enter Object-based Storage Devices, or OSD. OSD is a standards effort defining how storage devices, like disks, disk arrays or tape libraries, could present a standard SCSI-based interface to serve objects to the user-facing file system component. This moves the processing required for figuring out which sectors contain a file’s data from the server to the storage device.

Here’s a diagram of the change from block-based to object-based storage:

OSD Model Diagram
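
For readers who think better in code than in diagrams, here is a toy Python sketch of that split. It is not the T10 OSD command set: the class names (`ToyObjectDevice`, `ToyFileSystem`), the methods and the 4-byte block size are all made up for illustration. The point is only that block allocation lives inside the device, while names and ownership live in the file system layer above it.

```python
import itertools

BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the mapping is visible


class ToyObjectDevice:
    """Hypothetical stand-in for an object storage device (not the T10 OSD API)."""

    def __init__(self, num_blocks):
        self._blocks = [b""] * num_blocks      # the raw "sectors"
        self._free = list(range(num_blocks))   # block allocation lives on the device
        self._extents = {}                     # object id -> list of block numbers
        self._attrs = {}                       # object id -> attribute dict
        self._next_id = itertools.count(1)

    def create(self, data, **attrs):
        """Store data as a new object and return its flat, numeric id."""
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        if len(chunks) > len(self._free):
            raise IOError("device full")
        oid = next(self._next_id)
        blocks = [self._free.pop(0) for _ in chunks]
        for block_no, chunk in zip(blocks, chunks):
            self._blocks[block_no] = chunk
        self._extents[oid] = blocks
        self._attrs[oid] = dict(attrs, length=len(data))
        return oid

    def read(self, oid):
        """Return the object's bytes; the caller never sees block numbers."""
        return b"".join(self._blocks[b] for b in self._extents[oid])

    def get_attrs(self, oid):
        """Return a copy of the attributes stored alongside the object."""
        return dict(self._attrs[oid])


class ToyFileSystem:
    """User-facing layer: supplies the names and ownership that objects lack."""

    def __init__(self, device):
        self._device = device
        self._names = {}   # path -> (object id, owner)

    def write_file(self, path, owner, data):
        self._names[path] = (self._device.create(data), owner)

    def read_file(self, path, user):
        oid, owner = self._names[path]
        if user != owner:
            raise PermissionError(path)
        return self._device.read(oid)


fs = ToyFileSystem(ToyObjectDevice(num_blocks=16))
fs.write_file("/home/robin/2006 Taxes", "robin", b"ouch, that hurt")
print(fs.read_file("/home/robin/2006 Taxes", "robin"))
```

Notice that `ToyFileSystem` never touches a block number. That is the whole idea: swap in a device with a different allocation scheme and the layer above doesn't care.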

The StorageMojo take
OSD is cool, if it ever comes to market. Moving processing off the main CPU frees up cycles for mission critical applications, like improving the frame rate in Quake.

Today’s disk drives have more computes, RAM and I/O bandwidth than $300,000 minicomputers did 25 years ago. Why let all that capability go to waste, especially as data volumes explode?

So the device will manage the objects and the server/workstation will manage the file user interface. We are still moving away from the model of the original disk drives where the CPU directly managed the head movement. OSD will help create more scalable and easily managed systems. This is another important step towards building the massive scale-out storage systems of tomorrow.

Comments welcome, as always. I think OSD is cool, but maybe you don’t. I’d like to know why.

HP’s Bold Move Into Storage Clusters

March 12th, 2007 by Robin Harris in Clusters, Enterprise, NAS, IP, iSCSI

A shot across the bow
HP’s acquisition of Polyserve is a ~$250 million (my SWAG, we’ll have to see what, if anything, gets reported on the 10K) bet on the future of storage. And I think it is a good one.

HP needed to do something. Their external storage systems business has been flat while EMC has been taking share. They still have a strong number two position, but that is not the way to win Mark Hurd’s love.

Polyserve: commercial storage cluster
I haven’t devoted much time to Polyserve, but at a high level I like what they are doing. Polyserve takes the NAS storage cluster concept right into the heart of commercial applications. They support Oracle databases (learn more at Kevin Closson’s excellent blog), SQL Server and DB2. This is where the big iron storage boxes live, as well as their less-costly mid-range cousins.

A good overview of the buy and HP’s reasons is at Red Herring. Yet the importance of the acquisition goes well beyond HP.

Cluster storage land-grab
The real impact is on every other major storage player. As I’ve noted, storage clusters aren’t coming, they’re here. HP’s move just underscores that fact.

The wide-awake storage players are now putting out feelers to buy or partner with every decent storage cluster technology play. If you have a storage cluster startup and don’t get a phone call from someone in the next month, maybe your stuff isn’t all that cool. Or your genius isn’t appreciated by a clueless world.

EMC already has investments in a number of next-gen startups - such as my former company YottaYotta - which they’ll need to take to the next level by either licensing or acquiring. Since EMC’s storage business has been growing at a healthy clip, they may not feel the need to act fast, leaving it to competitors to claim the high ground.

IBM always has at least a dozen options on a low-boil, but deciding to pull the trigger on one and the plug on another just isn’t their thing. With their blade server leadership it would seem a natural direction, but they punted on storage arrays too.

Sun is well-positioned to use x4500’s with Google-style clustering to create low-cost, high data integrity storage clusters. If the Solaris group takes it on, it could be good. If they don’t, well, don’t hold your breath.

The StorageMojo take
Bravo to HP for buying Polyserve. If they roll out the products and support in a timely manner they will steal a march on everyone else in the industry. Their biggest problem will be selling Polyserve to their sales force, a skill HP has yet to master. Lower ASPs always give sales people the willies so it is vital that they feel like they are taken care of. Like Sun’s mis-marketed x4500 the Polyserve products could find themselves in a completely undeserved oblivion if HP’s storage group just tosses them over the wall to sales.

Comments welcome, as always. I’m traveling until Thursday so I may not moderate as fast as I normally do, but moderate I shall!

NetApp Weighs In On Disks

February 26th, 2007 by Robin Harris in Enterprise, NAS, IP, iSCSI

As promised in the Open Letter to Seagate, Hitachi, EMC, IBM, NetApp, HP and Sun, StorageMojo is giving this space to NetApp to respond to Everything You Know About Disks Is Wrong and Google’s Disk Failure Experience.

Props to NetApp for being quick off the mark in responding [gee, maybe there is a reason they are the fastest growing large storage vendor] and to NetApp’s Director, Technical Strategy, Val Bercovici, who worked over the weekend to craft a data-rich response. Val’s response is unedited by moi. IMHO, Val has done a good job discussing the issues while keeping self-congratulatory chest beating under control. To improve readability I’ve added some bolded sub-heads in [brackets] which are mine alone.

And all you other guys, getting your lunch eaten by NetApp, the invitation still stands. The NetApp response begins here:

[NetApp feels a bit like Al Gore now]
It’s probably fitting that I’m writing this during the Oscar Awards weekend. Much like the Oscar broadcast itself, this is a long blog post and many overnight sensations will be discussed following the Oscar broadcast. In fact the suddenly hot topic of disk failures and resulting impact on data availability & resiliency might seem like yet another overnight sensation, courtesy of mainstream media coverage such as the “beeb”. However, most professionals in the far less glamorous storage industry know that like all overnight sensations, this one too has actually been many years in the making. Stretching the Oscar theme a bit more (regardless of political affiliation) many of us at NetApp also feel a little bit like Al Gore now. Let me explain …

It may surprise many of those reading StorageMojo (perhaps even the admin himself :-) but NetApp is actually *thrilled* at the attention this whole topic is now gaining, and much like Al Gore we feel somewhat vindicated since we’ve been banging this drum for a while. I’ll be addressing all of Robin’s provocative points regarding the credibility of the storage industry (specifically drive & array manufacturers) below, but a little bit of NetApp history in this regard will add important context to my response.

[Disrupting ourselves by cannibalizing FC disk storage sales]
Back around the year 2000, NetApp’s thought leaders observed that the gap in density between consumer-oriented drives (then known mostly as IDE, today as xATA) and enterprise drives (SCSI & FC) was becoming too big to ignore. It was clear to us that Clayton Christensen’s “good enough” principle from his seminal disruptive technology work would clearly apply in this case. So our choice came down to either disrupting ourselves by cannibalizing some of our FC disk storage sales with lower revenue per-capita ATA drives, or watch somebody else do it to us. I’m glad we phrased it that way internally since it became an easy choice in hindsight.

That decision prompted NetApp to release in 2001 the first enterprise-class storage array based on ATA technology at new price points previously unavailable to the online storage market. The NetApp NearStore thus created the “Nearline storage” market segment. Little did we know that would thrust us into a virtuous circle where we also learned some hard lessons. The innovation we applied to overcoming those lessons learned has directly contributed to our dominance of the Nearline storage market, as well as ultimate industry capacity leadership in the overall “networked storage” category tracked by IDC. Yes that means today we ship more array-based FC & ATA disk capacity than EMC, HP, Sun and our OEM partner IBM as listed here in StorageMojo’s open letter. That key statistic helps add unmatched credibility to our responses surrounding this issue and the specific points raised below.

[FAST forward]
FAST (pardon the pun) forward to 2007 and the Google & CMU studies, resulting IT media / blogosphere coverage, plus resulting StorageMojo open letter. Let’s review the key points raised:

1. Failure rates are several times higher than reported by drive companies

Most mature storage array vendors already know this and devote serious engineering, disk qualification / testing and field support resources to mitigating the resulting customer risk. Conversely, most experienced storage array customers have learned to equate the accuracy of quoted drive failure specs to the MPG estimates reported by car manufacturers. A classic case of YMMV and often will if you deploy these disks in anything but the mildest of eval / demo lab environments.

2. Actual MTBFs (or AFRs) of “enterprise” and “consumer” drives are pretty much the same

This tidbit known mostly to industry insiders is largely true, especially when comparing comparable drive sizes. But how storage arrays handle the respective drive type failures is what continues to perpetuate the customer perception that more expensive drives should be more reliable. One of the storage industry’s dirty secrets is that most enterprise and consumer drives are made up of largely the same components. However, their external interfaces (FC, SCSI, SAS or SATA) and most importantly their respective firmware design priorities / resulting goals play a huge role in determining enterprise vs. consumer drive behavior in the real world.

[Firmware more reliable than people - good]
Considering the awe-inspiring areal density of the platters themselves, combined with drive firmware size and complexity rivaling entire operating systems of a few years ago, NetApp’s storage subsystem team considers contemporary disk drives “miracles of modern engineering”. In fact, that resulting firmware size and complexity is beginning to resemble the anthropological and demographic behaviors of human beings themselves!

To wit, consumer-class drives’ personality is determined by firmware that assumes the drive is isolated inside a laptop or desktop and cannot rely upon parity information stored on adjacent (peer) drives to recover from a partial or full error condition. Consequently, consumer drives exhibit “Type A” personalities as they heroically go offline for non-deterministic periods of time (a few seconds or many minutes), to “take charge” of the situation and perform various pre-programmed techniques attempting to resolve bad blocks, media / checksum errors, etc… Unpredictable and non-deterministic timeouts during these occurrences inside storage arrays can present some challenging circumstances to array designers – yet as one can easily see the end result will not always be a “failed” drive. Safe & efficient approaches to handling this situation without disruption and often without even physical removal of the drive itself, is one of the innovations NetApp delivers with our “Maintenance Center” suite of disk resiliency technologies which I cover in a bit more detail responding to the next point below.

[Going native]
OTOH enterprise-class drives exhibit markedly different group dynamics since their firmware makes the assumption that they are usually deployed as a member of a RAID set – and should consequently defer to their peers for mirror or parity-based recalculation / recovery during the same set of error conditions cited for the consumer drives above. That makes for much more deterministic behavior which has historically enabled storage arrays using exclusively enterprise-class drives to compensate in a much more consistent and predictable manner when drives fail. The makers of enterprise-class storage arrays now face some daunting challenges as they incorporate consumer-class drives while maintaining the same historical service levels. A quick scan of various enterprise storage vendors’ spec sheets quickly reveals which ones have risen to the engineering challenge with native SATA support vs those that have punted responsibility back onto the drive manufacturers themselves by using lower volume (higher price / not consumer-class) hybrid drives known as FATA or LC-FC. :-)

FYI – NetApp storage engineering has actually moved on to tackle even more challenging consequences of today’s popular & complex dense disk drives. Many Enterprise storage arrays (and increasingly popular filesystems such as ZFS) have evolved sophisticated checksumming algorithms to verify the correctness of normal read operations. Some array vendors go the extra mile to continuously monitor such checksums on data that is not normally (or perhaps never) read after it is initially written. To the best of our knowledge, NetApp is the only array vendor to take the final step and check for the incident known as a “lost write” which conventional checksumming approaches do not (yet) catch. The risks of silent data corruption loom large in any filesystem, disk drive or storage array which does not account for the potential of “lost writes”.

3. SMART is not a reliable predictor of drive failure

We believe this is one of the most tangible points that separates the “men from the boys” in this industry. Few if any of the storage newcomers in this market have endured the real-world field experiences required to come to this difficult realization and make the necessary engineering & support investments to compensate. NetApp considers our solutions in this regard a distinct competitive advantage, so we’ve explicitly decided to drive public industry discussion of this issue. Forums such as the FAST conference have played a key role. Both of this year’s Google and CMU studies refer to seminal NetApp work in this area, and just like the CMU paper here in 2007, NetApp’s RAID-DP won “Best Paper” at the FAST ’04 Conference. Yet as pointed out correctly on this blog RAID-DP (a performance-optimized variety of SNIA-defined RAID 6) is merely a key part of the protection spectrum against this issue, not all of it.

Quick backgrounder substantiating our position – NetApp shipped over 104 PB (petabytes) of capacity during our last reported quarter (ending in Jan 2007). Since we didn’t publicly disclose the number of spindles that equates to, I’ll do some back-of-the-napkin math blending the 500, 300 & 146GB spindle varieties to arrive at a rough average of over 150,000 spindles per month, which by itself every month is well above the total number of drives covered in each of the cited Google & CMU studies.
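
The napkin math can be run backwards as a sanity check. The short Python sketch below uses only the two figures quoted above (104 PB per quarter, 150,000 spindles per month), treats both as round numbers rather than "over" values, and assumes decimal units. The actual mix of 500, 300 and 146GB drives is not given, so all it can show is the average capacity that mix would have to work out to.

```python
PB = 10**15  # decimal petabyte, in bytes (assumed unit convention)
GB = 10**9   # decimal gigabyte, in bytes (assumed unit convention)

capacity_per_quarter = 104 * PB   # figure quoted in the text
spindles_per_month = 150_000      # figure quoted in the text
months_per_quarter = 3

capacity_per_month = capacity_per_quarter / months_per_quarter
implied_avg_gb = capacity_per_month / spindles_per_month / GB

print(f"capacity shipped per month: {capacity_per_month / PB:.1f} PB")
print(f"implied average spindle size: {implied_avg_gb:.0f} GB")
```

The implied average lands between the 146GB and 300GB drive sizes, which is at least consistent with a blend of the three capacities mentioned.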

[Making SMART smarter]
Much like Google, NetApp has accumulated over the years a massive data warehouse of real-world drive behavior but under a much broader range of production deployment environments and configurations. We track drive ongoing behavior reported on a weekly basis during normal working states as well as in an event-driven manner during the various stages of drive failure. That has enabled us to surround the basic SMART information provided by the drive manufacturers with a comprehensive set of technologies branded as “Maintenance Center” (introduced in my response above) which enable NetApp arrays to take highly safe, accurate, granular and efficient predictive actions described in the response to the next point below.

4. Drive failure rates rise steadily with age rather than staying flat through some n-year mark

The relatively controlled sample sets of the Google & CMU studies enable them to arrive at more specific conclusions than NetApp has observed. OTOH since NetApp will soon ship more drives per month than both of those studies’ multi-year sample sets combined (to a much broader set of production deployment environments) we have learned that the actual list of possible reasons behind drive failures gets longer with the introduction of each new drive model. Consequently there are many best-practices we recommend to storage array administrators, which are derived from the consistent set of resiliency features we supply as a vendor of storage arrays to small, medium and large-sized organizations of all kinds.

[Drive resurrection]
If there’s one thing we’ve learned as a result of the massive real-world drive behavior data warehouse we’ve accumulated – it’s that there’s no simple pattern to predict when a drive will fail. But by far our most significant discovery is that drive failures are actually no longer the simple atomic and persistent occurrences they used to be a few short years ago. There are in fact many circumstances not restricted to age, environmentals (NVH), power & cooling, or even electro-mechanical behavior of drive peers within the same array, which can render a drive unusable – and eventually failed. One of the most fascinating Oscar-worthy plot-twists that we’ve uncovered as a result of our vast experience is that drives can also come back from the dead to lead very normal and productive lives! Industry-leading innovation we’ve been shipping with NetApp Maintenance Center allows a NetApp array to use algorithms derived from our aforementioned data warehouse to take intelligent proactive actions such as:

  1. Predict which drive(s) are likely to fail (using advanced algorithms based on our vast data warehouse).
  2. Copy readable data directly from the failing spindle onto a global hot spare without parity reconstruct overhead.
  3. Use RAID-DP parity to calculate the remaining subset of unreadable data (usually a very small percentage of the overall drive).
  4. Take the suspected “failed” drive offline (while physically maintaining it in the array) and probe said drive with low-level diagnostics to determine whether the failure was transitory or truly and permanently fatal.
  5. Return fixed drives which exhibited only single-instances of transitory errors back to the global hot spare pool.

Although we’ve only been collecting statistics on the advanced Maintenance Center functionality for about a year now, our assumptions have been validated in that the vast majority of “failed” drives only exhibit isolated incidents of transitory errors and can safely remain in the array while rejoining the spares pool. It should be noted that these drives don’t get a second chance at a second life :-). Should those same drives fail again in any manner, they are marked for physical removal from the array and considered permanently failed.
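
The one-second-chance policy in steps 4 and 5 can be pictured as a small state machine. The Python sketch below is an illustration only, not NetApp's implementation: the state names, the `Drive` class, the drive name and the `diagnostics_pass` callable are all hypothetical, and the data-copy and parity-reconstruction work of steps 2 and 3 is left out entirely.

```python
from enum import Enum


class DriveState(Enum):
    ACTIVE = "active"     # serving data in a RAID group
    TESTING = "testing"   # offline but still in the shelf, under diagnostics
    SPARE = "spare"       # back in the global hot spare pool
    FAILED = "failed"     # marked for physical removal


class Drive:
    def __init__(self, name):
        self.name = name
        self.state = DriveState.ACTIVE
        self.prior_failures = 0   # how many times this drive has been suspected


def handle_suspected_failure(drive, diagnostics_pass):
    """Apply the policy: one trip through diagnostics, then out on a repeat.

    diagnostics_pass is a hypothetical callable standing in for the
    low-level diagnostics; it returns True if the error looks transitory.
    """
    if drive.prior_failures >= 1:
        # Second strike: no further testing, the drive is pulled.
        drive.state = DriveState.FAILED
        return drive.state
    drive.prior_failures += 1
    drive.state = DriveState.TESTING
    if diagnostics_pass(drive):
        drive.state = DriveState.SPARE
    else:
        drive.state = DriveState.FAILED
    return drive.state


d = Drive("example-drive")
first = handle_suspected_failure(d, lambda drv: True)
second = handle_suspected_failure(d, lambda drv: True)
print(first, second)
```

In this sketch a drive that passes diagnostics once rejoins the spares pool, and any later suspected failure marks it failed regardless of what the diagnostics would have said.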

[If it ain't broke . . . ]
There are of course many net positive tangible NVH & electro-mechanical advantages in avoiding physical drive removal events from any storage array, which contribute to a different kind of NetApp virtuous circle around overall storage system RAS. Notably, an indirect yet significant benefit of more granular and intelligent drive failure management afforded by NetApp Maintenance Center is improved supply chain efficiencies NetApp customers enjoy. This comes as a result of the reduction in the expensive cycle of drive removal, RMA administrative processing & shipment, plus drive replacement shipping, handling & asset management.

5. Array disk failures are highly correlated, making RAID 5 two to four times less safe than assumed

This is an excellent final point. For readers that made it this far, there’s one takeaway I hope everyone remembers from this discussion. Given the realities of today’s drives (plus all the trends indicating what we can expect from electro-mechanical storage devices in the near future) – protecting online data only via RAID 5 today verges on professional malpractice.

That’s a deliberately strong and provocative statement. I use it often to raise awareness of this very real industry issue and when outlining NetApp competitive advantages such as RAID-DP & Maintenance Center in this regard. Apart from more capacity efficiency than both RAID 5 (in typical 3+1 or 4+1 best-practice configs) and RAID 1/0, RAID-DP (or any sturdy variety of RAID 6) is also becoming a necessary complement to the increasingly dense spindles most organizations are pressured to purchase for financial reasons. Patterson, Gibson and Katz defined some excellent RAID levels with their seminal work based on spindle realities of the eighties. 20 years later it’s time to retire those legacy RAID levels and define and implement modern ones which address the realities of contemporary drive technology in the 21st century.
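
The capacity-efficiency comparison is simple arithmetic: usable fraction equals data disks divided by total disks. The Python sketch below works it out for the RAID 5 and RAID 1/0 layouts named above. The 14+2 double-parity group is an assumed size chosen for illustration, since the text does not state one.

```python
from fractions import Fraction


def usable_fraction(data_disks, redundancy_disks):
    """Share of raw capacity left for data after parity or mirror copies."""
    return Fraction(data_disks, data_disks + redundancy_disks)


layouts = {
    "RAID 5 (3+1)": usable_fraction(3, 1),
    "RAID 5 (4+1)": usable_fraction(4, 1),
    "RAID 1/0 (mirrored pairs)": usable_fraction(1, 1),
    "Double parity (14+2, assumed group size)": usable_fraction(14, 2),
}

for name, frac in layouts.items():
    print(f"{name}: {float(frac):.1%} usable")
```

The advantage depends on group width. A narrow 4+2 double-parity group would be only two-thirds usable, less than RAID 5 at 4+1, so the efficiency claim holds for wide groups rather than for double parity as such.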

[RAID 5 today verges on professional malpractice]
In more conservative, controversy-phobic settings one can tone down the rhetoric and merely refer to the copious 3rd party evidence we cite in this regard, including (but certainly not restricted to):

  1. Enthusiast-oriented reports such as AnandTech, quoting an “8% chance of complete data loss using RAID 5 with 200GB spindles”
  2. Seagate & Microsoft’s WinHEC 05 Presentation (SATA in the Enterprise) – “Call to Action: Use (only) RAID 1 or RAID 6 in SATA Array”
  3. IBM Research in Almaden (S. R. Hetzler, IBM Fellow) quoting a controlled study of large capacity drives “With only 2 9’s reliability - RAID 5 is insufficient with SATA”
  4. The “father of DEC StorageWorks” (now HP EVA) quoting that “If you have one petabyte of desktop drives with RAID 5, you could lose data twice a year”
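
Figures of this kind usually come from back-of-the-envelope arithmetic on unrecoverable read errors during a rebuild. The Python sketch below is not a reconstruction of any of the cited sources. It assumes an error rate of 1 per 10^14 bits read (a commonly quoted spec for desktop-class drives), independent errors, and a hypothetical six-drive RAID 5 set of 200GB spindles, where a rebuild must read all five surviving drives cleanly.

```python
import math


def rebuild_failure_probability(surviving_drives, drive_bytes, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading every
    surviving drive in full, assuming independent bit errors."""
    bits_to_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to avoid rounding loss.
    return -math.expm1(bits_to_read * math.log1p(-ure_per_bit))


p = rebuild_failure_probability(surviving_drives=5, drive_bytes=200 * 10**9)
print(f"chance the rebuild hits an unreadable sector: {p:.1%}")
```

Under those assumptions the result is a little under 8%, and it climbs quickly as drives get bigger or groups get wider, which is the trend the sources above are warning about.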

Note that leading the industry in terms of transparency on the “inconvenient truth” of RAID 5 today required some sacrifices on NetApp’s behalf the past few years. Our sales force keeps reminding me that NetApp doesn’t win many brownie points among the uninitiated RFP writers out there in customer-land who are scoping out conventional RAID 0/1/5 solutions. Instead of coming across as self-serving scare-mongers when explaining “RAID 5 is not enough”, we at NetApp hope broader coverage of this important issue will help storage customers make more informed and safer array purchasing decisions.

As is clear by the length of this response and record number of blog comments on related posts here at StorageMojo, this is a rich and deep often esoteric topic with many nuances. Many readers who make a living higher up the storage stack at the host / server or application level may wonder whether or how all of this relates to them? Perhaps the best example of that comes from NetApp’s strategic database alliance partner and major customer Oracle Corp.

Having been in this business a long time and learning some hard storage lessons of their own, Oracle developed a storage resiliency certification program actually named H.A.R.D. Very few Oracle certified storage array partners are able to qualify for this exclusive program, and while NetApp is naturally one of them – we are proud to also be the only storage array vendor in this program that offers the highest form of database storage integrity across our entire online storage product line. All other compliant array vendors restrict this to their highest tier of storage available only to their most lucrative customers. Seems kind of unfair and disappointing for the increasing majority of enterprise storage customers considering modular (mid-range) storage arrays in support of Oracle databases and associated (usually mission-critical) applications. For related archive storage requirements, anything less than this high level of data integrity over the long-term would also be considered a major technical, business and legal issue.

Given the complexity involved with the technology behind the findings raised by the Google and CMU researchers lately, perhaps it’s best to close with a quote from the former US President still most closely associated with Hollywood and the Oscars “Trust – but Verify” :-)

Val Bercovici (NetApp)
Director, Technical Strategy

Comments welcome, as always. I’ll be a lot more familiar, I hope, with NetApp after attending their analyst event in a couple of weeks. So don’t be shy about giving your view of Val’s response. Moderation on to keep spamsters under control.

Inside Skinny On Isilon

February 21st, 2007 by Robin Harris in Clusters, NAS, IP, iSCSI

Already did the outside skinny
I talked to Sujal Patel, founder and CTO of Isilon, last week to learn more about Isilon. As I’d mentioned in my posts on the company I was surprised that there wasn’t more technical information about the products on their website. On the other hand, I believe that may be a wise decision, since my experience is that if you talk too much about the “how” prospects tend to forget about the “what”. And the “what” is what sells.

But I like the “how”!
So Sujal agreed to talk a bit about the how and to respond to some concerns about the Isilon architecture. I also got some business info. I’m supplementing Sujal’s comments with a bit of information from SEC filings and RBC Capital Markets research.

Business first
Isilon today has over 300 customers, 88 added just last quarter. They’ve been recruiting resellers heavily, which makes sense to me since with such an easy-to-manage product they don’t need the costly SE hand holding that most SANs require.

Rain city veterans
Sujal started Isilon in Seattle about the same time as YottaYotta, where I ran marketing for several years. 2000 was a tumultuous year: Gilderesque predictions of bandwidth nirvana; rampant easy money delirium; overheated visions of massive internet investment. That both are still around is a minor miracle. Sujal said that for Isilon, remarkably, the goals have stayed the same. The product was intended for media servers and other large-file, mostly sequential workloads. They kept at it and the business developed. They didn’t allow themselves to get distracted by the uproar of the dot-bomb crash, stayed focused, and the rest is history.

One measure of their success: 50% of their business is from repeat customers. A couple of very large customers - Kodak and Comcast - account for less than 20% of their sales, and as they bring on new customers their importance is declining.

Down is the new up
Their move downmarket with the new IQ200, a 3-node cluster listing for less than $40K, was, Sujal reports, driven by the channel focus. Not only does that not surprise me, but I’d guess there will be an under $20k IQ100 late this year. Three node VAXclusters were the most popular in the 1980s due to the functional redundancy. Lose one, still have two-thirds of your processing power and some redundancy. I don’t think that dynamic has changed.

Isilon also has very good gross margins - over 50%. Not as good as the products I’ve marketed, but hey, they’re young. I love to see vendors making good margins while providing good value: they’ll be around. Buy with confidence.

Now for the technology
Isilon’s technology is more complicated than the Google File System, which is my favorite model of a bare bones, low-cost, scale-out cluster storage system. Isilon’s system is fully distributed, which is more elegant than Google’s master/server architecture, yet adds overhead. Isilon supports RAID5-like functionality, Google nothing but file replication.

With Isilon you can choose n+1, n+2, n+3, n+4 redundancy on a per-file basis at a cost of 20% increments of capacity usage. If a node fails, the system uses its Virtual Spares to maintain the requested redundancy.
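
One way to read the "20% increments" figure is as parity overhead relative to the data itself. The Python sketch below is my arithmetic, not Isilon's: it assumes a stripe of 5 data units, a number picked only because it makes each added parity unit cost exactly 20%. Real stripe widths may differ, and the percentages would change with them.

```python
def parity_overhead(data_units, parity_units):
    """Extra capacity consumed by protection, relative to the data itself."""
    return parity_units / data_units


DATA_UNITS = 5  # assumption chosen so that each parity unit costs 20%

overheads = {k: parity_overhead(DATA_UNITS, k) for k in range(1, 5)}
for k, extra in overheads.items():
    print(f"n+{k}: {extra:.0%} extra capacity")
```

Compare that with straight replication, where each additional copy costs a full 100% of the data size.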

Linear, let’s get linear
A fully distributed system needs to do a lot of message passing to keep everyone coordinated. These messages are small, but their latency is a problem. That is why their no-cost adoption of Infiniband is smart: the under 100 ns latency is a real performance boost. They handle subnet management in their software, so customers don’t need to be Infiniband gurus to get it working.

Sujal says that messaging traffic grows linearly as you add nodes. There seem to be a couple of reasons for that. First, even in a large cluster there are typically no more than 16 nodes involved in any one I/O. Second, Isilon uses three meta-data “authorities” for each file. So while the data may be spread from here to eternity, the metadata isn’t, reducing coordinating traffic required to handle updates and cache invalidation.

Another strategy for reducing latency in a fully distributed system: NVRAM. Isilon uses battery-backed NVRAM cards to eliminate disk writing latency. Each node can safely acknowledge a write in microseconds instead of waiting milliseconds for a disk I/O to complete. Sweet.

96 nodes to the tune of “96 Tears”
Isilon currently has a 96 node limitation, which I found interesting because VAXclusters also topped out at 96 nodes. Made me wonder if there was something mystical about the number 96. Sujal says no, it is a testing limitation. They started with 12 nodes and have gradually raised the number to 15, 32, 35, 88, and now 96 nodes. It takes a lot of space and energy to set up to test that many nodes, even if you figure out the financial implications - would you want to sell 96 nodes as used equipment?

The StorageMojo take
Isilon has cool technology and a lead in a new market segment. They’re focused, smart, funded and growing fast. Are they invulnerable? Not even close. But the big iron vendors need to think seriously about the meaning of disruptive technology.

Isilon might be an object lesson.

Comments welcome, as always. Moderation enabled because moderation is a virtue, except in the defense of liberty.


