The CEO of a – I’m guessing here – specialized Internet services company wrote StorageMojo because of the recent IBRIX acquisition. Here’s what he asked, edited for space.
We’re IBRIX customers, and we’re building out a High Performance Cloud platform. With the acquisition by HP, I’m nervous about our future, as most of our infrastructure is Sun-based (not Solaris, but Sun hardware), not to mention the uncertainty around Sun’s future.
Lustre is not appropriate for our kinds of reliability, so we chose IBRIX. We have high density systems and a need for virtually limitless scalability down the road, but on a 2009 startup budget.
I have a couple of questions for you:
1) If we go down the IBRIX road, are we going to end up on HP’s rather expensive storage solutions? (At least for HPC-style higher-density solutions, it’s _exceedingly_ expensive to buy HP.) It just seems like storage costs are very high with them until you hit very large purchases.
2) If we switched directions, and went to another storage system, what would you use? We’ve looked at Sun’s 7000 series, Isilon, DataDirect, etc (we are currently Infiniband-based). They all seem to be “very expensive” to scale out on, and we want something that we can squeeze a lot out of (IBRIX licensing allowed us to use larger hardware – Sun 4600s and commodity arrays – and that really helped with “license scalability”) as we grow.
The overall goal of this system is “Web Scale” storage with as little out of pocket [beyond] what we’d managed to negotiate already (the IBRIX + Sun t6140 arrays + thumpers exporting over infiniband).
Looking at HP
It is likely that HP will put IBRIX on its ExDS 9100, a rack packed with blade servers and high-density disk packaging. The current 9100 uses PolyServe, a cluster optimized for transaction processing, not scale-out. Haven’t looked at the numbers lately, but last year the 9100 was listed at less than $2/GB, cheap for storage from a top-tier vendor. As with all clusters, capacity is cheap and bandwidth dear, so performance needs affect the cost.
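To make the "capacity is cheap and bandwidth dear" point concrete, here is a minimal Python sketch. Every price, capacity and throughput figure in it is a hypothetical placeholder chosen for round numbers, not a quote for the 9100 or any other product; the `Config` class and its field names are likewise made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """A hypothetical cluster configuration (all numbers are placeholders)."""
    name: str
    price_usd: float       # assumed purchase price
    capacity_gb: float     # usable capacity in GB
    bandwidth_mb_s: float  # aggregate throughput in MB/s

    def usd_per_gb(self) -> float:
        return self.price_usd / self.capacity_gb

    def usd_per_mb_s(self) -> float:
        return self.price_usd / self.bandwidth_mb_s


# Same capacity, but the second build adds servers to get 4x the bandwidth.
capacity_heavy = Config("capacity-heavy", 200_000, 100_000, 1_000)
bandwidth_heavy = Config("bandwidth-heavy", 400_000, 100_000, 4_000)

for cfg in (capacity_heavy, bandwidth_heavy):
    print(f"{cfg.name}: ${cfg.usd_per_gb():.2f}/GB, "
          f"${cfg.usd_per_mb_s():.2f} per MB/s")
```

With these made-up inputs, the bandwidth-heavy build costs twice as much per GB even though it is cheaper per MB/s, which is why a $/GB list price only means something once the performance requirement is fixed.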
It would be wise for HP to also make the IBRIX software available on HP’s low-end rack servers. The 9100 is a nice box, but some are happy starting with 3 servers and a terabyte or 2 in SAS drives. HP should offer that option.
Looking beyond HP
As noted before, there are several scale-out cluster storage options. If density is critical, you could do worse than Verari Systems paired with their Data Valet cluster software. They say their configs start at less than a dollar/GB.
Parascale, like IBRIX, offers a software-only product that you can mount on whatever servers make sense for you. They favor larger file sizes which may affect your thinking.
Nexenta is worth a look as well, since they use the ZFS you already know.
Who am I leaving out?
Update: Readers suggest Panasas – how could I forget? – Gluster and 2 votes for Quantum’s StorNext. More detail in the comments. I’ll keep updating this as more suggestions come in. End update.
The StorageMojo take
Scooping up IBRIX is good for HP, but if they limit what you can do with the software they won’t be doing us any favors. “Solutions” are a wonderful thing, but some of us just need some 2x4s, plywood and nails to build a solution.
I hope IBRIX gets more resources to drive their development faster. They have a good platform that deserves a chance to grow.
Courteous comments welcome, of course. I’ve done work for most of the vendors mentioned above.
how about glusterfs (http://www.gluster.org) ?
Panasas?
I’d think StorNext would be a good option. Like Ibrix, they have an IP capability, so you can have your clustered storage over ethernet. Unlike Ibrix, they support a striping mode if you want it, kinda like Lustre, although certainly not so high-throughput, but you probably don’t care about the limitations there. Also, StorNext has second-to-no-one storage tiering capabilities.
If you are interested in SW-only performance clustered file systems, one may wish to look towards IBM GPFS, also, to compare prices. Off in open sauce land, I’m hearing that Gluster is on the upswing (dodges the metadata controller problem), although I can imagine that an only-recently upscaled open source platform is not the thing to use for someone stressing over availability.
Is there some reason you haven’t gone the Red Hat GFS route?
As a side note, if your Sun HW is x86 stuff don’t stress it. x86 stuff is “commodity,” which is to say to me, one slab of x86 pork is like another. If your systems were all white boxes, would you be stressin’? No.
Joe Kraska
San Diego CA
USA
Quantum’s StorNext could be an option. Fundamentally, StorNext provides a high performance shared File System with policy-based tiered storage management. It does this with what we call ‘preservation of choice’ since StorNext is agnostic of Operating System (Windows, Linux, Unix, Mac OS), protocols (FC, iSCSI, IP, InfiniBand), disk subsystems (EMC, HP, HDS, IBM, Sun, etc.), as well as tape platforms (Quantum, Sun, IBM, etc.).
Given the emailer’s rightful concern about long-term issues like hardware roadmap and vendor support, I’d encourage him/her to include a thorough evaluation of both CAPEX and OPEX in their decision-making process, as well.
It is important to assess CAPEX and OPEX prior to any purchase, as systems with lower initial acquisition costs may often require significant investment down the road in management, professional services and pricey upgrades. These “hidden costs” frequently add up to much more over time than deploying systems that may appear slightly more expensive at the outset, but require very little time and resource investment to manage, thus delivering better TCO and ROI over the short and long term.
– Lucas Welch, Isilon Systems
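The CAPEX-versus-OPEX argument in the comment above boils down to simple arithmetic, sketched here in Python. The function names and all dollar figures are hypothetical, picked only to show the shape of the comparison; they describe no real product, and real OPEX is rarely a flat annual number.

```python
def total_cost(capex: float, annual_opex: float, years: float) -> float:
    """Simplified TCO: purchase price plus a flat yearly operating cost."""
    return capex + annual_opex * years


def break_even_years(capex_a: float, opex_a: float,
                     capex_b: float, opex_b: float) -> float:
    """Years after which system B (pricier up front, cheaper to run)
    becomes the cheaper choice overall. Assumes opex_a > opex_b."""
    return (capex_b - capex_a) / (opex_a - opex_b)


# Hypothetical systems: A is cheap to buy, B is cheap to run.
cheap_upfront = total_cost(capex=100_000, annual_opex=60_000, years=5)
cheap_to_run = total_cost(capex=150_000, annual_opex=30_000, years=5)

print(f"A over 5 years: ${cheap_upfront:,.0f}")
print(f"B over 5 years: ${cheap_to_run:,.0f}")
print(f"Break-even after {break_even_years(100_000, 60_000, 150_000, 30_000):.1f} years")
```

Under these assumed inputs the system with the higher purchase price wins well within a typical depreciation period. Whether that holds for any given vendor depends entirely on the real operating costs, which is the number worth pressing each vendor on.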
FWIW: our high performance/high density storage clusters are designed with HPC in mind, start out (well under) $1/GB, and offer some of the best IO bandwidth per box (storage-wise and networking) you can get anywhere. We have partners building these out into large iSCSI targets for their HPC and storage clouds, partners deploying them in clusters, and we have some nice GlusterFS work coming out soon (won’t be able to talk about it for a little bit).
We demonstrated 1GB/s NFS a few weeks ago (no press release or anything like that) for 1 client and 1 server over a single 10GbE NIC. Gluster lets us do this over 10GbE, and Infiniband (which does not appear to be on the wane in the least in HPC).
If you are open to other-than-big-names, you have a pretty good shot at solving this problem. If you are stuck with only big names, you have to take what they can deliver.
As a StorNext end user, I’ll add to Joe’s/Lance’s comments:
What you’ve likely lost with the IBRIX acquisition (and what StorNext has committed to delivering through years of product evolution and acquisition, e.g. ADIC –> Quantum) is agnosticism. StorNext has allowed us to scale an HPC/HA computing architecture from a handful of TB to over 1.5 PB. During that time we have moved data centers twice – once across campus, then across metro. We have integrated or migrated off of at least 4 separate brands of storage subsystems and have a hybrid of multiple vendors/tiers of storage in a single, managed namespace. With proper design and engineering, performance on the filesystem expands with capacity growth. And integration of deduplication in the Storage Manager has saved us seven figures in CAPEX on nearline capacity.
Definitely worth considering if you’re uncomfortable with the potential vendor lock-in presented by this acquisition.
-DG
USA
Just some other thoughts and comments about this:
Keep in mind that Ibrix is a NAS-only solution.
So are Nexenta, Exanet, etc. All are great, but they are scalable NAS solutions that can run on any commodity hardware.
You also have some Internet-oriented file systems for cloud storage, like HDFS (Hadoop) and more.
Solutions like Lustre, Gluster and GPFS are tuned for the HPC world (although GPFS actually started as a media file system). They are SAN/Ethernet-based shared file systems.
This means they are tuned for HPC-style IO: large throughput, mostly sequential. They are not tuned to handle small/random IOPS.
StorNext is a heterogeneous file system that gives you a shared SAN-based file system with NAS capabilities, or their own proprietary protocol (DLC).
The great thing about StorNext is its ability to handle different kinds of workflows, provide both SAN and NAS access, and handle a larger variety of IO patterns. It does have its weaknesses, mainly missing functionality (snapshots, for example, and NAS features), but overall it is a good, easy-to-manage product.
So my bottom line here is that you have to clarify to us and your team what kind of usage and IO characteristics you are looking to get from this file/storage solution, and then choose the right product to do it, not vice versa.
You need to pinpoint the requirements, and then see how each of the suggested solutions can or cannot answer these needs.