White Box Arrays (WBA) today remind me of the early days of personal computing before Apple, Radio Shack and Commodore built the first appliance computers. People bought bare PC boards and a bag of components to build their own computers.
WBAs aren’t nearly as tough to build as PCs were in 1975. There is fine hardware at reasonable prices and multiple options for free or inexpensive software. Most importantly, people are building their own homebrew storage arrays because the economics are so compelling. Some are supporting tens of millions of users.
The White Box Array
There are many ways to skin this cat. Microsoft is certainly moving to make Windows Server a serious contender in the white box array market. Yet the open source community isn’t far behind.
Just as white box PCs make up about 30% of the current PC market, there is no reason that white box arrays, using free or low-cost software and commodity hardware, could not win a big chunk of the SMB market through the traditional SMB channel of VARs, especially given the terrific price umbrella offered by the name brands.
These systems all have some things in common:
- Commodity drives – usually SATA – and enclosures
- Ethernet-based, iSCSI or NAS
- Little or no traditional RAID gack: controllers, NVRAM, dual-porting
- Mostly or totally done in software
- High availability, resilient, and much lower cost
The hardest part is the software. All server OSes support NAS, and most ship with iSCSI initiators, which is all you need on the server side.
Where to Pitch In and Help
The core of open source software (OSS) is that volunteers do the work. Many of these folks work for companies that benefit from the OSS work, either as vendors or users, but not all.
A little terminology first. An initiator sits on the server and carries the server’s requests for data out to the external array. The target sits on the external array and receives those requests. The file system (for lack of a better term, since these provide storage functionality far beyond what a traditional FS offers) keeps track of files and provides state-of-the-art features such as RAID, CDP and bullet-proof data integrity.
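To make the initiator side of that concrete, here’s a rough sketch of what a Linux server does to reach a white box target over iSCSI, assuming the open-iscsi tools are installed. The portal address and target name are made up for illustration; a real setup would add CHAP authentication and multipathing.

```python
# Sketch: the initiator side of iSCSI on a Linux server, using the open-iscsi
# userland tools. The portal address and target IQN are placeholders, not a
# real array, and error handling is minimal.
import subprocess

PORTAL = "192.168.1.50"                          # hypothetical white box array address
TARGET = "iqn.2007-01.com.example:array.lun0"    # hypothetical target name

def run(args):
    """Echo a command, then run it, raising if it fails."""
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# Ask the array which targets it exports (SendTargets discovery).
run(["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL])

# Log in to the target; the kernel then presents it as an ordinary
# SCSI block device that can be partitioned and formatted like a local disk.
run(["iscsiadm", "-m", "node", "-T", TARGET, "-p", PORTAL, "--login"])
```

Once the session is up, the exported LUN looks like any local disk to the server, which is the whole point.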
Here are some places to start looking if you’d like to help, while growing your expertise in the next big reseller opportunity.
Linux
The redoubtable Ben Rockwood put together this tutorial PDF for iSCSI on Linux.
The sourceforge Linux-iSCSI initiator project is active.
The iSCSI Enterprise Target project is also active. Fedora Linux actively supports iSCSI target mode.
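To give a feel for the target side, here is a minimal sketch of exporting a commodity SATA drive through the iSCSI Enterprise Target. The IQN and device path are placeholders, and the ietd.conf syntax shown is my recollection of the project’s format, so check its docs before relying on it.

```python
# Sketch: export one commodity SATA drive as an iSCSI LUN via the
# iSCSI Enterprise Target (IET). The IQN and device path are placeholders;
# verify the ietd.conf syntax against the IET documentation.
from pathlib import Path

CONF = """\
Target iqn.2007-01.com.example:array.disk1
    Lun 0 Path=/dev/sdb,Type=fileio
"""

Path("/etc/ietd.conf").write_text(CONF)
print("Wrote /etc/ietd.conf; restart the ietd daemon to pick up the new target.")
```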
ZFS is in the process of being ported to Linux via FUSE. FUSE stands for Filesystem in Userspace, and while FUSE is stable, some of the more persnickety Linux folks want to see ZFS in the kernel. If you are an experienced kernel module writer who would like to cover himself in glory, porting ZFS to Linux would be a Good Thing.
BSD – Free, Net, Dragonfly
The BSD Unix community seems a little confusing due to the multiplicity of versions. FreeBSD and NetBSD share a lot of code, while Dragonfly is an ambitious new version intended for cluster, grid and SMP applications. NetBSD offers target-mode iSCSI tested to work with Microsoft iSCSI initiators.
The FreeBSD folks could use some help porting the NetBSD iSCSI target. If you’ve got serious Unix chops, or would like to, this is the place to cement your rep as a codeslinger.
Both FreeBSD and NetBSD could use a few good coders to take on porting ZFS. Dragonfly BSD is committed to ZFS, after they finish up the rocket-science under the hood supporting all their multi-processor coolness.
Microsoft
The Redmond Bruisers bought a leading iSCSI firm and are incorporating the code into Windows after their earlier effort failed to pass muster. They are very serious about leveraging their Windows monopoly to build a big storage business. If you like Microsoft you’ll love what they’re doing with storage.
Low Hanging Fruit
Just as white box PCs have proven popular with cost-conscious SMBs, so will white box storage arrays. With features as good as or better than what EMC|Dell offer, they give SMBs a rare opportunity to take back significant street cred and revenue from the big guys.
These products will help the data-center-focused big guys work harder for the SMB dollar. I expect to see a VC-funded white box array startup in the next 12 months.
Can you explain what you mean by “Little or no traditional RAID gack: controllers, NVRAM, dual-porting”?
Are these white box vendors just offering lower quality, or is there a disruptive “good enough” technology on offer that replaces them?
Sure. Commodity hardware-based storage arrays are being built today using file systems like MogileFS or ZFS – or, in Google’s case, GFS. These “file systems” – we need a better name – obviate the need for things like volume managers and RAID controllers, or features like dual-ported disks and NVRAM, for many, but not all, applications, especially in the SMB space. By taking cost and hardware complexity out, you get a more reliable system at a much lower cost that is good enough for many apps.
Read my posts on GFS and ZFS to learn more.
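To show what replacing the RAID controller with software actually looks like, here’s a small sketch assuming a Solaris box running ZFS; the disk names are placeholders for three commodity SATA drives.

```python
# Sketch: ZFS replacing the volume manager and RAID controller in software.
# Assumes a Solaris host with ZFS; the disk names are placeholders.
import subprocess

DISKS = ["c1t0d0", "c1t1d0", "c1t2d0"]   # three hypothetical commodity SATA drives

# One command builds a RAID-Z pool with end-to-end checksums:
# no RAID controller, no NVRAM, no dual-ported drives.
subprocess.run(["zpool", "create", "tank", "raidz", *DISKS], check=True)

# File systems draw space from the shared pool instead of fixed volumes.
subprocess.run(["zfs", "create", "tank/home"], check=True)
subprocess.run(["zpool", "status", "tank"], check=True)
```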
I once suggested to a vendor that I figured I could build my own array with the stuff available these days – iSCSI target drivers, et al. He laughed in my face…
Just curious — what do you think about the other GFS (Global File System by Red Hat, http://www.redhat.com/software/rha/gfs/)? Unlike Google’s FS, it is actually available, and according to the (few) notes on the Web it works fairly well. Couldn’t it be used as a proven and trusted base for your white box array? And how much has ZFS really been used in real-world environments?
As I recall, Red Hat’s GFS used to be Sistina’s GFS. In this case the G stands for Global rather than Google – which is why I suggested calling the Google File System GooFS, a name a few people have picked up on. Onward. GooFS is an application-specific design that is not usable for most non-Google apps. Its importance is that it demonstrates that a software-centric architecture can achieve very high availability and scalability.
RHGFS is a cluster file system. It may be a very good one, as we thought at YottaYotta when we looked at incorporating it into our cluster-based RAID controller to enable NAS services. The difference between RHGFS and ZFS is that ZFS is much more than a file system. It is also a radical storage manager: storage pools; variable width stripes; exceptional data integrity.
It’s true that ZFS needs a thorough pounding in the real world. I’ve no doubt that problems will be found. Yet the fundamental architecture of ZFS embodies the radical rethink of the storage paradigm that is, AFAIK, unique to it and GooFS, and is vital to creating the commoditized storage world that vendors have been both dreading and preventing for the last 10 years.
That said, ZFS was designed for the real world by folks who have a pretty good idea of what that means. Teething pains, sure. Fatal flaws, highly unlikely.
Thanks