The StorageMojo take on Boxwood
Let me get the negatives out of the way first:
- Boxwood is a prototype, not a product. While the BoxFS testing is suggestive, the real proof is in application performance, especially database performance.
- An eight-node prototype isn’t very large. The scalability questions aren’t answered to my satisfaction, though the team is certainly well aware of the issues. Again, suggestive, not conclusive.
- There is no evidence that Microsoft is productizing Boxwood.
On that last point, Chandra does note on his homepage that he’s been working on a project at Microsoft’s Hotmail unit, one that a senior Microsoft exec acknowledged as a major undertaking for internal use only. Since Hotmail has rather famously been running on either FreeBSD or Linux, depending on whose report you believe, it seems likely he is helping Hotmail move to a Windows-powered infrastructure using Boxwood-like abstractions. If he can help Hotmail, perhaps someone in Redmond will wake up to the potential of this technology to further embed Microsoft in the data center.
Despite the negatives, Boxwood is a signal achievement for several reasons.
- A robust infrastructure using low-cost servers and networking with an API (discussed only in passing in the paper and possibly vestigial) designed for commercial application development.
- The explicit inclusion of database support through ACID transactions in the B-tree layer.
- The use of Windows at the server level. Potentially this makes scale-out clusters available for Microsoft’s huge VAR base.
Strategically, Microsoft is one of the only players in the industry who would look at destroying the array business as a revenue opportunity rather than a disaster. If they get hungry for growth, say if Vista, or some future OS, doesn’t spur business, the extra several billion a year in software revenue they could pull in might cause them to rationalize the pain they would cause their system vendor OEMs.
Is that screaming I hear in Hopkinton?
While such a scenario would hurt HP, IBM and Dell, it would absolutely devastate EMC, which has no server business to fall back on. EMC’s obvious response would be to buy a Linux server business and go into the software cluster storage business. Big iron arrays will never go away, but their growth could. If I were them, I wouldn’t wait.
Comments, as always, welcome. I’ll try to get something up about Storage Networking World tomorrow.
Robin,
You say:
“Strategically, Microsoft is one of the only players in the industry who would look at destroying the array business as a revenue opportunity rather than a disaster.”
It goes without saying that Jim Gray & Gordon Bell are extremely ‘qualified’ to position this correctly.
However, such a concept will not run on a backend of commodity motherboards with a few disks plugged in. Such a hardware solution may be OK for Google, for ‘internal’ use and for the time being.
This approach needs to be ‘standardized’ for wider OEM/Integrator appeal. I expect that a Microsoft solution will run on well-designed specialized storage ‘bricks’, built around commodity processors. Intel is already putting a major effort behind such a concept. Others will follow with copies and we can look forward to another Wintel ‘whitebox’ standard.
Also, you say …
“While such a scenario would hurt HP, IBM and Dell, it would absolutely devastate EMC, which has no server business to fall back on. EMC’s obvious response would be to buy a Linux server business and go into the software cluster storage business.”
I think that it will devastate all of them, as none of the above ‘own’ the operating system. There is not a lot of added value in commodity ‘storage bricks’.
The Linux open source community is not going to stand still, but there is the issue of centralized support, among others. Red Hat needs to wake up, as this is a perfect opportunity for them; otherwise they will be up for sale. Most of the important technology is already in place (GFS, etc.) and can be supplemented with some of the ‘big-iron’ features, which have always belonged closely coupled to the OS.
Richard, I agree with some of what you say and disagree with other parts. I agree that Msoft certainly has the brain power to develop these technologies. Absent significant competition, though, I doubt they have the incentive, any more than anyone else in the storage industry does.
I also agree that making these technologies “channel-safe”, whether OEM/Integrator/VAR, is a critical success factor. Also, I agree that Red Hat is a sitting duck and needs to get moving. They have the same opportunity in storage that Msoft does, but not the resources.
I disagree in a couple of areas, though. First, I *do* think, and I believe several companies have demonstrated, that commodity pizza-box servers with local disks can support most, and eventually almost all, high-end, high-performance applications through clustering. With Amazon, Google and Msoft’s Boxwood (as well as some I haven’t written about yet), it is clear that commodity-based cluster architectures can achieve massive scale-out in both throughput and bandwidth.
Second, and this is probably a partial disagreement, these features need to be OS-strength functionality, but they don’t need to be part of Linux or Windows. It is instructive that these internet data center clusters are built on top of server operating systems that handle node functionality while the COS – cluster OS – handles load balancing, file and storage access, failover and data protection. These are all functions we associate with single-node server OS’s, which suggests that the COS is, in fact, a new class of software.
As to whether commodity storage bricks, or as I would prefer to call them, commodity cluster bricks, will be profitable: only with the right business model. Dell has done very well, until recently, with just such a model. With a focus on services, IBM and HP could do very well implementing these clusters without selling the hardware, especially if they develop their own enterprise-focussed COS to compete with Msoft.
Robin