StorageMojo




Robin Harris    


Cloud computing is foggy thinking

January 27th, 2008 by Robin Harris in Architecture, Future Tech

A particularly odd bit of goofiness has hit the infosphere: cloud/utility computing mania. Nick Carr has written a book - a sign of the Apocalypse. IBM has announced, for the umpteenth time, a variation on utility computing, now cloud computing. Somebody at Sun is claiming they’ll get rid of all their data centers by 2015.

R-i-i-i-ght.

You know the flying car in your garage?
The syllogism is

  1. Google-style web-scale computing is really cheap
  2. Networks are cheap and getting cheaper fast
  3. Therefore we’re going to use really cheap computing over really cheap networks Real Soon Now

Can you spot the fallacies?

Fallacy #1: Google is Magick
The world’s largest Internet advertising agency does have the cheapest compute cycles and storage (see my StorageMojo article Killing With Kindness: Death By Big Iron for a comparison of Yahoo and Google’s computing costs). But they do nothing that the average enterprise data center couldn’t do if active cluster storage were productized.

Google built their infrastructure because they couldn’t buy it. They couldn’t buy it because no one had built it. But all Google did was package up ideas that academics had been working on, sometimes for decades. Google even hired many of the researchers to build the production systems. Happy multi-millionaire academics today!

Blame vendor marketing myopia for missing that opportunity. But their eyes are wide open now. If your enterprise wants cluster computes or storage you can buy it.

Fallacy #2: Networks are cheap
Or they will be Real Soon Now.

10 Mbit Ethernet from Intel, DEC and Xerox came out in 1983. A mere 25 years later we have 1000x Ethernet - 10 GigE - starting down the cost curve.

About the same time a first generation 5 MB Seagate disk cost $800. Today a 200,000x disk - 1 TB - costs 300 vastly cheaper dollars.

Also in 1983 the “hot box” - the VAX 11-780 - with a 5 MHz 32-bit processor and a honking 13.3 MByte/sec internal bus cost - for you, a special price - $150,000. Today a 64-bit, 3 GHz quad-core server - with specs too fabulous to compare - is $1300. Call it 1,000,000x.

Networks are the laggards. Which is why Cisco commands such a premium over the folks who do their jobs so much better: networks are the bottleneck. Optimizing the bottleneck has an incredible payback.
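A rough way to check those ratios, using only the numbers quoted above. The variable and function names are illustrative, the dollar figures are not inflation-adjusted, and the CPU comparison is left out because the post gives no single performance number for it:

```python
# Figures as quoted in the post; nothing here is inflation-adjusted.
ETHERNET_1983_MBIT = 10
ETHERNET_2008_MBIT = 10_000                    # 10 GigE

DISK_1983_MB, DISK_1983_USD = 5, 800
DISK_2008_MB, DISK_2008_USD = 1_000_000, 300   # 1 TB, decimal units


def improvement(old, new):
    """Return how many times larger `new` is than `old`."""
    return new / old


network_gain = improvement(ETHERNET_1983_MBIT, ETHERNET_2008_MBIT)
disk_capacity_gain = improvement(DISK_1983_MB, DISK_2008_MB)
# Capacity per dollar: MB per dollar then versus MB per dollar now.
disk_per_dollar_gain = improvement(DISK_1983_MB / DISK_1983_USD,
                                   DISK_2008_MB / DISK_2008_USD)

print(f"Ethernet speed:           {network_gain:,.0f}x")
print(f"Disk capacity:            {disk_capacity_gain:,.0f}x")
print(f"Disk capacity per dollar: {disk_per_dollar_gain:,.0f}x")
```

On these figures disk capacity per dollar improved by a bit more than 500,000x while Ethernet speed improved 1000x, which is the gap the argument rests on.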

Hey, Cisco! Get the lead out!

What’s really going on?
There are - currently - economies of scale, which Google is exploiting and MSN and Yahoo! aren’t. So the latter two are going out of business.

But when you look at the cost of going across the network compared to the rest of infrastructure you realize that local - what we used to call distributed - computing is the only way to go.

Ergo, cloud computing will remain in the clouds and real computing will remain local.

Comments welcome, as always.

Vendors beware: the buyers are restless

January 23rd, 2008 by Robin Harris in Enterprise, Future Tech

A recent study by The Info Pro research firm suggests that some seismic shifts are underway. Is EMC losing top-of-mind recognition in the data center? Are mid-size enterprises more likely to embrace new technology?

In an article in Data Storage Connection, TIP talks about some of its findings from a series of interviews with a couple of hundred data center denizens. TIP runs the series about every 6 months. About 150 were F1000 types and another 85 were mid-size enterprise.

Naturally, the article and the accompanying slide presentation are designed to sell the report, but it is worth watching for marketing mavens. They focus on what people consider “exciting” technologies.

Random comments
In no particular order:

  • EMC’s unaided top-of-mind seems to be on a steady downward slide. One might have thought the hype around VMware would have changed that. Of course the stock market seems to have forgotten that too.
  • Newbies 3Par, Data Domain, F5, Compellent, and Isilon are trending up.
  • Biggest surprise: HP is in the weeds behind Data Domain, Sun and F5.

The StorageMojo take
EMC may be paying the price for all of its not-terribly-storage-related acquisitions like RSA and VMware. Or maybe its aging architectures are taking their toll. Whatever it is, the upcoming Hulk/Maui launch is a chance to burnish the corporate image. But not too brightly since the v1 software will be weak.

HP is clearly in trouble. I’m biased - I shipped the very first StorageWorks product back around ‘91 - but for an organization that used to have bright and creative developers and good marketing, their top-of-mind stinks. Time to shake up HP storage marketing: there’s an art to marketing to and through a large direct sales force. Megatons of brochures don’t sell products - people do.

The newbies seem to be doing well in mid-size enterprises where they can more easily migrate to a new vendor. But while the glass house grinds slowly, entrenched vendors can be displaced. Time to rethink the value proposition.

Comments welcome, of course.

Magic in the OLPC

December 15th, 2007 by Robin Harris in Future Tech, Information Management

Most criticism of the One Laptop Per Child PC centers on the cost for what is a low-spec computer. As ASUS with its Eee machine is proving, a low-cost conventional laptop can be pretty powerful. But that misses the point. The OLPC is a fundamental rethinking of the computing experience.

[photo courtesy OLPC]

This child’s review of the OLPC is the first hint that suggests that Laptop.org may have gotten it right. As the 9 year old’s father writes:

So Rufus is using his laptop to write, paint, make music, explore the internet, and talk to children from other countries.

Because it looks rather like a simple plastic toy, I had thought it might suffer the same fate as the radio-controlled dinosaur or the roller-skates he got last Christmas - enjoyed for a day or two, then ignored.

Instead, it seems to provide enduring fascination.

I had returned from Nigeria not entirely convinced that the XO laptop was quite as wonderful an educational tool as its creators claimed. I felt that a lot of effort would be needed by hard-pressed teachers before it became more than just a distracting toy for the children to mess around with in class.

But Rufus has changed my mind.

With no help from his Dad, he has learned far more about computers than he knew a couple of weeks ago, and the XO appears to be a more creative tool than the games consoles which occupy rather too much of his time.

OLPC roots
Even though the OLPC is the only notebook whose industrial design chops rival those of Apple, its real innovation lies in software. Building on educational theorist Seymour Papert’s work - he invented the Logo language - the OLPC rethinks the relationship between man and machine.

OLPC differences
The OLPC has activities instead of applications.

Activities are distinct from applications in their foci—collaboration and expression—and their implementation—journaling and iteration.

The collaboration comes in the form of built-in mesh networking that allows all local OLPCs to talk to each other.

By exploiting this connectivity, every activity has the potential to be a networked activity. We aspire that all activities take advantage of the mesh; any activity that is not mesh-aware should perhaps be rethought in light of connectivity. As an example, consider the web-browsing activity bundled with the laptop distribution. Normally one browses in isolation, perhaps on occasion sending a friend a favorite link. On the laptop, however, a link-sharing feature integrated into the browser activity transforms the solitary act of web-surfing into a group collaboration.

The connectivity seems to be powerful. From his home in England, young Rufus is conversing with other kids who send him messages in Spanish. How does that work?

Expression is the goal of the activities and collaboration. Rather than downloading music, the laptop is equipped to create music. The rethinking extends to the file system:

The objectification of the traditional file system speaks more directly to real-world metaphors: instead of a sound file, we have an actual sound; instead of a text file, a story. In order to support this concept, activity developers may define object types and associated icons to represent them.

Another aspect of the system’s UI is a focus on the Journal. This is more than written documentation of what a child has done.

The Journal combines entries explicitly created by the children with those that are implicitly created through participation in activities; developers must think carefully about how an activity integrates with the Journal more so than with a traditional file system that functions independently of an application. The activities, the objects, and the means of recording all tightly integrate to create a different kind of computer experience.

I’ll be interested to see how children who grow up with the OLPC think about computers. I fear we have a generation of children whose creativity has been permanently stunted by the desktop metaphor.

The StorageMojo take
Negroponte’s biggest mistake is that he did not market the OLPC in the industrialized world first. All the good intentions in the world won’t convince the 3rd world that something is good unless it has been embraced by the opinion leaders of the 1st world.

If I was Steve Jobs, I’d be taking a very close look at this machine to see what I could steal. Michael Dell could learn a few things too.

Comments welcome. OLPC has a beautiful web site.

Save Internet freedom - from telcos, for users

December 13th, 2007 by Robin Harris in Architecture, Future Tech, Security & Public Policy

Mighty Google is worried about getting the shaft from telcos. Shouldn’t you be too?

Larry Downes imagines the worst
Larry Downes’ arguments against net neutrality are button-pushing propaganda designed to inflame, not illuminate. I expect better from a University of Chicago trained lawyer.

In response I’m going to look at the text of a net neutrality proposal and then at Mr. Downes’ mostly irrelevant points.

What is being proposed?
Let’s start with Congressman Markey’s proposed Network Neutrality Act so you can decide for yourself. The PDF is only 11 pages, while the dread regulations are barely 4 pages.

Here are the core “regulations” Mr. Downes is so afraid of. From the bill, Internet providers may

not block, impair, degrade, discriminate against, or interfere with the ability of any person to utilize their broadband service

for lawful content, applications and services. I expect no less.

Furthermore, service providers are required to

clearly and conspicuously disclose to users, in plain language, accurate information about the speed, nature, and limitations of their broadband service

Truth-in-advertising? Telco marketing will never adapt!

How about this requirement?

offer, upon reasonable request to any person, a broadband service for use by such person to offer or access unaffiliated content, applications, and services

Requiring telcos to take new customers? Tricksy Mr. Markey.

Here’s what gets the telcos mad
The bill requires that a telco

not discriminate in favor of itself in the allocation, use, or quality of broadband services or interconnection with other broadband networks

Isn’t that a Communist common-carrier requirement? Gee, why own a big network if you can’t screw your competitors? No wonder the telcos are miffed.

This gets them madder
Broadband service providers will be required to:

offer a service such that content, applications, or service providers can offer unaffiliated content, applications, or services in a manner that is at least equal to the speed and quality of service that the operator’s content, applications, or service is accessed and offered, and without interference or surcharges on the basis of such content, applications, or services

Hm-m? Requiring equal treatment of unaffiliated content? Just like telegraph companies had to 160 years ago? Medieval.

Now telcos see red
Here’s the heart of the matter. The law would require that

if the broadband network provider prioritizes or offers enhanced quality of service to data of a particular type, prioritize or offer enhanced quality of service to all data of that type (regardless of the origin of such data) without imposing a surcharge or other consideration for such prioritization or quality of service

[emphasis added]

The heart of the matter
The telco can charge more for time, speed or bandwidth, but they can’t charge more for preferential treatment of packets. This is what being a common carrier means.
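One way to picture that rule in code. This is a toy sketch, not language from the bill and not any real router's QoS interface; the traffic classes, priority numbers and function names are all hypothetical:

```python
from dataclasses import dataclass

# Hypothetical traffic classes and priority levels (lower number = served first).
PRIORITY_BY_TYPE = {"voice": 0, "video": 1, "web": 2, "bulk": 3}
DEFAULT_PRIORITY = 3


@dataclass(frozen=True)
class Packet:
    origin: str      # who sent it, e.g. the carrier's own service or a rival
    data_type: str   # what kind of traffic it is


def neutral_priority(packet: Packet) -> int:
    """Priority depends only on the type of data, never on its origin."""
    return PRIORITY_BY_TYPE.get(packet.data_type, DEFAULT_PRIORITY)


def discriminatory_priority(packet: Packet, favored_origin: str) -> int:
    """The kind of scheme the bill would prohibit: same data type,
    but everyone except the favored origin is pushed down a level."""
    base = neutral_priority(packet)
    return base if packet.origin == favored_origin else base + 1


own_video = Packet(origin="telco", data_type="video")
rival_video = Packet(origin="rival", data_type="video")

print(neutral_priority(own_video), neutral_priority(rival_video))
print(discriminatory_priority(own_video, "telco"),
      discriminatory_priority(rival_video, "telco"))
```

Under the neutral function the carrier can still prioritize video over bulk transfers; what it can't do is look at the origin field.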

The Downes critique, fearlessly knocking down straw men
Larry’s article is mostly a big cloud of smoke, irrelevant to the question of net neutrality:

  • Railroad asset accounting has nothing to do with treating packets equally
  • Airlines wanted the CAB’s regulation and fought to preserve it to avoid competition
  • SOX addresses another financial accounting problem

There are many examples of regulation that works: the drugs we take; the airlines we fly; the building codes that make our homes, offices, schools and factories safer.

Network designers demand non-neutrality?
Mr. Downes then concludes that net neutrality would stymie web engineers’ efforts to optimize Web traffic.

He might be referring to Bob Briscoe’s IETF problem statement We Don’t Have To Do Fairness Ourselves which discusses the unfair use of TCP, a protocol designed to be fair. Briscoe says the IETF needs to:

. . . focus on giving principled and enforceable control to users and operators, so they can agree between themselves which fair use policy they want locally.

This is very different than giving the telcos a blank check to impose anything on a captive audience of Internet users. All our history with monopolies and duopolies tells us that without basic ground rules the telcos will ream the users.

The deep end
Then Mr. Downes goes off the deep end, positing that a complaint would force the FCC to open every affected packet on the network to determine if a telco were violating the law. This is silly.

It would be far easier to monitor a sample of disputed traffic as it is injected and measure its performance across the network. But how likely is a complaint if the telcos are prohibited from discriminatory treatment? Why would they develop the ability?

What is much more likely is that a telco whose unpopular policies have alienated the public would want government protection. Politicians would provide protection - for a price - such as ready access to the databases that store your surfing habits.

The StorageMojo take
Ultimately, net neutrality is a choice between private exploitation of network users by opaque, profit-driven companies or publicly debated ground rules that set minimum standards. The telcos and their claques whine about how hard all this is, but I’m confident the engineers can solve the problems.

Mr. Downes - like George Ou - doesn’t address the issue of fairness between users and providers. If Google is worried about getting reamed by telcos, why aren’t you?

EMC’s Maui and everybody else

December 12th, 2007 by Robin Harris in Backup, Clusters, Enterprise, Future Tech

For some reason I volunteered to write something about vendors after the Wikibon con call today. That follows.

Vendors: responding to EMC’s cluster storage initiative

Context:
EMC’s support of cluster storage for archiving and backup will legitimize the technology. Vendors with competitive products have a window of opportunity to position themselves as a superior alternative. Make no mistake: EMC plans to own this market and will commit significant resources to the effort.
EMC’s market entry will be hobbled by several problems that competitors can exploit.

  • Immature software: limitations, bugs and the eval cycle that implies
  • Maintaining a bright line positioning between Hulk/Maui and Symms
  • 60% gross margin requirement

EMC will be NDA’ing strategic customers starting mid-January to build major sales to reference at announcement. Smart customers will be calling other vendors, including the smaller, innovative ones, for perspective. Luck favors the prepared.

Strategy:
IBM, Hitachi, HP, NetApp
IBM Global Services should be open to reselling/integrating suitable substitutes. There are efforts within IBM’s storage group to create a scalable, commodity storage infrastructure, but the chasm between IBM’s brilliant technologists and IBM marketing makes success problematic.

Hitachi doesn’t seem to be doing anything in this area. They will be looking at an acquisition and will take their time.

HP’s Polyserve acquisition may convince them that they have the cluster thing under control, but Polyserve isn’t competitive with EMC’s initiative. HP has a deep well of technology expertise from the DEC cluster products. Expect a cluster acquisition in 2008.

NetApp is vulnerable. ONTAP GX has missed the cluster market and their controller-based architecture has all the cost disadvantages of traditional arrays without the flexibility of clustering. Putting ONTAP 7G on commodity hardware bricks with software “mortar” - as Google does with GFS - would preserve their significant advantages with WAFL at a lower $/GB.

New competitors
Now is the time to get serious about what your product really does and what its appeal is to customers. Focus is critical to building a defensible position that can be used to win F500 business in areas where EMC is less competitive. There is also an opportunity to shift the terms of the customer debate. This market is still fluid and customers don’t have a clear mental map of the terrain. Smart, focused marketing can take advantage of that.

Action Item:
Small/new vendors: if you want to be acquired, now is the time to be shopping yourself to the big guys. If you want to build a big business, get your marketing focused on verticals and business justification.

Big vendors: start shopping now. EMC wants your scalp so you’ll want to be well-armed.

All: there is a lot more to know about Hulk/Maui. A focused competitive analysis effort will pay dividends.

Update: The audio is available here. If you are wondering if I mentioned your company, I probably didn’t.

Comments welcome, of course.

EMC’s Maui and the future of clustered storage

December 10th, 2007 by Robin Harris in Clusters, Enterprise, Future Tech

Here’s an invite to a Wikibon Peer Incite discussion I’ll be leading Tuesday, 11 December. Call in on Skype and listen and ask questions. EMC’s AR and competitive analysts will be there. Shouldn’t you?

If you aren’t hip to Wikibon it is worth checking out.

Here’s the blurb from the invite with the call in number and passcode.

This is a reminder that the next Peer Incite research meeting is scheduled for Tuesday December 11, 2007. The topic for this meeting is: EMC’s Maui and the Future of Clustered Storage.

Google’s development of one of the world’s largest storage infrastructures based on commodity components, without reliance on traditional array technology, was a huge wakeup call for the storage industry in general and EMC in particular. Recent comments by EMC’s CEO Joe Tucci indicate two new products from the company, Hulk and Maui will address the market for so-called ‘Cloud Computing’ and hit the market in mid-2008. It is estimated that 85% of corporate data is unstructured yet organizations continue to spend billions optimizing storage for the 15% of information that is traditional database-oriented. Will this continue to make sense?

Key issues we’ll address on the call include:

  • What does it mean to users that EMC is about to legitimize clustered storage?
  • What do these advancements mean for user investments in traditional array technology?
  • How will the industry likely respond to EMC’s attempt to lead this trend?

Here’s how to participate in the discussion:

  • Date: Tuesday December 11, 2007
  • Time: 12:00pm EST (9:00am PST)
  • Call in #: 218-486-1300 Passcode: 509215

Moderator: Peter Burris (http://www.wikibon.org/User:PBurris)

See you at the meeting.

The StorageMojo take
The IT consulting business - as in Gartner and everybody else - needs a good shaking. Wikibon is a good idea that may be part of the solution. Let me know what you think after you check it out.

Update: If you miss the call check out the Wikibon archive.

Comments welcome, of course.

Internet video’s performance/quality vise

December 5th, 2007 by Robin Harris in Future Tech, Information Management

Internet video is about where film was 100 years ago
I was talking to a company who will be announcing a video infrastructure solution when the CEO mentioned something he called the “video performance/quality vise.”

Here’s the problem: a video stream requires both capacity and bandwidth. Higher quality video requires more bits per second and more capacity. Bandwidth and capacity both cost money.

So as Internet video quality rises, the financial cost to provide the video rises too. An HD video stream is 4 Mbit/sec.
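To put a number on that, here is a back-of-the-envelope sketch. It assumes a constant bit rate and decimal gigabytes; the 4 Mbit/sec figure is the one quoted above, while the library size is a made-up number chosen only to show how the totals scale:

```python
def stream_gigabytes(mbit_per_sec: float, seconds: float) -> float:
    """Decimal gigabytes moved (or stored) by a constant-bit-rate stream."""
    bits = mbit_per_sec * 1_000_000 * seconds
    return bits / 8 / 1_000_000_000


HD_MBIT_PER_SEC = 4      # figure quoted in the post
ONE_HOUR = 3600          # seconds

gb_per_hour = stream_gigabytes(HD_MBIT_PER_SEC, ONE_HOUR)

# Hypothetical library size, for illustration only.
LIBRARY_HOURS = 100_000
library_tb = gb_per_hour * LIBRARY_HOURS / 1000

print(f"One hour at {HD_MBIT_PER_SEC} Mbit/sec: {gb_per_hour:.1f} GB")
print(f"{LIBRARY_HOURS:,} hours of video: {library_tb:,.0f} TB")
```

Each hour of 4 Mbit/sec video is about 1.8 GB to store once, and another 1.8 GB of bandwidth every time somebody watches it.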

500,000 channels and somethin’ on
As cute as YouTube, et al. are, they suck. Movies are small, picture and audio quality awful, and viewing options limited - like films 100 years ago.

Bandwidth limitations are part of the problem, at least here in the US. But those are being addressed, however slowly.

What happens when Internet video becomes competitive with broadcast TV in quality? Popularity will soar. As TiVo has shown, people love choice. And the Internet will have the most choice.

The price/performance/popularity vise
Digital Fountain’s Raptor codes will change the Internet landscape for video. High quality video will be much more popular, just as long-form movies took film to the next level.

Bandwidth costs are dropping fast to pennies a GB. So infrastructure costs - especially storage - are critical to Internet video’s commercial success. The more popular it gets, the more storage will be needed. It is a huge opportunity.

The StorageMojo take
Massive data storage is still a very young technology. The ultimate cultural impact will be more profound than film because of the many-to-many nature of the Internet and the low barriers to entry. Should be fun!

Comments welcome, please. I don’t think the firm wanted me to mention their name, so I haven’t. If we get that cleared up I’ll update the post. Or maybe wait a while to write about them.

Update: Joe, thanks for catching the 4Mbit mistake I made. I corrected it above.

Storage is power

December 3rd, 2007 by Robin Harris in Future Tech, Security & Public Policy

Not “knowledge is power” or “information is power.” If you can’t store it, search it and retrieve it, you’ve got bupkis, friend.

Massive storage is a double-edged sword
And we’ll be forever in sorting it out. Cases in point from the Volokh Conspiracy, a legal blog:

  • A gang of bank robbers used text messaging to plan their crimes. The prosecution subpoenaed the content of their text messages from the service provider, who evidently keeps them all. The defense says that’s wrong: text messages are speech and therefore need a warrant, not a subpoena, to access. Are text messages records or speech?
  • A North Carolina judge and candidate for re-election has evidently had a YouTube video depicting him in ethically questionable behavior pulled. Should politicians be able to hide such information from the public?
  • Should the government be able to subpoena Amazon for the customer records of a merchant believed to be evading taxes? The prosecutor, judge and the poster seem to be out of line: surely there are other ways of tracking Internet-derived income - like credit card or PayPal payments to the merchants. Why involve the buyers at all?

The StorageMojo take
It is tempting to think of massive storage as culturally neutral, since it is only storing what people produce. But just as the printing press helped broaden literacy and fueled the scientific revolution of the 17th century, massive storage broadens access to information in several ways.

  • As Gordon Bell is showing, we will soon be able to record every waking moment of every person’s life. How should that data be used, and who should use it?
  • Massive storage enables scientific advances that use statistics to tease out the truth. Like Partial Response, Maximum Likelihood (PRML) and those CERN shots, is reality merely probable?
  • The courts will soon twig to the fact that it is cheaper for companies to keep all their electronic data than it is to keep all the paper that has been required for many decades. Highly intelligent search will be required to make sense of it all. “Corporate responsibility” will take on a whole new meaning.

Comments welcome.

OpenSolaris: the universal storage platform?

November 29th, 2007 by Robin Harris in Enterprise, Future Tech

There’s a dark horse coming up on the outside
Isn’t Sun - and Solaris - almost dead? No, and they’re showing quite a bit of life in the storage arena. It is amazing what a $12 billion company can do with a unique strategy and deep engineering smarts.

One big change: after winning the 1.6 billion dollar anti-trust settlement against Microsoft, including a 10 year cooperation agreement, the 2 companies have embraced each other in ways - like storage - unthinkable 10 years ago.

CIFS support in the Solaris kernel
Sun’s steadily falling attach rate led me to give up on Sun as a storage vendor. But the new CEO, Jonathan Schwartz, has a new storage strategy: move storage functionality into the operating system. And there is a VP of Solaris storage software, Bob Porras.

The latest piece: CIFS. For years I’ve listened to engineers moan about the pain of implementing CIFS on non-Windows systems. Now I know why.

In a blog post Sun engineer Alan Wright explains:

There is a common misconception that Windows interoperability is just a case of implementing file transfer using the CIFS protocol. Unfortunately, that doesn’t get you very far. Windows interoperability also requires that a server support various Windows services, typically MSRPC services, and it is very sensitive to the way that those services behave: Windows interoperability requires that a CIFS server convince a Windows client or server that it “is Windows”. This is really only possible if the operating system supports those services at a fundamental level.

Solving those issues required 180,000 lines of new code in Solaris.

It gets better
They also made changes to ZFS (see my ZFS: Threat or Menace?) to support CIFS:

  • Support for DOS attributes (archive, hidden, read-only and system)
  • Case-insensitive file name operations.
  • Support for ubiquitous cross-protocol - NFS and CIFS - file sharing.
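For readers who haven’t lived in Windows-land, here is a toy illustration of what the first two items mean. It is not how ZFS or Solaris implements them: the class names are invented, and Python’s `casefold()` is only a stand-in for the case-folding rules a real CIFS server has to match. The four attribute bit values are the standard DOS/Windows ones.

```python
from enum import IntFlag


class DosAttr(IntFlag):
    """The four classic DOS attribute bits, with their standard values."""
    READ_ONLY = 0x01
    HIDDEN = 0x02
    SYSTEM = 0x04
    ARCHIVE = 0x20


class CaseInsensitiveDir:
    """Toy directory: names are stored as written but matched ignoring case."""

    def __init__(self):
        self._entries = {}   # folded name -> (name as created, attributes)

    def create(self, name, attrs=DosAttr.ARCHIVE):
        key = name.casefold()
        if key in self._entries:
            # "REPORT.doc" collides with "Report.DOC" on such a file system.
            raise FileExistsError(name)
        self._entries[key] = (name, attrs)

    def lookup(self, name):
        return self._entries[name.casefold()]


d = CaseInsensitiveDir()
d.create("Report.DOC", DosAttr.ARCHIVE | DosAttr.HIDDEN)
stored_name, attrs = d.lookup("report.doc")
print(stored_name, DosAttr.HIDDEN in attrs, DosAttr.READ_ONLY in attrs)
```

A Unix file system would normally treat `Report.DOC` and `report.doc` as two different files and has nowhere to keep the hidden or archive bits, which is why both had to be added for Windows clients to feel at home.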

Check out the storage community at OpenSolaris to see what else is cooking.

The StorageMojo take
OpenSolaris is becoming the finest storage platform out there. Adding CIFS support to the kernel is a Big Deal: OpenSolaris will be the industry’s first OSS universal storage platform.

Only a company with nothing to lose in the traditional big iron storage business could be so bold. My hat is off to Jonathan and Bob.

Update: For more detail on other SMB related changes, check out Doug McCallum’s Share Manager blog.

Comments welcome, as always. Yes, I know the Samba guys aren’t happy. One of these days I’m going to tackle the GPL vs CDDL thing and see if I can make any sense of it. I’m also wondering where all the givebacks are from the storage companies using OSS in their products.

Linux needs Open Source Marketing

November 21st, 2007 by Robin Harris in Future Tech

The limits of open source engineering
Hang with engineers for a while and griping about marketing is inevitable. The 3 Margarita lunches, the plush globetrotting, the hotties in Marcom and worst of all, they don’t understand the product.

Next bench marketing
Many engineers are quite good at marketing - to other engineers. HP got started with products by engineers, for engineers. Much of Silicon Valley is built on engineers building stuff for other engineers.

Engineers understand each other. They know how their minds work. Their problem-solving skills work in harmony. They know the problems and how to talk about them.

From a marketing perspective it is just about perfect.

Linux on the desktop
Where it falls apart is when a technologist is marketing to a non-technologist. Linux server vs Linux desktop.

Linux is huge in the server market. The developers and sysadmins who work on Linux have good communication. Red Hat engineers see the customer problems through RHEL support.

On the desktop, not so much. There is no Steve Jobs flogging designers and developers to make everything as simple as possible - but no simpler! - with good esthetics.

Consumer Linux so far
Linspire and Ubuntu have reportedly done a fair job in creating non-Linux guru versions of Linux that come with the codec, Wi-Fi, and application support that most users expect. How about taking people *beyond* what they expect?

Is there a Linux iLife equivalent? Or adding a suite of creativity apps for content creation, such as FreeMind (mind mapping), GIMP and Celtx (video production)? Your Linux guys will say “so what, it is easy to add those apps if you want them.” The marketing guy says “creating the perception of value is even more important.”

The prevailing thought seems to be to copy Windows and Office, based on the naive assumption that since everyone else runs Windows, that is what Linux desktops should look like. How about being *better* than Windows/Office?

What could be better?
Why not rip off the Mac instead? Most of the Windows auxiliary applications like Notepad are pretty lame. Surely Linux has superior versions. Then you could go from “looks like Windows” to “better than Windows” as a tagline.

The StorageMojo take
Linux on the desktop means Linux marketing that is target customer driven. Developers and/or Linus aren’t the people to drive this. It will take someone with money. Red Hat? Ubuntu?

It will also mean some changes in Linux governance, such as a “Real Linux Desktop” UI certification to start developing a more compatible look and feel across apps. I imagine getting Linux developers to buy into such a thing would be like herding cats. But if the Linux community wants a desktop presence they need to set their sights higher than Windows.

Comments welcome, as always. I was influenced by Walt Mossberg’s Linux review of a few weeks back, as well as my own use of OSS apps on my Mac. I’m thinking I’ll get an Eee early next year, which will really get me into Linux, ready or not.

Would you use OSS storage?

October 26th, 2007 by Robin Harris in Enterprise, Future Tech

Amazon and Google have demonstrated commercially - as have a number of research projects such as Microsoft’s Boxwood - that it is possible to build highly available storage out of commodity servers. The odd thing is that, AFAIK, there is no commercially supported open source software for people to try.

Hadoop hews too close to the Google model for most commercial applications. Lustre has a lot of great technology and the management complexity to back it up.

Where are the VC’s?
There are a number of venture-backed OSS companies. XENsource in virtualization. Vyatta in routing. But nothing in storage. I can’t figure out why.

Is there no demand?
I’m trying to think through the issues. I see a few possibilities:

  • No VC believes OSS storage clusters are usable in commercial operations.
  • No VC believes glass house IT will use them.
  • VCs believe the big guys, like Sun, will own the market.
  • No VC is smart enough to see the market. Storage arrays are only about $30 billion a year. Who could build a business on that?

None of those explanations are very satisfying. What do you think?

The StorageMojo take
It’s a mystery. OSS + commodity hardware = new value prop. A company devoted to developing and supporting that software would seem to be a natural. And yet - no such company.

Comments welcome, especially on this topic. What would it take for your company to use OSS storage? Where would you use it? Would you buy support for it?

Sun fires back at NetApp

October 25th, 2007 by Robin Harris in Future Tech

Sun’s CEO, Jonathan Schwartz, has fired back at NetApp’s patent suit against Sun over ZFS, the advanced file system that promises to markedly increase data integrity for Sun and Apple users.

From the “best defense is a strong offense” playbook
No court documents to look at yet, but in his blog Schwartz lays out a multi-pronged attack on NetApp:

  • “As a part of this suit, we are requesting a permanent injunction to remove all of their filer products from the marketplace, and are examining the original NFS license - on which Network Appliance was started.”
  • “In addition to seeking the removal of their products from the marketplace, we will be going after sizable monetary damages.”
  • Sun “. . . will continue to fund the aggressive reexamination of spurious patents used against the community (which we’ve been doing behind the scenes on behalf of several open source innovators).”

Update: David thoughtfully provided a link to Sun’s legal response. Thank you, David!

Playing to the galleries
Jonathan’s post is aimed at the open software crowd, not NetApp. He wants to turn this into a public relations war over NetApp’s support of open source. That is a battle that NetApp can’t win for (at least) two reasons:

  • Businesses worldwide are realizing that open source software is a Good Thing. By casting NetApp as an anti-OSS ogre, and Sun as a valiant defender, Sun strengthens its public image while tearing down NetApp’s. The IT pros who evaluate and recommend storage buys won’t soon forget who the bad guy is.
  • WAFL, NetApp’s file system, is based on a lot of prior art. If I read Jonathan correctly, that will be abundantly documented over the next year. Suing people over stuff you didn’t even invent makes you look like a whiner while it undermines your high-tech credentials.

The StorageMojo take
NetApp, like all the “big iron” storage vendors, is facing a soft market. No doubt the ZFS suit seemed like a good idea at the time, but NetApp has much more immediate problems. They need to get serious about settling this or their good public image will soon be permanently sullied.

With EMC set to announce storage clusters next year - sources tell me there’s been some slip, but it is still an ’08 announce - NetApp is going to be facing an even more difficult and dynamic environment. ZFS is a threat of sorts, but storage clusters are the gathering storm for all NFS vendors. Maybe it is time for some new blood at NetApp.

Comments welcome, of course. A note to my NetApp readers: it seems like I haven’t written about NetApp for months and then boom! this week it is all NetApp all the time. And the EMC guy just had to pile on in the comments. It is all coincidence, I assure you.

Update II: Dave Hitz has a thoughtful - expect no less from a Princeton grad - response to the Sun suit announcement.

Mac ZFS debate

October 15th, 2007 by Robin Harris in Architecture, Future Tech, Information Management

I’ve been a fan of ZFS since I researched it over a year ago. I’ve also been happy with the progress ZFS is making on OS X.

So it was a bit of a surprise when I saw (thanks Wes) that MacJournals, a developers’ web site, was all sideways about it.

A good conversation
Fortunately a former Mac file system developer, Drew Thaler, responded with Don’t be a ZFS Hater.

Another respected Mac developer, Michael Tsai, also responded with a thoughtful post.

The StorageMojo take
I follow the ZFS discussion on OpenSolaris, so I understand that the ZFS implementation has a ways to go. From a marketing perspective, ZFS or something like it is required if consumers are going to use computers as media centers for purchased content. Seeing a couple of thousand dollars worth of music, TV, movies and videos go poof! is a sure way to get tossed out of America’s living rooms.

I believe Apple developers have the Mojo to make ZFS use transparent for Mac customers. They certainly have the help of the Sun team and it is in the interest of both companies to make this work. Plus, don’t forget Apple’s “touchless” file system upgrade patent.

But MacJournals correctly points out that UFS was once thought - though without the level of support ZFS has enjoyed to date - to be the successor to HFS+ and that a similar fate may befall ZFS. While that is certainly a possibility - never say never around Steve Jobs - there are good business and marketing reasons for going forward with ZFS, regardless of what techies think. Apple will go forward with ZFS and make it the standard OS X file system within 2 years.

Comments welcome, as always.
Update: I’ve started editing comments on this post to keep them on topic and away from personalities. I regret not doing so sooner. Nonetheless the discussion is informative and if file systems interest you, well worth perusing.

pNFS technical intro

October 15th, 2007 by Robin Harris in Architecture, Clusters, Future Tech, NAS, IP, iSCSI

I don’t normally link and run but this is a good article on the Next Big Thing in NFS v4.1.

Written by 3 NetApp engineers, Garth Goodson, Sai Susarla, and Rahul Iyer, Standardizing Storage Clusters offers a good overview of what’s new. It’s on the ACM Queue web site.

If paragraphs like

protocol operations

The pNFS protocol adds a total of six operations to NFSv4. Four of them support layouts (i.e., getting, returning, recalling, and committing changes to file metadata). The two other operations aid in the naming of data servers (i.e., translating a data server ID into an address and getting a list of data servers). All the new operations are designed to be independent of the type of layouts and data-server names used. This is key to pNFS’s ability to support diverse back-end storage architectures.

get you interested, the article is well worth a read.
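
For the code-minded, here is a rough sketch of how those six operations split four-and-two. The operation names are the ones used in the NFSv4.1 specification; the grouping labels, the one-line descriptions and the `ops_in` helper are my own paraphrase for illustration, not protocol definitions.

```python
from enum import Enum


class Group(Enum):
    """Illustrative grouping labels; not part of the protocol."""
    LAYOUT = "layout"
    DEVICE = "data-server naming"


# Operation names as given in the NFSv4.1 specification.
# The descriptions are informal paraphrases of the quoted paragraph.
PNFS_OPS = {
    "LAYOUTGET": (Group.LAYOUT, "get a layout for a file"),
    "LAYOUTRETURN": (Group.LAYOUT, "return a layout to the server"),
    "CB_LAYOUTRECALL": (Group.LAYOUT, "server callback that recalls a layout"),
    "LAYOUTCOMMIT": (Group.LAYOUT, "commit changes to file metadata"),
    "GETDEVICEINFO": (Group.DEVICE, "translate a data server ID into an address"),
    "GETDEVICELIST": (Group.DEVICE, "get a list of data servers"),
}


def ops_in(group):
    """Return the operation names that belong to the given group."""
    return [name for name, (g, _) in PNFS_OPS.items() if g is group]


if __name__ == "__main__":
    for group in Group:
        print(group.value, "->", ", ".join(ops_in(group)))
```

Note that nothing in the table depends on whether the back end is files, blocks or objects, which is the point the authors make about layout independence.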

The StorageMojo take
pNFS is going to commoditize parallel data access. In 5 years we won’t know how we got along without it.

Parascale’s CTO on what’s different about Parascale

October 4th, 2007 by Robin Harris in Architecture, Clusters, Future Tech

Is Parascale new or old?
There were many good reader questions about Parascale’s announcement. Even though I’ve done some work for them I didn’t know the answers so I invited their CTO, Cameron Bahar, to respond. He sent me a text-only email, which I’ve decorated with some HTML to improve readability.

CTO Cameron Bahar:

Hi Robin,

We are delighted by the interest shown in both the file management challenges that Parascale seeks to address…and in our newly-announced solution. Your readers bring up many important issues, especially in regards to how existing solutions compare to Parascale. Permit me to try to group these questions into categories and to highlight how Parascale is different.

HPC solutions. High Performance Computing (HPC) solutions are typically implemented with kernel code and employ custom client-side software to achieve high bandwidth. For example, Lustre has been successful at many national labs as mentioned in one post. Parascale is targeting a different market. Parascale is all about industry standards. We support NFS, HTTP, and FTP protocols because we don’t expect our customers to recompile their applications. We want our software to be simple to use, as well as to scale in capacity and bandwidth for our target digital content applications.

Archival solutions. Several companies, including Archivas, have delivered archival systems. These solutions are generally WORM (write once read many) systems and disallow updates to existing files. By comparison, Parascale is POSIX-compliant and designed to support large read/write bandwidth—not always a requirement for archiving. Finally, if a large vendor has acquired these technologies (e.g. HDS-Archivas), they’re usually shipped as a rack of pre-installed appliances, limiting choice of hardware provider and hardware configuration.

Clustered file systems. Shared-disk clustered file systems such as Red Hat GFS have the characteristics of traditional distributed file systems, such as tight cache coherency, distributed lock management, and symmetric topology. Scalability of these file systems is generally limited to 16 or 32 nodes due to heavy cache coherency traffic and message passing between nodes.

Members of our engineering team have written several clustered file systems in previous undertakings. From that experience we elected to adopt a very different architecture for Parascale. For starters, we elected to adopt a loosely-coupled architecture for scalability. Further, we chose not to write a new file system. File systems are very delicate (as we know by having written them in the past) and they take 5-7 years to fully stabilize and stop corrupting data. We simply aggregate existing file systems to present a “virtual file system” layer to clients/applications over standard protocols.

Appliances versus software. NAS appliances are ideal for many markets, like SMBs and enterprise workgroups, that need simplicity of installation and for which scalability in volume and bandwidth are not key requirements. Appliances generally employ hardware highly-customized for serving files, including hardware features like NVRAM to boost write-performance and RAID controllers for data redundancy.

Parascale seeks to solve a different problem, that of managing large digital content repositories. Think of video on demand, photo archives, medical imaging, seismic data, and genomics data. Don’t fault us for being inappropriate as secondary storage for an RDBMS. We didn’t design Parascale for block storage because many excellent products already address this market.

We’ve constrained our solution to run as an application (with no kernel code) on industry-standard servers, as qualified only by Red Hat. We want our customers to enjoy the very latest advances in server hardware (motherboards, processors, memory, disks) available from Dell, HP and others. And we want our customers to be able to buy servers from their “regular hardware vendor.”

Parascale’s software-only solution lets our customers tune the disk capacity, CPU, RAM, I/O and network bandwidth independently—as required by the application at hand. Growth can be incremental—one disk drive or server at a time. You never have to discard hardware or licenses. Another useful benefit of a software-only solution is that other applications can coexist on the Parascale storage nodes, allowing data mining, trans-coding, encryption, or compression on the servers where the data resides. This is not possible with closed appliances.

What qualifies as a “software-only” file storage solution? Our perspective is, first, that the software has to support standard network file access protocols like NFS, HTTP, or CIFS. You can store files in an RDBMS, but that doesn’t make it a software-only file management solution. Second, the disk drives must be direct-attached to the servers. Shared disk distributed or parallel filesystems (over SAN) are software products, but don’t qualify because they require specialized SAN hardware on the back end.

Finally, because all our engineering resources are focused on software, we’ve been able to innovate (with patents to prove it) and to deliver features like transparent, automated file migration (to eliminate server hot spots) and replication (to raise read bandwidth). And our roadmap promises a lot more innovation to follow!

Asked another way, where does Parascale fit in the market? Choose us if:

  • You want industry-standard hardware (e.g. because you want to run applications on the storage nodes, or because you have corporate hardware standards).
  • You need more bandwidth than one server/head can provide.
  • You need the benefits of data mobility across servers (e.g. migration to balance data and eliminate hot spots, replication to increase read bandwidth, smart load balancing to optimize system performance).

Lastly, Parascale aspires to be new and modern in its business model. When our product goes production, we plan to allow you to download our software to try it out at no cost. We’re confident you’ll like it. Our pricing is per-spindle, so you never have to deploy or pay for storage capacity before you need it. And if a drive fails, replace it with a new drive in the manufacturers’ current sweet-spot; we’re not trying to make money on advances by the disk drive manufacturers.

Hope I’ve addressed some of the questions posted. I applaud the thoughtful discussion that your post has prompted.

Best,

Cameron

Comments welcome, of course.


