Grid is dead

by Robin Harris | Monday, November 26, 2007 | Architecture, Clusters | 13 comments

Was it ever alive?
Techies have been excited about grid computing for years. The rest of the world never caught on. They probably never will.

Who killed grid?
A slide from Sun’s recent HPC meeting at SC’07 implicated marketing:

The Grid is Dead…

Term falling into discredit

Misuse by Corporate marketing departments, including multiple contradictory definitions

Failing to live up to expectations

Built for specific community

Unscalable

Expansion efforts only make the situation more unwieldy

The death of grid had two causes: a) nobody could agree on what a grid was and b) grids were too specialized and didn’t scale.

Yup, marketing killed the grid. If only they’d agreed on a single contradictory definition.

Simple is better
Techies seem to have a mania for inventing a new term when a perfectly good term already exists. In the case of “grid” the term “cluster” would have served admirably.

Among aficionados, a grid is not much like a cluster. To the folks who approve capital authorizations it is hard to tell the difference. Except grids are new and unproven while clusters have been around for decades.

Polite laughter
At IDEMA there was some back and forth between Seagate and other vendors over what to call the recording method that uses lasers to heat the media. Seagate researchers were pushing the catchy “HAMR” – pronounced hammer – an acronym for Heat Assisted Magnetic Recording. Everyone else seemed to like “TAR” for Thermal Assisted Recording.

At one point a senior researcher said he’d been talking to a civilian who asked about the “hummer” technology. Most of the assembled laughed and took this as evidence that HAMR was a bad acronym.

In truth, it is simply evidence that a research PhD does not a marketer make. Disk drives suffer from a dearth of sex-appeal, especially against cool newcomers like NAND flash. HAMR is a much sexier acronym than TAR. And a much better acronym to help keep consumers interested in disk drives.

You know, consumers. The people who buy most of the world’s disk drives.

What if RAID had been SMAD?
Would a Spare Matrix of Autonomous Disks have taken the storage industry by storm the way RAID did? No way. People, even techie people, are social creatures. We use language – even jargon – to signal to others our values and knowledge.

People like RAID so much that even after the original words ceased to have commercial meaning – all disks are inexpensive today – they just changed “inexpensive” to “independent”. All to keep a cool acronym.

“God is dead.” -Nietzsche. “Nietzsche is dead.” -God.
The Sun presentation’s point was that a new middleware for grid computing was needed so users didn’t need to worry about where and how their jobs would get done. Just ship it up to that big batch mainframe in the cloud, sip your Red Bull and Jack, and let thy work be done.

The StorageMojo take
Product names aren’t all that important. But paradigm-shifting names, like RAID, client-server, clusters, NAS, pNFS and SAN are. Which is the real reason “grid” is dead.

Grid just isn’t suggestive of anything to people who aren’t intimately involved with it already. RAID and HAMR suggest something forceful while TAR sounds like a place I don’t want to go.

So yes, bad marketing killed grid computing. They should have asked the guy who came up with RAID.

Update: The technique of lashing together lots of boxes to perform work isn’t dead. We’re at the beginning of that trend. What is dead is the term “grid”. Its unfortunate connotations of rigid structure and the concomitant “gridlock” don’t do the technology justice. “Cluster”, whether heterogeneous or homogeneous, local or wide area, storage, compute or both, is a much better understood term.

Update II: In the comments on The Blog of Scott Aaronson about the much-disputed D-Wave quantum computing demonstrations, Aaronson, a postdoc at the Institute for Quantum Computation at the University of Waterloo, wrote:

I think that computer scientists have been much too eager to coin acronyms, and that this has damaged the public perception of our field relative to physics (which has cool names like “quark”, “supersymmetry”, “black hole”…) . . . .

He’s right. If you had to convince yawning non-scientists to fund billions of dollars in sub-atomic physics research you’d probably develop better marketing muscles too.

Comments welcome, of course. If you don’t like cluster, what would you call grid computing?

13 Comments

Tom Burns on Monday, 26 November, 2007 at 5:59 pm

How about RAIC == Redundant Array of Inexpensive Computers (pronounced rake)
Robert Chien on Tuesday, 27 November, 2007 at 12:06 am

Doesn’t compute grid live on in the form of botnet, the kind that generates spam?
ian Osborne on Tuesday, 27 November, 2007 at 2:16 am

Nice try but grid is far from dead. The concepts of sharing and dynamic distribution of work across a heterogeneous infrastructure are fast becoming the mode of work for leading organisations. Virtualisation has embodied these concepts and the big web service providers, Google, Amazon and eBay, depend upon them, albeit in home-grown form. The UK academic world has a national grid service used for research and a myriad of campus grids. The financial sector has several organisations running enterprise grids. So grid is far from dead; it’s empowering a new generation of service oriented infrastructures.
Wes Felter on Tuesday, 27 November, 2007 at 12:40 pm

I thought PC vendors bought most of the world’s disk drives. When I buy a computer, the system vendor doesn’t even tell me what vendor the disk is from; forget about HAMR or other features.
MartinsK on Wednesday, 28 November, 2007 at 12:12 am

And to add, there is also a strong GRID movement in Europe (and it gets EU funding too), and the Baltic states have created BalticGrid. It is more academic, but that is as usual: at the beginning it is a scientific project, and then business catches up. I think grid could still survive.
Vic on Wednesday, 28 November, 2007 at 3:45 am

How about SWARM, Single Wide Array of Resource Modules?
Serge P. Nekoval on Wednesday, 28 November, 2007 at 7:14 am

>So grid is far from dead, its empowering a new generation of service oriented infrastructures.
Yeah yeah. Since no one can really tell you what “grid” and “service oriented” are, these terms seem to be applied to practically everything.
M.S. on Wednesday, 28 November, 2007 at 7:14 am

I have long been watching what is being done in the scientific area of grid computing, and I haven’t seen any project mature to the point that it went into production. Currently there are plenty of these research programs running in the EU alone, and I doubt any of them will go into production in the next few years.
In my opinion the whole grid community has somewhat failed. I remember that the Globus Toolkit didn’t really scale beyond 47 nodes (although it’s been a long time since these results were obtained; I think it was Globus Toolkit version 2). Maybe (and I believe so) the Globus Toolkit is in better condition now. Nevertheless, I cannot remember any grid result making it into business use…

I agree that from a marketing perspective the term “grid” is dead – who, that is sane, would ever call a 2-node “cluster” a grid (although Oracle does)?!

Nevertheless I believe the grid idea will live on under the term “cloud computing” that recently emerged (AmazonWS, IBM’s Blue Cloud, etc.): “The grid is dead, long live cloud computing”.
Liam Newcombe on Wednesday, 28 November, 2007 at 11:36 am

The term ‘Cluster’ has unfortunate connotations and would need to be at least qualified to describe grid. For years the term cluster has been used to mean ‘monolithic 1+1 cluster with expensive shared disks used to prop up an application that has no inherent high availability’. These clusters were expensive, fragile, prone to common cause failures, difficult and expensive to maintain, and only in use because the IT industry was locked into legacy monolithic applications. Where basic grid is fundamentally different is that the applications themselves have to be written competently, with the understanding that they will be running in several locations and the ability to partition and replicate state and data as necessary. This removes the requirement to compromise your ‘high availability’ system with fault propagation paths such as hardware replicating disks (synchronous remote data failure capability) and OS-level clustering. Trying to prop up legacy, monolithic applications with expensive tin and complex underpinnings was never a long term solution.
Perhaps we should qualify the terms as:
1) Cluster (Legacy or Monolithic) to indicate flawed and overpriced
2) Cluster (Grid or Distributed) to indicate useful cluster technology
xfer_rdy on Wednesday, 28 November, 2007 at 12:34 pm

The other day, I was looking at another ZDNet article (Dan Farber) on Amazon’s EC2 leasing 1.7GHz CPUs for $0.10 per CPU hour. Link: http://blogs.zdnet.com/BTL/?p=3540 There was also a David Berlind (also ZDNet) video presentation, which I can’t find now, about how there was no real CPU executing your services, all virtual. No CPU? Interesting, it must be magic… Why can’t I get a job to say stuff like that?

Today, there are few storage manufacturers. Unless there is a shift in markets, there won’t be any small storage manufacturers at all. The few small storage companies are learning that they have to be vertical application providers, which almost makes them VARs and, hopefully, acquisition opportunities for larger storage companies.

We are seeing the same model emerge in the “for lease” CPU/web services. Most companies that believe they have a need for high availability computing services are probably too small to have the capacity to build and maintain their own grid platforms. The academic and scientific grid platforms still require a high degree of technical expertise to integrate, operate and maintain.

Efforts to deploy “grids” in the SMB market have probably come too early in the market’s technology life cycle and in the maturity of grid software. Cost effective blade technology has only emerged in the last year for these markets. GigE and SAS interconnects have really helped in this area. Once hardware platforms that can support grid software are deployed, we should expect software to migrate in 3-5 years.

One company to watch in this area is VMware. They have a basic software product that can manage service execution “images” and push them between CPUs. They are the real advocates for moving grid concepts and software technologies into the SMB markets. Now, does VMware’s product follow the current definition of grid computing defined by the academic and scientific communities, like Globus? No, but that’s not what the market needs today. The market needs easy: compatibility with applications without having to rewrite them, operating on currently deployed hardware, working on conventional networks.

Is the term “grid” dead? Only from the media’s perspective. On the deployment front, this could be the calm before the storm. If the transition to “grid” is managed correctly, we shouldn’t even notice it’s happened.
xfer_rdy on Wednesday, 28 November, 2007 at 1:08 pm

I hit submit before I finished….

On the storage end of this business, you are correct about the storage industry killing innovation in computing platforms. Both computer and storage systems designers, I believe, have lost sight of the fundamental objectives for designing systems.

I really think storage marketeers need to be careful about what they are telling the market, before someone calls “bullshit” again. It happened with CERN’s bit error rate study, it happened with iSCSI, now it will happen with RAID or whatever they are calling it.

People in the market are getting smarter. They know that spare pools of disks are the same whether you call them “raid spares” or SMAD. They figured out quickly that iSCSI was good for long haul and a miserable solution for high utilization applications. CERN published that their data integrity is awful.

The real question is… when is the media going to admit storage is dead ?
Bill Todd on Thursday, 29 November, 2007 at 2:30 am

Liam –

VMS invented the term ‘cluster’ (and to a significant degree the underlying concept as well) in 1983, and it never remotely resembled the description which you present above. By the mid-’80s it nominally supported up to 96 cooperating nodes (though considerably larger systems were created in practice), mirroring between nodes using local storage if you didn’t care to use directly-shared storage, and extremely robust operation with *no* single points of failure (clusters have functioned well across distances of up to 500 miles, and test clusters have functioned – albeit with the expected speed-of-light limitations – across geosynchronous satellite links). In other words, VMS has for over two decades supported considerably richer facilities than the current GRID approach does in a manner that extends the single-host paradigm to distributed systems, thus allowing both naive applications designed for single-node operation and more sophisticated cluster-aware applications to coordinate their distributed instances *without* having to worry about their own execution location, or about the location and replication of data, or about coordinating distributed shared access to it, or about creating their own distributed ‘liveness’ protocols (all of which can be handled as they would be among multiple cooperating applications executing on a single host node).

IBM mainframes acquired similar facilities in the early ’90s – they called it ‘Parallel Sysplex’ rather than ‘clusters’. At about the same time they began incorporating more limited clustering facilities into AIX (under the ‘HACMP’ acronym).

Only the lagging remainder of the Unix market and Microsoft have ever tried to pass off simple server-mirroring (which Novell pioneered for lower-end use back in the early ’90s) as a ‘cluster’, and even they now offer larger and more flexible configurations.

So a) the term ‘cluster’ has considerably richer connotations than you appear to be aware of and b) in actual use clusters offer far better support for distributed applications than the ‘GRID’ architecture as you describe it (because it usually makes a great deal more sense to implement common facilities of any real significance in the operating system where everyone can benefit from them than to make applications roll their own). There may, of course, also be room for an additional more loosely-connected configuration (called a GRID, or whatever) for environments with even higher scaling requirements – and in fact HPC environments already implement several such, with continuing development in the open-source community (though whether there’s much *commercial* need in this area is yet TBD). There’s also room for specific applications such as immense storage farms, but their needs are pretty specialized (such that only the most basic ‘GRID’ mechanisms would likely be usable).

– bill
Brian Smith on Tuesday, 28 December, 2010 at 6:09 am

I first encountered GRID computing when I came across a Cancer Research project which was itself modelled on the SETI methodology.

This involved the coming together of individual computer owners (with the necessary on-line capabilities) who donated their “spare” computing capacity to cuckoo type programs which were downloaded to their (typically) PCs and which ran whenever the PC was on but not doing anything.

New versions and data were distributed to the membership and the results collected on a periodic basis. It is worth noting that several references to the project use the term “distributed computing” which might seem more appropriate than GRID.

United Devices took this self-tested model and tried to commercialise it and, as far as I know, so did IBM under the marketing term GRID computing. I think it failed?

The Cancer project has ended – see here for more details http://www.chem.ox.ac.uk/curecancer.html – while the SETI project is still going.

It would be interesting to compare the economics of recruiting and managing a volunteer based network, and all its variables and essential redundancy, with a purposed investment in a specific system based on the now available “cloud” networks. The first purports to be “free” while the second has very definite costs but delivers a predictable performance in a specifiable time frame.