A particularly odd bit of goofiness has hit the infosphere: cloud/utility computing mania. Nick Carr has written a book – a sign of the Apocalypse. IBM has announced, for the umpteenth time, a variation on utility computing, now called cloud computing. Somebody at Sun is claiming they’ll get rid of all their data centers by 2015.
R-i-i-i-ght.
You know the flying car in your garage?
The syllogism is:
- Google-style web-scale computing is really cheap
- Networks are cheap and getting cheaper fast
- Therefore we’re going to use really cheap computing over really cheap networks Real Soon Now
Can you spot the fallacies?
Fallacy #1: Google is Magick
The world’s largest Internet advertising agency does have the cheapest compute cycles and storage (see my StorageMojo article Killing With Kindness: Death By Big Iron for a comparison of Yahoo’s and Google’s computing costs). But they do nothing that the average enterprise data center couldn’t do if active cluster storage were productized.
Google built their infrastructure because they couldn’t buy it. They couldn’t buy it because no one had built it. But all Google did was package up ideas that academics had been working on, sometimes for decades. Google even hired many of the researchers to build the production systems. Happy multi-millionaire academics today!
Blame vendor marketing myopia for missing that opportunity. But their eyes are wide open now. If your enterprise wants cluster computes or storage, you can buy it.
Fallacy #2: Networks are cheap
Or they will be Real Soon Now.
10 Mbit Ethernet from Intel, DEC and Xerox came out in 1983. A mere 25 years later we have 1000x Ethernet – 10 GigE – starting down the cost curve.
About the same time, a first-generation 5 MB Seagate disk cost $800. Today a 200,000x disk – 1 TB – costs 300 vastly cheaper dollars.
Also in 1983 the “hot box” – the VAX 11-780 – with a 5 MHz 32-bit processor and a honking 13.3 MByte/sec internal bus cost – for you, a special price – $150,000. Today a 64-bit, 3 GHz quad-core server – with specs too fabulous to compare – is $1300. Call it 1,000,000x.
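For the curious, here’s a quick Python back-of-envelope that reproduces those multipliers from the post’s round numbers. The server line counts only clock and cores, so credit the 64-bit words and wider buses for the rest of that 1,000,000x.

```python
# Rough 25-year improvement factors, computed from the post's round
# numbers. All inputs are the article's approximations, not measurements.

# Networks: raw capability, 10 GigE vs. 10 Mbit Ethernet
print(f"Ethernet: {10e9 / 10e6:,.0f}x")

# Disks: dollars per byte, 1983 vs. today
disk_then = 800 / 5e6        # $800 for 5 MB
disk_now = 300 / 1e12        # $300 for 1 TB
print(f"Disk $/byte: {disk_then / disk_now:,.0f}x")

# Servers: naive clock x cores per dollar (ignores word size, bus width)
cpu_then = 5e6 / 150_000     # 5 MHz VAX-11/780 at $150,000
cpu_now = (4 * 3e9) / 1300   # 3 GHz quad-core at $1,300
print(f"Compute/$: {cpu_now / cpu_then:,.0f}x")
```

Ethernet comes out at 1,000x while disk dollars-per-byte improve by half a million times; the gap speaks for itself.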
Networks are the laggards, which is why Cisco commands such a premium over the server and storage folks whose products have improved so much faster: networks are the bottleneck, and optimizing the bottleneck has an incredible payback.
Hey, Cisco! Get the lead out!
What’s really going on?
There are – currently – economies of scale, which Google is exploiting and MSN and Yahoo! aren’t. So the latter two are going out of business.
But when you look at the cost of going across the network compared to the cost of the rest of the infrastructure, you realize that local – what we used to call distributed – computing is the only way to go.
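A hedged sketch of the gap, assuming a hypothetical 1 TB working set and illustrative 2008-ish transfer rates:

```python
# Move the data to the computation, or the computation to the data?
# Time to read a 1 TB working set at assumed, illustrative rates.
TB = 1e12

rates = {
    "one local spindle (80 MB/s)":     80e6,
    "10-disk local stripe (800 MB/s)": 800e6,
    "100 Mbit WAN link (12.5 MB/s)":   100e6 / 8,
}

for label, bytes_per_sec in rates.items():
    print(f"{label}: {TB / bytes_per_sec / 3600:5.1f} hours")
```

One spindle finishes in about 3.5 hours, a modest stripe in 21 minutes, the WAN in nearly a day – and the WAN number only gets worse as working sets grow.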
Ergo, cloud computing will remain in the clouds and real computing will remain local.
Comments welcome, as always.
For once, I disagree with you. See my rebuttal:
http://www.daniel-lemire.com/blog/archives/2008/01/28/the-network-is-the-bottleneck/
And what about the security implications of “cloud computing?”
Security has been an afterthought in so much of technology history, it’s embarrassing. We’ve got computer viruses on picture frames now, for god’s sake! When we can make appliances that are virus-free, then I’ll trust my data to the clouds.
Cloud computing? I can’t get the visions of Skynet and the terminator out of my head.
Cisco is taking a step, announcing a 15 Terabit backplane switch that will support 100 Gigabit Ethernet. But it’s going to be a long time before the world moves to a 100G standard. http://www.cisco.com/en/US/products/ps9402/index.html
Also, none of this is really new, even conceptually. The grid folks have been pushing the ‘compute-anywhere’ vision for years.
The issues that limit grid aren’t being solved by calling it ‘cloud’. Data is valuable – companies need to control where it is and who gets access. The algorithms being executed can also be extremely valuable and are trade secrets in many industries. Moving data is slow and expensive, and the trend isn’t for this to improve (data sets are doubling every year in many industries). CPU cycles, however, are cheap and getting cheaper.
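To put a number on that trend, here is a sketch assuming data doubles yearly while link speeds improve at Ethernet’s historical pace (1,000x over 25 years – an extrapolation, not a forecast):

```python
# Data doubling yearly vs. links improving at Ethernet's historical pace.
data_growth = 2.0
link_growth = 1000 ** (1 / 25)   # ~1.32x per year, an assumed extrapolation
print(f"link growth: {link_growth:.2f}x/yr")
print(f"transfer-time growth: {data_growth / link_growth:.2f}x/yr")  # ~1.52
```

Under those assumptions, shipping the same industry’s data set takes roughly half again as long every year.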
Google is a very special case with many advantages when it comes to ‘cloud’ type computing – however even they are secretive about algorithms used, and I doubt they’d be interested in letting their computes be shared by just anyone.
I’m always a bit concerned when cost curves for different product groups are combined, because the answer depends on which factors you choose. Yes, the cost per MB of disk storage has dropped several orders of magnitude faster than networking has – although comparing a 10 GbE switched network with a 10 Mbps shared-bus topology isn’t quite right; there’s far more than 1,000x the data-carrying capacity in a 10 GbE switch compared with a coax LAN (which was very expensive in its day).
However, if we look at different performance metrics – IOPS and MBps in the hard disk market – we get a very different picture. Today’s 15K drive can do maybe 180 random IOs per second and perhaps 80 MBps. Those are, respectively, around 5 and 100 times better than what was available 25 years ago. Given that many applications are limited by storage performance, this is a real issue.
So when you make your comparison, be careful about the choice of metrics: you get wholly different answers depending on what you choose. Many of the reasons are governed by physics and geometry. Disk storage capacity increases as the square of linear density (as, famously, does semiconductor capacity). Sequential read speed, however, goes up only linearly with bit density, constrained by the physical limits of moving parts. Basic network speed for a single serial link is constrained by all sorts of things, including transistor switching speeds (which have increased by maybe 1,000x in 25 years). That’s another of those linear relationships.
So square relationships (storage capacity, RAM, aggregate switching capacity, etc.) proceed at a different rate from linear ones (like sequential read speed), with mechanical ones far behind both.
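The square-versus-linear point is easy to sketch. The density factors below are hypothetical, and the IOPS figure falls straight out of assumed 15K-RPM mechanics:

```python
# Toy model: the same linear-density gain compounds differently.
for d in (10, 100):
    print(f"linear density {d}x -> capacity ~{d**2:,}x (two dimensions), "
          f"sequential speed ~{d}x (one dimension, fixed RPM)")

# Random IOPS are set by mechanics, not density. Assumed 15K-RPM figures:
seek_ms, rot_ms = 3.5, 2.0   # average seek + half-rotation latency
print(f"~{1000 / (seek_ms + rot_ms):.0f} random IOPS per spindle")  # ~182
```

Real drives muddy the exponents with RPM and track-pitch changes, but the shape of the curves holds.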
You make very good points, some of which I agree with in my post:
http://www.qrimp.com/blog/blog.The-Open-Cloud—-the-future-of-cloud-computing.html
The important issue right now, though, is that there are myriad applications for cloud computing where network latency will be an issue regardless of the hosting environment. What cloud computing offers over traditional single-server hosting is scalability for multitenant apps. You don’t get that with colo, shared servers, or managed hosting.
And that is why it isn’t hype — it’s awesome.
Always like to see info on cloud computing! Looks like Australians are starting to wake up to it, with Telstra announcing a $500m spend on cloud computing services this week.