A reader sent me a note asking a question that I think is on the minds of many data center folks. He said it well himself, so I’ll quote liberally, starting with the compliment.
I really enjoy reading your blogs!
One thing I’ve noticed in your blog, other blogs, and all over the computer media is how Google keeps coming up as an example. It’s how Google leverages commodity servers, drives, custom software (Google filesystem), etc, etc. It’s as if everyone thinks Google should be emulated.
Ok . . .I can buy that . . . but I wonder, how much of what Google does is really applicable to my work. I work for a utility. The vast majority of our IS computer resources go toward transaction processing type systems.
What you _NEVER_ read about Google is what computer resources they use for their internal business processes. You only hear about what resources they use for their products (search, gmail, maps, etc). When I read about how wonderful the Google infrastructure is and how it should be emulated, I always start wondering if the described infrastructure (commodity server, cheap disk, google filesystem and massive parallelism) are also used for their billing, payroll, accounting systems, and whatever? Do they use custom written software for these, or Oracle apps or SAP? Do they use a database, which one? What disk systems do they use with it? How is it laid out?
It’s always interesting to hear how Google does things, especially since they are so secret about it, but I’m not convinced that what does come out is useful to us, or is a very complete picture of their infrastructure. I guess I wonder if buried deep in their datacenters is a more normal infrastructure like ours.
Good questions – ones I’ve often asked myself.
Google the 10,000 person company doesn’t have nearly the clout that Google, the world’s largest internet advertising company and, more importantly, buyer of 500,000 servers a year, does. If they asked IBM to clusterize MVS to win an order they’d get laughed at just like anyone else.
But dangle a few hundred million a year in front of Intel while asking them to do things their engineers want to do anyway and the reception is much warmer.
I suspect that their offices and internal data centers look a lot like yours, at least for the database business apps – the corporate underwear. But I bet they back up their unstructured data on GFS – why not?
Linux, PCs and Macs
I know they use Macs and PCs and that, at the very least, they outsource some of their IT work to people using Microsoft server products. They may even have Microsoft servers inside the company, though I’ve never seen evidence of it.
However, I have never held up Google’s infrastructure as one that could be used to count money. Check out the StorageMojo take on the Google File System, where I said as much.
Amazon is a different story
The more appropriate example is Amazon. They have millions of customers, they count billions of dollars, they customize each web page on the fly and they do it with a services-based distributed architecture based on open source software clusters. They scale well. And they arrived at that architecture only after trying all the “enterprise” products, including a mainframe. They not only built it, they migrated to it from a very large installed base.
If Werner Vogels ever decides to build his own company, that would be the pitch.
Amazon does transaction processing on a cluster. That is the enterprise problem.
Amazon is the company IT architects should be studying. They just don’t publish very much.
Enterprise forever
I don’t believe that “enterprise” hardware and software are going away in my lifetime, any more than the mainframe has or probably will. What will shift is the growth. When the market shifts, the weaker players will fold or consolidate, just as they did in the mainframe market.
But with 85%+ of digital data in ordinary files, even mid-range RAID solutions are overkill. Big blobs of cheap cluster storage would solve all kinds of IT problems. Backup window closing fast? Back up to a storage cluster sized to be a 6-10 week FIFO buffer. I suspect there are many data center applications for cheap cluster storage today if someone offered a reasonable product and notoriously conservative IT managers tried it.
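For anyone who wants to put rough numbers on the FIFO buffer idea, here is a minimal Python sketch. Everything in it is an assumption for illustration: the names `fifo_buffer_capacity_tb` and `FifoBackupBuffer`, the 20% default headroom, and the simple rule of evicting the oldest backup set first. It is not how any particular product works, just the sizing arithmetic and the FIFO behavior in their plainest form.

```python
from collections import deque


def fifo_buffer_capacity_tb(daily_backup_tb, retention_weeks, headroom=0.2):
    """Capacity (TB) needed to hold `retention_weeks` of daily backups.

    `headroom` is a hypothetical safety margin (fraction of extra space).
    """
    if daily_backup_tb < 0 or retention_weeks <= 0 or headroom < 0:
        raise ValueError("inputs must be non-negative, with at least one week")
    return daily_backup_tb * retention_weeks * 7 * (1 + headroom)


class FifoBackupBuffer:
    """Toy model of a storage cluster used as a FIFO backup buffer."""

    def __init__(self, capacity_tb):
        self.capacity_tb = capacity_tb
        self.used_tb = 0
        self._sets = deque()  # (label, size_tb), oldest on the left

    def add(self, label, size_tb):
        """Store a backup set, evicting the oldest sets if space runs out.

        Returns the labels of any sets that were evicted.
        """
        if size_tb > self.capacity_tb:
            raise ValueError("backup set is larger than the whole buffer")
        evicted = []
        while self._sets and self.used_tb + size_tb > self.capacity_tb:
            old_label, old_size = self._sets.popleft()
            self.used_tb -= old_size
            evicted.append(old_label)
        self._sets.append((label, size_tb))
        self.used_tb += size_tb
        return evicted


if __name__ == "__main__":
    # Assumed workload: 2 TB of backups a day, kept for 6 to 10 weeks.
    for weeks in (6, 10):
        print(weeks, "weeks:", fifo_buffer_capacity_tb(2, weeks), "TB")
```

The point of the toy model is that nothing is ever deleted by hand: new backups push the oldest ones out, so the retention window falls out of the capacity you buy and the volume you write each day.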
Enterprise growth rate
Moore’s Law is driving up CPU power faster than enterprise application growth rates. The enterprise market share has been shrinking for years, and in the next five years that market’s growth could stall entirely.
The StorageMojo take
Google is a fun story, the way Microsoft was in the 1980’s. They picked up a lot of ideas that folks had been working on for, in some cases, decades and rolled them out in a big way. They’ve produced something we’d never seen before even though much of it was percolating around CompSci departments for years. The antics of the boy billionaires make good copy.
The real power of Google will be seen when the computer scientists who are now multi-millionaires get tired of working for a big company and decide to see if lightning can strike twice. For most of them it won’t, but what the hey, they didn’t go into it for the money anyway. They’re the hot rodders of the digital age, channeling, chopping, stroking and boring the bits to create beauty, handling and speed. With luck, all three.
Comments welcome, of course. Have a good weekend!
I read an interesting interview with Google’s CEO Eric Schmidt in Wired that answers some of this question. Either way, it’s a good read (http://blog.wired.com/business/2007/04/my_other_interv.html). Text follows:
“When you got here in 2000 it was a relatively new company run by people (Larry Page and Sergey Brin) who were at that point probably 26 or 28 years old. How did you convince them that this (inviting other executives to Google, soliciting their management advice, and installing a more systematic approach to running the company) was a good idea?
I don’t know. When I came here, I came because I liked Larry and Sergey. It was an interesting little company. I had no idea it would be successful. It was not broken. But we needed leadership.
For example, we had an accounting system which was an Intuit based system designed for five users, and they were using it for 20 people. It was too slow to use, so I suggested that they implement an Oracle system. It was a huge crisis. We ended up spending $100,000 for this. Larry and Sergey nearly had a cow over it (because they thought it was so expensive). A hundred thousand dollars is the cheapest Oracle system ever implemented in history I think.
How did you convince them that you needed to do this?
Well, it was actually very interesting. Larry and Sergey suggested that we should build our own, because most of the existing accounting systems weren’t any good. And I said, “I’m sure that’s true, but you’ll never get it audited.” And I thought that was a pretty clever argument. The auditors would never pass financials (generated out of software) that we built ourselves. And Larry and Sergey today will complain about the Oracle system, but they’ll also say “We had to get one that was auditable.”
I’ve always said that if the company were founded today on an empty lot, we would build the buildings brick by brick. We can’t imagine someone else building our buildings, we’d have to build it ourselves. This is a build-it-yourself culture. The good news is there’s no free land, and so we have to rent the buildings, rather than build them. But the culture is around building things. In that sense, by the way, it’s similar to some of the companies (Intel, Dell, Sony) that I mentioned earlier.”
If you are conspiracy-minded, you might even wonder how much of the information coming out of Google is propaganda designed to lure competitors into an infinitely scalable build-it-yourself morass that looks great at warehouse-scale but actually costs more at small scale.
Wes,
How would that fit with “don’t be evil. Bwa-ha-ha-ha?”
I think the opportunity is for companies to offer products that enable the cluster computing and storage model at much smaller scale. Nearline storage clusters could be huge for backup and FIFO buffers for data on its way to archive. Is that what IBM Almaden is doing with the spin-out?
Robin
Maybe I’m just cynical. I’d like to see systems that scale up and down, but fundamentally, scalable systems require extra engineering compared to non-scalable systems and that cost will usually be passed on to the customer. Just two recent examples: EC2 doesn’t scale down because it can’t run off-the-shelf software stacks (see http://news.ycombinator.com/item?id=37720) and Isilon doesn’t scale down because it appears to cost 2X as much per TB as Thumper. I imagine that Seval will also be charging a premium for their value-add, whatever it is. Also, non-scalable systems are getting bigger (e.g. 1Us now have 8 cores and 16GB RAM, Thumper is 24TB and probably soon 48TB) and thus pushing scalable systems up-market.
If there is hope, maybe it is in BigCo-subsidized open source like OCFS2, Lustre, and Solaris pNFS.