I survived Seattle’s “summer” weather
And the Google-sponsored Seattle Conference on Scalability. It was like spending 10 hours trying to drink from a fire hose. Great stuff.
I took notes on four of the sessions I attended. I would have taken more, but since Apple hasn’t shipped a notebook with a ten-hour battery life I had to stop to recharge. It’s been so long since I wrote anything by hand that I can’t even read my own handwriting any more.
This is a highly idiosyncratic account of the conference: I’m just talking about what I found interesting. Fortunately Google videoed the event and will put it up on YouTube. When I get the URL I’ll update this post.
Jeff Dean, senior architect at Google
Jeff is the architect of virtually every large-scale system at Google. He kicked off the event with a keynote on scalability at Google. As I suspected, Google is looking for new ideas on scaling another 100x over the next few years. That would mean clusters of 500,000 to over 800,000 nodes – or at least cores.
Jeff noted that BigTable, Google’s storage system that runs on top of GFS, has about 500 cells, the largest of which holds up to 3,000 terabytes of data.
The benefits of massive scale
Jeff talked about the impact of scale on machine translation, a major effort inside Google. The goal is to enable someone to ask a question in Urdu and get access to relevant documents no matter what language they are written in, by machine-translating the query into many languages and then translating the results back into Urdu.
The translation model is probabilistic rather than dictionary-based, so the more examples the system has to work with, the better the translation. The MT team has found that translation accuracy increases by 0.5% with each doubling of the training content. That means a *lot* of storage.
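If accuracy really improves about 0.5% per doubling of training data, the payoff from growing the corpus is easy to estimate. A back-of-the-envelope sketch (the 0.5%-per-doubling figure is from the talk; the function and its numbers are just my arithmetic):

```python
import math

def accuracy_gain(old_size, new_size, gain_per_doubling=0.5):
    """Estimated accuracy improvement (in percentage points) from
    growing the training corpus from old_size to new_size, assuming
    a fixed gain per doubling of the data."""
    doublings = math.log2(new_size / old_size)
    return gain_per_doubling * doublings

# Growing the corpus 1024x is exactly 10 doublings,
# so about 5 percentage points of accuracy.
print(accuracy_gain(1, 1024))  # 5.0
```

Ten doublings for five points of accuracy shows why the storage (and I/O) appetite is so enormous: the returns are logarithmic in the data size.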
And a lot of I/O: over a million lookups per second. Much of that is cached, but it is still a lot of data.
Today’s Google rack
Jeff showed a picture of the current Google datacenter rack, which appeared to consist of 20 motherboards, each with two dual-core Intel processors, for a total of 80 cores per rack. There is a 4U gap in the middle of the rack, which I assume holds the DC power distribution unit. It looked very neat and tidy, unlike the pictures of Google’s early racks.
I’ve meant to write about MapReduce, but I couldn’t quite get a handle on it. Jeff spent a fair amount of time describing the advantages of MapReduce, so now I have that handle.
MapReduce is essentially a programming model that abstracts away the messy details of programming a large cluster. The Map piece extracts the data one wants to work on into essentially a big spreadsheet or table, while the Reduce piece massages that data into the final form. With this tool a program of 50 lines can put thousands of compute nodes to work.
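The shape of the model fits in a few lines. This is a single-machine sketch, not Google’s implementation: the user supplies only the two small functions at the bottom, and the “framework” handles collecting, grouping, and applying them (at Google, across thousands of machines instead of one):

```python
from collections import defaultdict

def map_phase(records, map_fn):
    """Run the user's map function over every input record,
    collecting all emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(map_fn(record))
    return pairs

def shuffle(pairs):
    """Group intermediate values by key -- the framework's job,
    and the part that hides the cluster's messy details."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reduce_fn):
    """Run the user's reduce function once per key group."""
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# The classic word-count example: the user writes only these two functions.
def wc_map(line):
    return [(word, 1) for word in line.split()]

def wc_reduce(word, counts):
    return sum(counts)

lines = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(lines, wc_map)), wc_reduce)
print(counts["the"], counts["fox"])  # 3 2
```

The point of the 50-line claim is exactly this division of labor: the application code is the two tiny functions; everything about distribution, grouping, and fault tolerance lives in the framework.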
Google’s scalability challenges
Google is pretty happy with their tools, but it is American to want something better. And what they’d like is a single global namespace so that data can be accessed from anywhere. So the scalability number I offered at the beginning of this post may be way low. Instead of scaling a single cluster 100x, Google would actually like to scale and interconnect their entire cluster population – which I estimate is now over 4 million cores – 100x.
The StorageMojo take
Wow! More tomorrow as I continue the report on the conference.
Comments welcome, as always.
Global namespace. I have wanted to work on this since I first heard about it in 1997. I’m sure it has been around much longer than that. I was with Exxon at the time, working on a joint internal backup/archive project with SGI. The SGI OpenVault standard was being developed then, and what we at Exxon were doing looked like a good candidate for OpenVault. That didn’t happen, but the global namespace did come up. The Exxon guys didn’t think it was important; the SGI guys thought it was very important. I agreed with the SGI guys.
Create an Information entity/object once and it lives forever in the global namespace. It may be online, nearline, offline, or even in the distant archives, but its name lives on. The SOA system for this global namespace has to search for the Information and deliver the result to the requestor. Hopefully this will be the Information itself, or a Context cluster for “fuzzy” requests. A human may remove the name of no-longer-existing Information from the global namespace, but any request for that Information returns the removal Information. Immortality!
To do this the SOA system must scale resources as determined by the SLA for the Information requested. If the Information can no longer be found anywhere, that fact is noted for future reference and returned to the requestor. It takes quite a system to do this.
An obvious issue is searching the global namespace every time, for every request. Then there are security issues: do you have the authority to access that Information?
I made the post here about “pace layering” as a new way to look at Information.
To me, it appears that Google has one pace and one layer in their search engine.
I wonder if this is true?
Amazon appears to have a multi-layered approach. SOA lends great strength to the multi-layered approach through the aggregation and then the integration of those resources for task completion.
Think “stateless”. I have loved “stateless” since I first heard about it. Everything on the fly. Shoot from the hip. Space age stuff.
We really should give credit to the wonderful engineers who have designed and built Storage devices in all media over the years. This Unit of Technology precision is what gives us the freedom to go “stateless” and “shoot from the hip” with accuracy. I can recall spending more time on the “error correction” system in a Storage device to make sure it was written correctly and “in the place it said it was written” than retrieving and using that Information.
You mentioned you were going to meet with Isilon reps.
Did you, and how did that go?