Urs Hoelzle – A look behind the scenes at Google – liveblog

8:10AM – Google’s mission is much bigger than search. It is to make the world’s information available in a usable and accessible way. That’s a huge mission.

Urs is going to talk about how Google scales its infrastructure.

Google uses commodity PCs, because you get more for your money. However, you need to figure out how to make them work together.

Three pictures of how the data centers have changed: 1998 (just random desktops), 1999 (bare motherboards in a rack, pretty randomly wired), and 2000 (blades in racks); now the machines are 2U high.

They want to make each computer like an iMac, with only 2 cables: power and ethernet.

8:30AM – Google has so many computers that the failure math becomes simple: each machine lasts about 3 years on average, so with 1000 computers roughly one will fail every day (1000 machines / ~1,095 days ≈ 1 failure per day).

They handle those expected failures in an automated way. I'm a big believer in automation. It's better to make computers work for you, rather than the other way around. Google solves this in software, with replication and redundancy. That approach also helps with serving capacity and performance.

Fault tolerant software makes cheap hardware practical.

He goes over how Google builds the search index and ranks results with PageRank. The index is split into pieces called shards, which are distributed among servers, and each shard is replicated on several of them. I think most of this is documented elsewhere, but I could be wrong.
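
Just to make the sharding idea concrete for myself, here's a toy sketch (my own code, not Google's; names like Shard and search_all_shards are made up): a query fans out to every shard and the partial results get merged.

```cpp
// Toy sketch of a sharded inverted index -- my own illustration, not Google's code.
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

struct Shard {
    // term -> ids of documents containing that term (a tiny inverted index)
    std::map<std::string, std::set<int>> postings;

    void add_document(int doc_id, const std::vector<std::string>& terms) {
        for (const auto& t : terms) postings[t].insert(doc_id);
    }
    std::set<int> lookup(const std::string& term) const {
        auto it = postings.find(term);
        return it == postings.end() ? std::set<int>{} : it->second;
    }
};

// Each document lands on exactly one shard; in reality each shard would also be
// replicated on several machines for fault tolerance and load balancing.
std::set<int> search_all_shards(const std::vector<Shard>& shards,
                                const std::string& term) {
    std::set<int> results;
    for (const auto& s : shards) {                          // fan the query out to every shard
        for (int id : s.lookup(term)) results.insert(id);   // merge the partial results
    }
    return results;
}

int main() {
    std::vector<Shard> shards(3);
    std::vector<std::pair<int, std::vector<std::string>>> docs = {
        {1, {"solar", "eclipse"}}, {2, {"mitsubishi", "eclipse"}}, {3, {"britney"}}};
    for (const auto& [id, terms] : docs)
        shards[id % shards.size()].add_document(id, terms);  // partition documents by id

    for (int id : search_all_shards(shards, "eclipse")) std::cout << id << " ";
    std::cout << "\n";
}
```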

Google started off doing it from scratch and then realized it could reuse some core components among its different types of webapps.

One of those is the Google File System (GFS). There's a Master that manages metadata, which points to (typically) 64MB chunks, and each chunk is triplicated (is that a word?) across three Chunk Servers for reliability. Clients talk to the Chunk Servers directly (after consulting the Master) to transfer data. Hmm… I bet the Masters are triplicated too. Note that the triplication also helps with throughput: if many people are accessing the same data (say, a popular blog), you can load-balance them among the different Chunk Servers. There's more info in a paper from SOSP 2003.
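
Here's how I picture that read path, as a toy sketch (my own names and simplifications, definitely not the real GFS API): the client asks the Master which Chunk Servers hold a given chunk, then goes to one of those replicas for the actual bytes.

```cpp
// Toy sketch of a GFS-style read path -- my own simplification, not the real API.
// The Master only hands out metadata (chunk locations); data flows straight
// from a Chunk Server to the client.
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <vector>

constexpr int64_t kChunkSize = 64 * 1024 * 1024;  // 64MB chunks, per the talk

struct Master {
    // (file, chunk index) -> the three Chunk Servers holding that chunk
    std::map<std::pair<std::string, int64_t>, std::vector<std::string>> locations;

    std::vector<std::string> lookup(const std::string& file, int64_t chunk_index) const {
        auto it = locations.find({file, chunk_index});
        return it == locations.end() ? std::vector<std::string>{} : it->second;
    }
};

// Hypothetical client-side read: consult the Master for metadata, then pick a
// replica (here: just the first one) and read the data directly from it.
std::string read(const Master& master, const std::string& file, int64_t offset) {
    int64_t chunk_index = offset / kChunkSize;
    std::vector<std::string> replicas = master.lookup(file, chunk_index);
    if (replicas.empty()) return "<no such chunk>";
    // A real client would now open a connection to that Chunk Server;
    // here we just report which replica it would talk to.
    return "would read chunk " + std::to_string(chunk_index) +
           " of " + file + " from " + replicas.front();
}

int main() {
    Master master;
    master.locations[{"/logs/access.log", 0}] = {"chunkserver-a", "chunkserver-b", "chunkserver-c"};
    std::cout << read(master, "/logs/access.log", 12345) << "\n";
}
```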

At this point, I'm wondering whether Virginia Tech does things the same way. They're not using the cheapest hardware; they're using Apple Xserve G5s, so I would expect those to be more reliable. That must affect their software strategy, but by how much?

8:45AM – MapReduce is a parallel programming framework, inspired by functional programming, that they use at Google: it takes in key/value pairs, maps them to intermediate key/value pairs, and then reduces the values for each intermediate key. It works for a certain class of problems. He showed a C++ example. That's not too bad since we've got the CDT and many Java programmers have a C++ background, but it seems like that might get booed at JavaOne.
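
I can't reproduce the C++ example he showed, but the shape of the programming model is roughly this (a minimal single-machine sketch with a made-up interface, not Google's actual API): you write a map function that emits key/value pairs and a reduce function that combines all the values for a key.

```cpp
// Minimal single-machine sketch of the MapReduce programming model.
// The interface (emit via a vector, group, reduce) is invented for illustration;
// the real framework distributes this across thousands of machines.
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using KV = std::pair<std::string, int>;

// Map: one input record (a line of text) -> intermediate key/value pairs.
std::vector<KV> map_fn(const std::string& line) {
    std::vector<KV> out;
    std::istringstream words(line);
    std::string w;
    while (words >> w) out.push_back({w, 1});  // emit <word, 1> for each word
    return out;
}

// Reduce: one intermediate key plus all of its values -> a final value.
int reduce_fn(const std::string& /*word*/, const std::vector<int>& counts) {
    int total = 0;
    for (int c : counts) total += c;
    return total;
}

int main() {
    std::vector<std::string> input = {"the quick brown fox", "the lazy dog"};

    // "Shuffle" phase: group every emitted value by its key.
    std::map<std::string, std::vector<int>> grouped;
    for (const auto& line : input)
        for (const auto& [word, count] : map_fn(line)) grouped[word].push_back(count);

    // Reduce phase: collapse each key's values into one result.
    for (const auto& [word, counts] : grouped)
        std::cout << word << ": " << reduce_fn(word, counts) << "\n";
}
```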

Scheduling happens with one Master and many Workers. The Master assigns either a Map task or a Reduce task to each Worker.
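
Roughly how I imagine that scheduling loop (again my own toy sketch, not the real scheduler): hand out all the Map tasks first, then hand out the Reduce tasks, re-queuing anything whose Worker dies.

```cpp
// Toy sketch of the Master's scheduling loop -- my own simplification.
// Map tasks are handed out first; only when all of them finish do the
// Reduce tasks start (the reducers need every mapper's output).
#include <deque>
#include <iostream>
#include <string>
#include <vector>

void run_phase(const std::string& phase, std::deque<int> tasks,
               const std::vector<std::string>& workers) {
    while (!tasks.empty()) {
        for (const auto& worker : workers) {   // the Master assigns each idle Worker a task
            if (tasks.empty()) break;
            int task = tasks.front();
            tasks.pop_front();
            std::cout << worker << " runs " << phase << " task " << task << "\n";
            // If this Worker died mid-task, the real Master would push the
            // task back onto the queue and give it to someone else.
        }
    }
}

int main() {
    std::vector<std::string> workers = {"worker-1", "worker-2"};
    run_phase("map", {0, 1, 2, 3, 4}, workers);   // all Map tasks first...
    run_phase("reduce", {0, 1, 2}, workers);      // ...then the Reduce tasks
}
```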

Urs compares MapReduce to Eclipse: if you create a generally useful tool, you will find that many people will adopt it.

You can read more about MapReduce at Google Labs.

9:00AM – Urs pulls out some amusing slides, like the search volume for “eclipse”, with spikes every time there is a solar (or maybe lunar) eclipse. There's also a slide with the many ways people try to type “Britney Spears” while searching for her. (BTW, Google is a nice ad hoc spell checker/dictionary.)

Google trains computers to correlate words by throwing many, many Workers and many CPU-years at a massive amount of data to produce clusters. He showed a demo in which “eclipse” is about 80% correlated with the car produced by Mitsubishi. (I wrote about this problem early on in this blog… “The Problem with Common Naming”.) It makes more sense now why Google Mail can be accessed via POP without having ads inserted: the value to Google is not just the advertising, but also the additional data that arrives already in context.
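
He didn't say how the word clustering actually works, but one way to picture the basic idea (entirely my own toy example, not Google's method) is to count how often two terms show up in the same query and turn that into a correlation-like score.

```cpp
// My own toy illustration of term correlation from co-occurrence counts;
// the talk didn't describe Google's actual clustering method.
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

int main() {
    // Pretend these are queries pulled from logs.
    std::vector<std::set<std::string>> queries = {
        {"eclipse", "mitsubishi"}, {"eclipse", "mitsubishi", "parts"},
        {"eclipse", "solar"}, {"eclipse", "ide"}, {"eclipse", "mitsubishi"}};

    std::map<std::string, int> term_count;
    std::map<std::pair<std::string, std::string>, int> pair_count;
    for (const auto& q : queries) {
        for (const auto& t : q) ++term_count[t];
        for (const auto& a : q)
            for (const auto& b : q)
                if (a < b) ++pair_count[{a, b}];  // count each unordered pair once
    }

    // Conditional co-occurrence: of the queries containing "eclipse",
    // what fraction also contain "mitsubishi"?
    double score = double(pair_count[{"eclipse", "mitsubishi"}]) / term_count["eclipse"];
    std::cout << "P(mitsubishi | eclipse) ~= " << score << "\n";  // 3/5 for this toy data
}
```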

Q: How do you handle changes to a running system?
A: You don’t. Take a replica that is not being used out of service and use it for testing. Then if it is good, then repeat with more and more replicas.

Q: How long does it take to index the web?
A: It's complicated: visiting each site every day would actually be bad because of the bandwidth usage (even though they could). It's nice to hear that they are being considerate of bandwidth. Brent Simmons is the same way and designed NetNewsWire accordingly.
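
The bandwidth point is basically crawl politeness. A trivial sketch of the idea (mine, not how Google's crawler actually works): remember when each host was last fetched and skip it until a minimum interval has passed.

```cpp
// Tiny sketch of per-host crawl politeness -- my own illustration:
// don't refetch a host until enough time has passed since the last visit.
#include <chrono>
#include <iostream>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

struct PolitenessTracker {
    std::chrono::seconds min_interval;
    std::map<std::string, Clock::time_point> last_fetch;

    bool may_fetch(const std::string& host) {
        auto now = Clock::now();
        auto it = last_fetch.find(host);
        if (it != last_fetch.end() && now - it->second < min_interval) return false;
        last_fetch[host] = now;  // record the visit we are about to make
        return true;
    }
};

int main() {
    PolitenessTracker tracker{std::chrono::seconds(30), {}};
    std::cout << tracker.may_fetch("example.com") << "\n";  // 1: first visit is fine
    std::cout << tracker.may_fetch("example.com") << "\n";  // 0: too soon to come back
}
```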