Web Search for a Planet: The Google Cluster Architecture - Google, 2003

This is an article by Luiz André Barroso, Jeff Dean, and Urs Hölzle on the search architecture at Google.

  • The most important factor in their design is the price/performance ratio
    • Energy efficiency is part of the price
  • Two main insights:
    • Provide reliability through software instead of expensive hardware (sketched below)
      • I note that even expensive hardware will fail sometimes anyway, if only due to software or human error
    • Optimize for aggregate throughput, not peak response time
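A minimal sketch of the reliability-in-software idea, assuming a hypothetical replica table and a query_replica() RPC stub (both invented here): the frontend simply retries a shard's request on another identical copy when a machine fails, so cheap, failure-prone hardware is acceptable.

```python
import random

# Hypothetical replica pool: each index shard is served by several identical
# commodity machines, any one of which can answer a request for that shard.
REPLICAS = {
    "shard-0": ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    "shard-1": ["10.0.1.1", "10.0.1.2", "10.0.1.3"],
}

class ReplicaError(Exception):
    """One machine failed to answer (crash, timeout, bad disk, ...)."""

def query_replica(address, query):
    # Stand-in for an RPC to a single index-serving machine.
    raise NotImplementedError

def query_shard(shard, query):
    """Fault tolerance in software: try replicas until one answers.

    Load balancing and failover share the same code path; a dead or slow
    machine is simply skipped, because another copy holds the same data.
    """
    for address in random.sample(REPLICAS[shard], len(REPLICAS[shard])):
        try:
            return query_replica(address, query)
        except ReplicaError:
            continue
    raise RuntimeError(f"all replicas of {shard} are unavailable")
```

Replication also serves the throughput goal: adding copies of a shard increases how many queries per second that shard can absorb.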
  • By using commodity PCs, where computation is cheap, they can afford a more computationally intensive ranking algorithm and a larger index
  • Several generations of servers are in use at the same time
  • They use the metric of cost per query to compare servers (rough worked example below)
    • This includes depreciation, hosting, administration, repairs, power, and cooling
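To make the cost-per-query metric concrete, here is a back-of-the-envelope sketch; every dollar figure, lifetime, and query rate below is an invented assumption, not a number from the article.

```python
def cost_per_query(server_price, lifetime_years, monthly_opex, queries_per_sec):
    """Amortized total cost of ownership divided by the queries served.

    server_price    -- purchase price, depreciated straight-line over the lifetime
    monthly_opex    -- hosting + administration + repairs + power + cooling
    queries_per_sec -- sustained throughput of the machine
    """
    monthly_capex = server_price / (lifetime_years * 12)
    monthly_queries = queries_per_sec * 3600 * 24 * 30
    return (monthly_capex + monthly_opex) / monthly_queries

# Invented comparison: a cheap commodity box vs. a high-end server.
commodity = cost_per_query(server_price=2_000, lifetime_years=3,
                           monthly_opex=80, queries_per_sec=50)
high_end = cost_per_query(server_price=20_000, lifetime_years=3,
                          monthly_opex=200, queries_per_sec=200)
print(f"commodity: ${commodity:.2e}/query   high-end: ${high_end:.2e}/query")
```

With these made-up numbers the commodity box wins despite being slower per machine, which is the shape of the argument the paper makes.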
  • The authors posit that the cost of organizing 1000 servers is not much higher than that of 100; past a certain scale, it's pretty cheap to deal with more servers
  • While they could pack far more processing power into a rack, conventional data centers can't supply the electrical power and cooling that such density requires
    • They measure watts per unit of performance, not just watts (quick comparison below)
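The watts-per-unit-of-performance point, again with invented figures: raw power draw alone would favor the smaller box, but normalizing by throughput can flip the ranking.

```python
def queries_per_watt(queries_per_sec, watts):
    # Performance per watt; equivalently, queries served per joule of energy.
    return queries_per_sec / watts

# Hypothetical rack options (all numbers invented for illustration).
low_power = queries_per_watt(queries_per_sec=30, watts=90)    # ~0.33 queries/J
dense = queries_per_watt(queries_per_sec=120, watts=250)      # ~0.48 queries/J

# The dense box draws ~2.8x the power but does 4x the work, so it is the better
# buy once power delivery and cooling are provisioned per watt of capacity.
print(f"low-power: {low_power:.2f} q/W   dense: {dense:.2f} q/W")
```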
  • They observed a high CPI (cycles per instruction), even on the Pentium 4 with its deep pipelines and advanced branch predictors
    • The authors attribute this to their application having little instruction-level parallelism to exploit
    • It's better to pack many smaller cores onto the same die (sketched below)
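A sketch of why many modest cores suit this workload, assuming a hypothetical search_shard() lookup: the parallelism is at the request level, so throughput comes from running many independent queries side by side rather than from squeezing instruction-level parallelism out of any single one. (Python threads only illustrate the structure; in the real servers the independent lookups occupy separate hardware threads and cores.)

```python
from concurrent.futures import ThreadPoolExecutor

def search_shard(query):
    # Placeholder for one independent index lookup; calls share nothing with
    # each other, so they can be spread across as many cores as are available.
    return f"results for {query!r}"

def handle_queries(queries, workers=16):
    """Throughput through request-level parallelism.

    Every query is a cheap, independent task, which is why many slower cores
    beat one very fast, deeply pipelined core for this kind of workload.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(search_shard, queries))

if __name__ == "__main__":
    print(handle_queries(["commodity hardware", "aggregate throughput"]))
```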
  • Their application does not benefit much from temporal locality (they are scanning through a huge index)
    • It does benefit from spatial locality
    • They suggest that larger cache lines could help them significantly (access-pattern sketch below)
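A sketch of the access pattern behind that locality claim, using Python's array module to stand in for packed posting lists (the actual cache behavior is hardware-level, so this only illustrates the layout): intersecting two sorted, contiguously stored doc-ID lists reads memory strictly sequentially, touching each element exactly once.

```python
from array import array

def intersect_postings(a, b):
    """Merge-intersect two sorted posting lists of doc IDs.

    Both lists are scanned front to back exactly once: little temporal reuse
    (each ID is read a single time), but strong spatial locality, since
    neighbouring IDs sit in the same cache line. A wider cache line would pull
    in more useful postings per memory access.
    """
    out = array("I")
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

# Toy posting lists for two query terms: sorted doc IDs packed as 32-bit ints.
docs_term1 = array("I", [2, 5, 9, 14, 21, 30])
docs_term2 = array("I", [5, 9, 10, 21, 40])
print(list(intersect_postings(docs_term1, docs_term2)))  # [5, 9, 21]
```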