PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs - CMU, 2012

  • [Paper] [Mirror]
  • [OSDI’12] [Video]

  • Partitioning a natural graph is hard.
    • Random partitioning of vertices leads to most edges crossing machine boundaries.
    • This means that there is a ton of communication between machines.
  • GAS Decomposition - They recognized most graph algos had a similar steps.
    • Gather: Accumulate information about the neighborhood
    • Apply: Apply the accumulated value to center vertex
    • Scatter: Update adjacent edges and vertices
  • Replicate the vertex across the nodes such that no edge crosses a machine boundary.
    • Now we can run the vertex program in parallel, only needing to synchronize at the apply step.