Raft: In Search of an Understandable Consensus Algorithm - Stanford, 2014

  • [Paper] as it appears in ATC’14 [Mirror]
  • [Extended Paper] with notes on implementation
  • [ATC’14] [Video]
  • [Notes] by MIT’s Distributed Systems Reading Group with some interesting thoughts
  • Here is a great visualization / tutorial on the basics of Raft
  • The goal of Raft is to be easy to understand (or at least easier than Paxos)
    • I did find it to be pretty easy to understand
  • They note that replicated state machines are implemented with a replicated log
  • Raft is based around a replicated log with leader election
  • Leader election is controlled by the epoch (they call it a term) and by how up to date each log is (the term of the last entry, then the log length).
  • Each server votes for no more than one candidate per epoch (term)
  • Split votes are mitigated using a randomized timeout to start an election
    • i.e. the duration a member waits before assuming the leader is unavailable is randomly chosen
    • if a split vote does occur, then the participants wait for their individual (re-randomized) timeouts
    • there is no increasing back-off as in TCP
  • Notes from MIT’s DSRG:
    • One of the members wrote an implementation and said it “just worked”
    • Once you add all the features you need for a production system, they feel it is approximately the same complexity as Paxos
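
The election rules noted above (one vote per term, a log comparison, and a re-randomized timeout with no back-off) can be sketched roughly as follows. This is a minimal illustration, not the paper's reference code: the Server class, its field names, and handle_vote_request are hypothetical names chosen here, the log is reduced to a list of entry terms, and the 150-300 ms window is only the example range the paper mentions.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed timeout window in milliseconds (the paper's example range).
ELECTION_TIMEOUT_MS = (150, 300)


def random_election_timeout(rng: random.Random) -> float:
    """Draw a fresh timeout from the same fixed window every time.

    The window never grows, so there is no increasing back-off.
    """
    low, high = ELECTION_TIMEOUT_MS
    return rng.uniform(low, high)


@dataclass
class Server:
    """Hypothetical, stripped-down view of one server's election state."""
    current_term: int = 0
    voted_for: Optional[str] = None
    log: List[int] = field(default_factory=list)  # term of each log entry

    def last_log(self) -> Tuple[int, int]:
        """Return (term of last entry, log length); (0, 0) if the log is empty."""
        return (self.log[-1] if self.log else 0, len(self.log))

    def handle_vote_request(self, candidate_id: str, term: int,
                            last_log_term: int, last_log_len: int) -> bool:
        # Reject candidates from an older term.
        if term < self.current_term:
            return False
        # A newer term resets our vote for that term.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        # At most one vote per term (repeating a vote for the same candidate is fine).
        if self.voted_for not in (None, candidate_id):
            return False
        # The candidate's log must be at least as up to date as ours:
        # compare the last entry's term first, then the log length.
        if (last_log_term, last_log_len) < self.last_log():
            return False
        self.voted_for = candidate_id
        return True
```

For example, a server in term 2 with log terms [1, 2] would grant its term-3 vote to the first candidate whose log ends in term 2 with at least two entries, and refuse any other candidate for the rest of term 3.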