@jasobrown
Last active August 29, 2015 14:07
gossip topics with João

Topics and questions for discussion with João and Jordan

topics

  • peer sampling service vs. gossip service
    • how does this map to cassandra? riak?
  • where does a failure detection system fall within a gossip system?

random questions

  • when João is thinking about 10k node systems, what does he imagine the purpose of that cluster to be, and why use gossip? a P2P network, like BitTorrent?
  • I've noticed gossip papers tend to paper over the state of a peer: it's either UP or DOWN. There are several more states we have to deal with as practitioners, for example when removing a node from a cluster and deciding when it can be safely expunged by all other peers.
    • in c*, we leave the state of a decom'ed node as LEFT for three days, with an explicit wall clock timeout that all peers should obey. Then we quarantine the decom'ed node for ~30 seconds, after which all peers should completely forget about it. The quarantine is very short; it is there so that a node that restarts within the quarantine time does not remember the gone node after the timeout.
  • PeerSim: how useful is it? an initial look at the code (which seems rarely updated) suggests it is a little rusty.
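
To make the decommission lifecycle in the c* bullet concrete for the discussion, here is a minimal sketch of the timeline as described in these notes. It is not Cassandra's actual code: the state names QUARANTINED and FORGOTTEN, the function name, and the exact 30 second value (the notes say "~30 seconds") are assumptions for illustration only.

```python
from enum import Enum

# Durations taken from the notes above; the exact quarantine value is an assumption.
LEFT_EXPIRY_SECONDS = 3 * 24 * 60 * 60  # "three days"
QUARANTINE_SECONDS = 30                 # "~30 seconds"


class PeerState(Enum):
    LEFT = "LEFT"                # decommissioned, still tracked by all peers
    QUARANTINED = "QUARANTINED"  # hypothetical name for the short quarantine phase
    FORGOTTEN = "FORGOTTEN"      # hypothetical name: peers drop all state for the node


def state_of_departed_peer(left_at: float, now: float) -> PeerState:
    """Return how a peer should treat a node decommissioned at `left_at`.

    Both arguments are wall clock timestamps in seconds, since the notes
    describe an explicit wall clock timeout that all peers obey.
    """
    elapsed = now - left_at
    if elapsed < LEFT_EXPIRY_SECONDS:
        return PeerState.LEFT
    if elapsed < LEFT_EXPIRY_SECONDS + QUARANTINE_SECONDS:
        return PeerState.QUARANTINED
    return PeerState.FORGOTTEN
```

Because every peer evaluates the same wall clock deadline, they should all move the departed node through these phases at roughly the same time, which is the property the UP/DOWN model in the papers does not cover.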

João's papers:

  • HyParView
  • Plumtree
  • Thicket