@jki127
Last active November 6, 2018 22:15

Failure & Serial Equivalence - DistSys Lecture - Oct 18th, 2018

Heartbeat works with two nodes, but what if there are more nodes?

Detecting Failures in a Distributed FS

Strategies for Detecting Failures

All-to-all

Every node checks every other node to see whether it is alive

  • Problem: when every node does this, each round of checks costs on the order of n^2 messages

Centralized

One node keeps track of every other node’s status
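A minimal sketch of the centralized approach, assuming nodes report in with periodic heartbeats. The `HeartbeatMonitor` class, the node names, and the 3-second timeout are hypothetical and do not come from the lecture; timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Central node that remembers when it last heard from each node."""

    def __init__(self, node_ids, timeout):
        self.timeout = timeout  # seconds of silence before a node is suspected
        self.last_seen = {node: 0.0 for node in node_ids}

    def record_heartbeat(self, node, now):
        # Called whenever a heartbeat from `node` arrives at time `now`.
        self.last_seen[node] = now

    def suspected(self, now):
        # A node is suspected once it has been silent for longer than the timeout.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


monitor = HeartbeatMonitor(["n1", "n2", "n3"], timeout=3.0)
monitor.record_heartbeat("n1", now=10.0)
monitor.record_heartbeat("n2", now=10.0)
monitor.record_heartbeat("n3", now=6.0)
print(monitor.suspected(now=12.0))  # ['n3']
```

A silent node is only suspected, not known to be dead: a slow network looks the same as a crash from the monitor's point of view.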

Ring

Each node checks the node adjacent to it
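To compare the three strategies, the sketch below lists who checks whom in one round and counts the checks. The function names and the choice of the first node as the central monitor are assumptions made for illustration.

```python
def all_to_all(nodes):
    # Every node checks every other node: n * (n - 1) checks per round.
    return [(a, b) for a in nodes for b in nodes if a != b]


def centralized(nodes):
    # One monitor (here, arbitrarily, the first node) checks everyone else: n - 1 checks.
    monitor = nodes[0]
    return [(monitor, b) for b in nodes[1:]]


def ring(nodes):
    # Each node checks only its successor in the ring: n checks.
    n = len(nodes)
    return [(nodes[i], nodes[(i + 1) % n]) for i in range(n)]


nodes = [f"n{i}" for i in range(5)]
for strategy in (all_to_all, centralized, ring):
    print(strategy.__name__, len(strategy(nodes)))
# all_to_all 20
# centralized 4
# ring 5
```

The cheaper strategies trade message count for other weaknesses: in the centralized layout, for instance, nobody is watching the monitor itself.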

Serial Equivalence

Bank account code example

program1

a.withdraw(100)
b.deposit(100)

program2

total = a.getBalance()
total += b.getBalance()
total += c.getBalance()

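If these programs run concurrently, there are many interleavings in which program2 computes the wrong total. For example, if it reads a after the withdrawal but reads b before the deposit, the total comes out 100 too low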
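One such interleaving can be replayed by hand. This is a sketch only: the `Account` class, the `run` helper, and the starting balances (100, 200, 300) are assumptions, and the "concurrency" is simulated by listing the steps in a fixed order rather than by using threads.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount

    def deposit(self, amount):
        self.balance += amount

    def get_balance(self):
        return self.balance


def run(schedule):
    """Apply the steps in the given order and return program2's total."""
    # Starting balances are made up for this example.
    accounts = {"a": Account(100), "b": Account(200), "c": Account(300)}
    total = 0
    for step, name in schedule:
        if step == "withdraw":      # program1
            accounts[name].withdraw(100)
        elif step == "deposit":     # program1
            accounts[name].deposit(100)
        elif step == "read":        # program2
            total += accounts[name].get_balance()
    return total


# program1 runs to completion, then program2.
serial = [("withdraw", "a"), ("deposit", "b"),
          ("read", "a"), ("read", "b"), ("read", "c")]
# program2 reads a and b in between program1's two steps.
interleaved = [("withdraw", "a"), ("read", "a"), ("read", "b"),
               ("deposit", "b"), ("read", "c")]

print(run(serial))       # 600
print(run(interleaved))  # 500
```

The money is never actually missing. The interleaved run simply observes it while it is "in flight" between the two accounts.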

ACID

  • Atomicity
  • Correct (usually called Consistency in databases)
  • Isolation
  • Durability

https://cse.buffalo.edu/~stevko/courses/cse486/spring13/lectures/21-concurrency1.pdf

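  • when we’re drawing lines between conflicting operations like above, we don’t want the lines to cross: crossed lines mean the transactions are ordered one way on one object and the other way on another, so the interleaving is not equivalent to any serial order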
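The "lines must not cross" rule can be checked mechanically. The sketch below is an illustration for the two-transaction case only, with assumed names (`conflict_orders`, `serially_equivalent`) and an assumed schedule format of `(transaction, operation, object)` tuples. Two operations conflict when they come from different transactions, touch the same object, and at least one is a write.

```python
from itertools import combinations


def conflict_orders(schedule):
    """Return the (earlier_txn, later_txn) order of every conflicting pair."""
    orders = set()
    # combinations() keeps schedule order, so the first item always ran first.
    for (t1, op1, obj1), (t2, op2, obj2) in combinations(schedule, 2):
        if t1 != t2 and obj1 == obj2 and "write" in (op1, op2):
            orders.add((t1, t2))
    return orders


def serially_equivalent(schedule):
    # With two transactions, the schedule is serially equivalent when every
    # conflicting pair is ordered the same way (no "crossing lines").
    orders = conflict_orders(schedule)
    return not any((b, a) in orders for (a, b) in orders)


# T1 is the transfer (writes a, then b); T2 is the total (reads a, b, c).
ok = [("T1", "write", "a"), ("T2", "read", "a"),
      ("T1", "write", "b"), ("T2", "read", "b"), ("T2", "read", "c")]
crossed = [("T1", "write", "a"), ("T2", "read", "a"),
           ("T2", "read", "b"), ("T1", "write", "b"), ("T2", "read", "c")]

print(serially_equivalent(ok))       # True
print(serially_equivalent(crossed))  # False
```

With more than two transactions this pairwise test is not enough, since conflicts can form longer cycles. A general check would build a precedence graph and look for cycles.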

What if you wanted to run parts of transactions on different computers?

begin Transaction
a.withdraw(

commit

2-phase commit (2pc)

  • Send a CanCommit? request to all servers and wait for their votes (ACKs)
  • If every server votes yes, commit the transaction; otherwise abort it

Coordinator (client) talking to Cohort (servers)

-> CanCommit?
<- Vote (y/n?)
-> Commit or Abort

Terminology

  • 2pc is safe: a commit happens only if every server voted yes, so either everyone commits or everyone aborts
  • 2pc is not live: you can end up waiting forever for a response, for example if the coordinator fails after the servers have voted
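The CanCommit? / Vote / Commit-or-Abort exchange can be sketched in a few lines. This is a simplified, single-process illustration: the `Cohort` class and `two_phase_commit` function are hypothetical names, messages are plain method calls, and there is no network, logging, or crash handling.

```python
class Cohort:
    """A server taking part in the transaction."""

    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit  # stand-in for "can I apply my part?"
        self.state = "init"

    def can_commit(self):
        # Phase 1: vote. A yes vote is a promise to commit if told to.
        self.state = "prepared" if self.will_commit else "aborted"
        return self.will_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(cohorts):
    # Phase 1: the coordinator collects a vote from every cohort.
    votes = [cohort.can_commit() for cohort in cohorts]
    decision = all(votes)
    # Phase 2: the coordinator sends the same decision to every cohort.
    for cohort in cohorts:
        if decision:
            cohort.commit()
        else:
            cohort.abort()
    return "commit" if decision else "abort"


happy = [Cohort("s1"), Cohort("s2"), Cohort("s3")]
print(two_phase_commit(happy), [c.state for c in happy])
# commit ['committed', 'committed', 'committed']

unhappy = [Cohort("s1"), Cohort("s2", will_commit=False), Cohort("s3")]
print(two_phase_commit(unhappy), [c.state for c in unhappy])
# abort ['aborted', 'aborted', 'aborted']
```

The sketch also hints at the liveness problem: a cohort in the "prepared" state has promised to commit and cannot decide on its own, so if the coordinator disappears between the two phases it has nothing to do but wait.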

3-phase commit (3pc)

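  • not safe: with network partitions or delayed messages, timeouts can lead nodes to different decisions
  • live: nodes time out and make progress instead of blocking forever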