The policy is `keep-majority`. New instances are started one by one, and on redeploy the newest instances are downed first (`YOUNGEST_FIRST`).
- 3 nodes up: `N1`, `N2`, `N3`
- a redeploy is started, so a new node is spun up (`N+`)
- `N+` is acked by all the nodes, so the Leader (`L`) is about to set the node to UP
- split brain separates the cluster into 2 partitions of the same size: `P1 = {N1, N2}`, `P2 = {N3, N+}`
What happens depends on which network partition holds the oldest node (`OLD`) and the leader (`L`).

| Case # | OldestIn | LeaderIn |
|--------|----------|----------|
| 1      | P1       | P1       |
| 2      | P1       | P2       |
| 3      | P2       | P1       |
| 4      | P2       | P2       |

1. `OLD` in P1, `L` in P1:
   - P1 eventually sees `N+` as UP because `L` is on P1. It sees that there's a tie but decides to survive because `OLD` is on P1 too.
   - P2 will not see `N+` as UP, so it will decide to shut down, being the minority.
2. `OLD` in P1, `L` in P2:
   - P1 will not see `N+` as UP because `L` is on P2, so it will keep living because it is the majority.
   - P2 will see `N+` as UP, since `L` is here, and think we have a tie. It will decide to shut down because it knows `OLD` is on P1.
3. `OLD` in P2, `L` in P1:
   - P1 eventually sees `N+` as UP because `L` is on P1. It sees that there's a tie but decides to shut down because `OLD` is on P2.
   - P2 will not see `N+` as UP, so it will decide to shut down, being the minority.
4. `OLD` in P2, `L` in P2:
   - P1 will not see `N+` as UP because `L` is on P2, so it will keep living because it is the majority.
   - P2 will eventually see `N+` as UP, since `L` is here, and think we have a tie. It will decide to survive because it knows that `OLD` is on P2 too.
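
The four cases can be replayed with a small model. This is a minimal Python sketch of the reasoning in the description, not the resolver's actual code: `decide`, `outcomes`, and the set names are hypothetical, and the tie-break in favour of the side holding `OLD` is the assumption made in the description.

```python
from itertools import product

P1 = frozenset({"N1", "N2"})
P2 = frozenset({"N3", "N+"})
ESTABLISHED = frozenset({"N1", "N2", "N3"})  # members UP before the redeploy


def decide(partition, has_leader, has_oldest):
    """Return 'survive' or 'down' for one side of the split."""
    # Assumption from the description: only the side holding the leader
    # ever sees N+ promoted to UP; the other side still counts 3 members.
    known_up = (ESTABLISHED | {"N+"}) if has_leader else ESTABLISHED
    mine = len(partition & known_up)
    theirs = len(known_up) - mine
    if mine != theirs:
        return "survive" if mine > theirs else "down"
    # Tie: assumed here to be won by the side holding the oldest node.
    return "survive" if has_oldest else "down"


def outcomes():
    """Enumerate the cases in the same order as the table (OldestIn, LeaderIn)."""
    rows = []
    cases = product(("P1", "P2"), repeat=2)
    for case, (oldest_in, leader_in) in enumerate(cases, start=1):
        p1 = decide(P1, leader_in == "P1", oldest_in == "P1")
        p2 = decide(P2, leader_in == "P2", oldest_in == "P2")
        rows.append((case, oldest_in, leader_in, p1, p2))
    return rows


for row in outcomes():
    print(row)
```

Under these assumptions, cases 1 and 2 leave only P1 running, case 3 shuts down both partitions, and case 4 keeps both partitions alive.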

@ivanopagano I think your description is correct, but just for completeness:
In case of equally sized partitions, the one that will survive is the one containing the node with the lowest address, not the oldest. This is surprising to me, because:
On a side note:
1. Since we want to avoid moving singletons as much as possible, the partition holding the oldest node should be the one to survive, IMO.
2. Moreover, the oldest node is stable information, while the node with the lowest address is not (I'm just rephrasing 1).
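
To make the difference between the two tie-breakers concrete, here is a small Python sketch. Everything in it is hypothetical: the `Member` type, the addresses, the `up_number` field, and comparing addresses as plain strings are illustrative assumptions, not the resolver's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    address: str    # hypothetical host:port, compared as a plain string
    up_number: int  # lower means the member became UP earlier (older)


def tie_winner(p1, p2, key):
    """Return the partition containing the member that minimises `key`."""
    best = min(p1 | p2, key=key)
    return p1 if best in p1 else p2


# Made-up members: N1 is the oldest, but N3 has the lowest address.
n1 = Member("N1", "10.0.0.7:2552", 1)
n2 = Member("N2", "10.0.0.9:2552", 2)
n3 = Member("N3", "10.0.0.3:2552", 3)
n_plus = Member("N+", "10.0.0.5:2552", 4)

p1 = frozenset({n1, n2})
p2 = frozenset({n3, n_plus})

by_address = tie_winner(p1, p2, key=lambda m: m.address)
by_age = tie_winner(p1, p2, key=lambda m: m.up_number)

print(sorted(m.name for m in by_address))
print(sorted(m.name for m in by_age))
```

With these made-up values the lowest-address rule keeps P2 while an oldest-node rule would keep P1, so a singleton running on `N1` would have to move under the first rule but not under the second.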