In this project, a Primary-Backup protocol was implemented to provide a fault-tolerant KeyValue store. The KeyValue store inherits its behavior from Project 1 and provides at-most-once semantics. The Primary and Backup servers communicate using a Chain Replication protocol, and the Primary server is responsible for managing the state transfer to the Backup server. In addition, a View Server monitors the state of the servers and suggests a new view of the Primary and Backup when servers fail or the network partitions. The implemented system guarantees the linearizability of the commands and the consistency of the data stored in the Primary and Backup servers.
- When the View Server starts, it creates an initial view with the view number `STARTUP_VIEWNUM = 0` and a null Primary and Backup, and sets a `PingCheckTimer` to check the availability of the Primary and Backup servers.
- When the View Server receives a ping from service A, it performs the following operations:
  - if the current Primary is not set, it sets service A as the Primary
  - if the Primary is set but there is no Backup, it sets service A as the Backup
  - if the ping is from the Primary and contains a view number that has not been committed yet, the View Server marks that view as committed and sets a new view with the view number equal to the number of the view received
  - if the ping is from neither the Primary nor the Backup, the View Server marks this service as alive and ready to become a new Backup
  - sets a flag signifying that service A is alive
  - returns the current view to service A if the view for service A has changed since the last ping
- When the `PingCheckTimer` fires, the View Server performs the following operations:
  - if the Backup becomes unavailable, the View Server replaces it with another service that is alive and sending pings. If there are no available services, the View Server sets the Backup to null.
  - if the Primary becomes unavailable, the View Server promotes the Backup (if available) to Primary and then fills the Backup position with another service that is alive and sending pings. If there are no available services, the View Server sets the Backup to null.
  - if the Primary becomes unavailable and there is no Backup, the View Server does nothing.
  - when the View Server adds a new view, it makes sure it is no more than one view ahead of the view committed by the Primary.
  - resets the `PingCheckTimer`.
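The ping and timer handling above can be sketched roughly as follows. This is a simplified illustration, not the project's actual code: the class, field, and method names (apart from `STARTUP_VIEWNUM`) are hypothetical, the view list is collapsed into a single view number, and timers and message delivery are left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified View Server; names other than STARTUP_VIEWNUM are illustrative.
final class ViewServerSketch {
    static final int STARTUP_VIEWNUM = 0;

    int viewNum = STARTUP_VIEWNUM;
    int lastCommittedViewNum = STARTUP_VIEWNUM;
    String primary = null;
    String backup = null;
    final Map<String, Boolean> alive = new HashMap<>();

    // A new view may be created only after the Primary acknowledged the current one.
    boolean canChangeView() {
        return viewNum == lastCommittedViewNum;
    }

    void onPing(String sender, int pingViewNum) {
        alive.put(sender, true);
        if (sender.equals(primary) && pingViewNum == viewNum) {
            lastCommittedViewNum = viewNum; // the Primary committed the current view
        }
        if (primary == null) {
            primary = sender;
            viewNum++;
        } else if (backup == null && !sender.equals(primary) && canChangeView()) {
            backup = sender;
            viewNum++;
        }
    }

    void onPingCheckTimer() {
        boolean primaryAlive = primary != null && alive.getOrDefault(primary, false);
        boolean backupAlive = backup != null && alive.getOrDefault(backup, false);
        if (canChangeView()) {
            if (primary != null && !primaryAlive) {
                if (backupAlive) {           // promote the Backup, then look for a replacement
                    primary = backup;
                    backup = pickIdleServer();
                    viewNum++;
                }                            // no live Backup: do nothing
            } else if (backup != null && !backupAlive) {
                backup = pickIdleServer();   // may be null if nobody else is alive
                viewNum++;
            }
        }
        alive.replaceAll((server, flag) -> false); // every server must ping again before the next check
    }

    // Returns any live server that is neither Primary nor Backup, or null.
    String pickIdleServer() {
        for (Map.Entry<String, Boolean> e : alive.entrySet()) {
            String s = e.getKey();
            if (e.getValue() && !s.equals(primary) && !s.equals(backup)) {
                return s;
            }
        }
        return null;
    }
}
```

In this sketch a server that pings while the current view is still uncommitted is only marked alive; it can become the Backup on a later ping, once the Primary has acknowledged the view.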
- When the Client starts, it sends a request to the View Server to get the current view of the Primary and Backup servers. In addition, it sets a `GetViewTimer` to retry the request in case of a timeout. The `GetViewTimer` is reset until a view with the target view number is received. In the beginning, the target view number is `INITIAL_VIEWNUM = 1`. Once a view is received, the Client updates its current view only if the received view is newer than the current one.
- Once a user calls a command from the Client, the Client packs a request containing the command itself, a sequence number, and the current view number, and sends it to the Primary. In addition, the Client sets a `ClientTimer` to retry the request in case of a timeout. The `ClientTimer` keeps the request to repeat as an attribute.
  - Once the `ClientTimer` fires, the Client resends the request and resets the timer if no response has been received yet and the sequence number of the request is still the latest one.
  - Once the Client receives a response, it checks whether the response's sequence number matches the one the Client anticipates. If so, the Client accepts the response and proceeds to the next request. If not, the Client discards the response.
- The Client also supports a force request view operation. Specifically, if the Client receives a `ForceViewRequest` message containing a target view number that is greater than the current view number, the Client requests the view from the View Server and sets a `GetViewTimer`.
  - The `GetViewTimer` keeps requesting the target view and is reset until a view with the target view number is received.
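The request, retry, and reply steps of the Client can be illustrated with the minimal sketch below. All names here are hypothetical, and the sketch assumes that the timer carries the sequence number of the request it was set for; message sending and view updates are omitted.

```java
// Hypothetical sketch of the Client's retry logic; names are illustrative.
final class ClientSketch {
    static final class Request {
        final String command;
        final int seqNum;
        final int viewNum;

        Request(String command, int seqNum, int viewNum) {
            this.command = command;
            this.seqNum = seqNum;
            this.viewNum = viewNum;
        }
    }

    int seqNum = 0;
    int currentViewNum = 1;
    Request pending = null; // the outstanding request, or null
    String result = null;

    // Packs a new request; the caller would send it to the Primary and set a ClientTimer.
    Request sendCommand(String command) {
        seqNum++;
        result = null;
        pending = new Request(command, seqNum, currentViewNum);
        return pending;
    }

    // Accepts a reply only if it answers the outstanding request.
    boolean onReply(int replySeqNum, String replyResult) {
        if (pending == null || replySeqNum != seqNum) {
            return false; // stale or unexpected reply: discard
        }
        result = replyResult;
        pending = null;
        return true;
    }

    // Returns the request to resend, or null if the timer is stale.
    Request onClientTimer(int timerSeqNum) {
        if (pending == null || timerSeqNum != seqNum) {
            return null;
        }
        // Re-pack with the (possibly updated) current view number.
        pending = new Request(pending.command, seqNum, currentViewNum);
        return pending;
    }
}
```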
- When the Server starts, it sends a ping to the View Server to announce its availability and sets a `PingTimer` to keep sending pings to the View Server.
- The Server also supports a force request view operation. Specifically, if it receives a `ForceViewRequest` message containing a target view number that is greater than its current view number, the Server sends a ping to the View Server to receive a new view.
- The Primary sends pings to the View Server to announce its availability. When the Primary receives a view in return, it checks whether the view has changed. If so, and there is a new Backup, the Primary sends a state transfer request to the Backup. In addition, it sets a `PushStateTimer` to make sure the state transfer is completed. The `PushStateTimer` is reset until the state transfer is completed.
- When the Primary receives a request (which contains a view number) from the Client, it
  - checks whether the view number of the request is the same as the current view number.
    - If the Client's view number is behind the Primary's view number, the Primary sends a `ForceViewRequest` message to the Client.
    - If the Primary's view number is behind the Client's view number, the Primary sends a ping to the View Server to get the latest view.
  - if the Primary is in the state transfer, the request is rejected.
  - if the Primary is still backing up the previous request, the new one is rejected as well.
  - if the request has the sequence number of a command that has already been applied, the Primary returns the result of that command, exploiting the application's idempotency.
  - if there is no Backup in the current view, the command from the request is applied and the result is sent to the Client.
  - if there is a Backup, the Primary packs a `BackupRequest` containing the command and sends it to the Backup. In addition, it sets a `BackupTimer` to guarantee the receipt of the request by the Backup. It adds the current request to its state (`requestToSync` property) so that it can resend the request in case of a timeout and verify the authenticity of the response it expects from the Backup.
- If the `BackupTimer` fires, the Primary
  - checks whether there is a request to resend.
  - if there is a request to resend, checks whether it has changed since the last time the timer fired.
  - resends the request to the Backup if the Primary is not in the state transfer.
  - resets the timer if there is still a Backup to synchronize with; otherwise, the Primary processes the request and sends the result to the Client.
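The order of checks the Primary applies to a Client request can be summarized as a decision function. The sketch below is only an illustration under simplifying assumptions: it tracks a single Client, reduces the server state to a few hypothetical fields, and returns an action label instead of sending messages.

```java
// Hypothetical sketch of the Primary's request handling; names are illustrative.
final class PrimarySketch {
    enum Action {
        FORCE_CLIENT_VIEW, PING_VIEW_SERVER, REJECT,
        REPLY_CACHED, EXECUTE_AND_REPLY, FORWARD_TO_BACKUP
    }

    int viewNum;
    boolean inStateTransfer;
    boolean hasBackup;
    boolean backingUp;          // true while a BackupRequest is outstanding
    int lastAppliedSeqNum = 0;  // single-client simplification

    Action onRequest(int clientViewNum, int seqNum) {
        if (clientViewNum < viewNum) {
            return Action.FORCE_CLIENT_VIEW;   // the Client is behind
        }
        if (clientViewNum > viewNum) {
            return Action.PING_VIEW_SERVER;    // the Primary is behind
        }
        if (inStateTransfer || backingUp) {
            return Action.REJECT;              // the Client will retry on its timer
        }
        if (seqNum <= lastAppliedSeqNum) {
            return Action.REPLY_CACHED;        // already applied: return the stored result
        }
        if (!hasBackup) {
            lastAppliedSeqNum = seqNum;
            return Action.EXECUTE_AND_REPLY;
        }
        backingUp = true;                      // remember the request until the Backup answers
        return Action.FORWARD_TO_BACKUP;
    }
}
```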
- If the Backup receives a new view in which it realizes it is the new Backup, it updates its state with the address of the Primary (`expectsStateFrom` property) as the service to expect a state from.
- If the Backup receives a backup request from the Primary
  - it checks whether it expects the state from the Primary. If so, the backup request is rejected.
  - it checks whether the request's sender is the Primary. If not, the request is rejected.
  - it checks whether the view number the request contains is the same as the current view number.
    - If the Primary's view number is behind the Backup's view number, the Backup sends a `ForceViewRequest` message to the Primary.
    - If the Backup's view number is behind the Primary's view number, the Backup sends a ping to the View Server to get the latest view.
  - applies the command sent within the request and sends the result to the Primary.
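The Backup's checks can be sketched in the same style. The class and the action labels below are hypothetical; only the `expectsStateFrom` name comes from the description above, and the actual application of the command is left out.

```java
// Hypothetical sketch of the Backup's handling of a BackupRequest; names are illustrative.
final class BackupSketch {
    enum Action { REJECT, FORCE_PRIMARY_VIEW, PING_VIEW_SERVER, APPLY_AND_REPLY }

    int viewNum;
    String primary;
    String expectsStateFrom; // non-null while a state transfer is pending

    Action onBackupRequest(String sender, int senderViewNum) {
        if (expectsStateFrom != null) {
            return Action.REJECT;               // waiting for the state first
        }
        if (!sender.equals(primary)) {
            return Action.REJECT;               // only the Primary may send backup requests
        }
        if (senderViewNum < viewNum) {
            return Action.FORCE_PRIMARY_VIEW;   // the Primary is behind
        }
        if (senderViewNum > viewNum) {
            return Action.PING_VIEW_SERVER;     // the Backup is behind
        }
        return Action.APPLY_AND_REPLY;
    }
}
```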
The KeyValue store implementation is inherited from Project 1 with one exception: it keeps a so-called
state identifier (`int stateCounter = 0` property) that is incremented with each unsafe (`Put` or `Append`) command
applied to the store. It is used to guarantee that the state of the server is not overridden by an outdated version
of the KeyValue store while transferring the state.
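A minimal sketch of such a counter, assuming a plain string-to-string store, could look like this. Only `stateCounter` is taken from the description above; the class name, method signatures, and return values are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a KeyValue store with a state identifier.
final class KVStoreSketch {
    private final Map<String, String> data = new HashMap<>();
    int stateCounter = 0;

    // Read-only command: the state identifier does not change.
    String get(String key) {
        return data.get(key);
    }

    // Mutating commands increment the state identifier.
    void put(String key, String value) {
        data.put(key, value);
        stateCounter++;
    }

    String append(String key, String value) {
        String updated = data.getOrDefault(key, "") + value;
        data.put(key, updated);
        stateCounter++;
        return updated;
    }
}
```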
The View Server keeps a linked list of the views that were suggested to the Primary and Backup servers. It adds a view
to the tail of the list if necessary. It cannot add more than two views ahead of the last committed view.
For example, if the tail of the view list is View(3) --> View(4) --> View(5), one possible interpretation is that
- View(3) is the last committed view
- View(4) is the view suggested and yet to be committed
- View(5) is the view that will be suggested once the Primary acknowledges View(4)

The View Server cannot commit a view if the difference between the acknowledged view number and the last committed view number is not 1.
The View Server keeps a map of the services' availability: a `HashMap` with a service name as a key and a boolean
that denotes whether the server is available. The value is set to true when the View Server receives a ping from the service
and is reset to false when the `PingCheckTimer` fires.
The View Server keeps the last committed view number (`int lastCommittedViewNum = STARTUP_VIEWNUM` property) to track the number
of the last committed view. It is used to make sure that the View Server does not get more than one view ahead of the servers.
In addition, the View Server keeps a map of the services' last committed view numbers. It is used to discard stale pings
from the servers, that is, pings that contain a view number lower than the last committed view.
If the Primary or Backup position is not set, the View Server assigns the ping sender to whichever position is free. One design decision I find particularly useful is that the View Server does not send a `ViewReply` to a service if the view number in the ping and the number of the view it is about to send back are the same. This reduces the noise in the communication channels and lets the recipient Server do something else.
If the Backup becomes unavailable, the View Server chooses another Backup based on the following conditions:
- the new Backup must be available (that is, marked as alive in the availability map)
- the last view acknowledged by the new Backup must be the same as the last committed view; otherwise, the new Backup needs to wait until it catches up with the View Server
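The two conditions can be expressed as a small selection function. This is a sketch under assumptions: the function and parameter names are hypothetical, the two maps stand in for the availability map and the per-service view numbers described above, and the current Primary is excluded explicitly.

```java
import java.util.Map;

// Hypothetical sketch of choosing a replacement Backup; names are illustrative.
final class BackupChooserSketch {
    static String chooseBackup(Map<String, Boolean> alive,
                               Map<String, Integer> lastAckedViewNum,
                               int lastCommittedViewNum,
                               String primary) {
        for (Map.Entry<String, Boolean> e : alive.entrySet()) {
            String server = e.getKey();
            boolean isAlive = e.getValue();
            // A server that never acknowledged a view gets -1 and is never eligible.
            int acked = lastAckedViewNum.getOrDefault(server, -1);
            if (isAlive && !server.equals(primary) && acked == lastCommittedViewNum) {
                return server;
            }
        }
        return null; // nobody is eligible yet: the Backup stays empty
    }
}
```

When several servers are eligible, this sketch returns whichever one the map iterates first; the description above does not say how ties are broken.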
The Client keeps a current view and a target view. The current view is the view with the greatest number received from the View Server. The target view is the view the Client is trying to reach. The target view can be established as follows:
- on startup, the target view is `INITIAL_VIEWNUM = 1`
- when the Client receives a `ForceViewRequest`, which can be sent by the Primary if it detects that the Client's view has fallen behind

Every time the Client's target view number is greater than that of the current view, the Client requests the view from the View Server and sets a `GetViewTimer` to guarantee the receipt of the view. This is done to make sure that the Client's view is always up to date with the Primary and Backup servers. In addition, the Client sends a `GetView` request each time it does not receive a response for more than three `ClientTimer` intervals.
The rest of the Client's logic is inherited from Project 1.
A request from a Client is handled by the Primary if:
- the Primary is not in the state transfer
- the Primary is not backing up the previous request

If the Primary deduces that its view differs from the Client's, it:
- sends a `ForceViewRequest` message to the Client if the Client's view is behind the Primary's view
- sends a ping to the View Server to get the latest view if the Primary's view is behind the Client's view
The Primary accepts the request but does not back it up if there is no Backup or if the request
has already been processed. Otherwise, the Primary packs a `BackupRequest` and sends it to the Backup, setting the
`BackupTimer` to guarantee the receipt of the request by the Backup. When the Primary receives a response from the Backup, it verifies the legitimacy of the response in order to discard outdated or stray responses.
When the `BackupTimer` fires, the Primary resends the request to the Backup if:
- the Primary is not in the state transfer
- the request to resend is the same as the last request to resend

However, if the Backup is no longer available, the Primary processes the request and sends the result to the Client. If the Backup has been changed, the Primary will try to back up the request to the new Backup to guarantee consistency.
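The timer handling can be sketched as a decision function as well. The sketch assumes that the timer carries the sequence number of the request it was set for and that the outstanding request is identified by that number; both are illustrative choices, as are all the names.

```java
// Hypothetical sketch of the Primary's BackupTimer handler; names are illustrative.
final class BackupTimerSketch {
    enum Action { IGNORE, WAIT, RESEND_TO_BACKUP, EXECUTE_AND_REPLY }

    boolean inStateTransfer;
    boolean hasBackup;
    Integer requestToSyncSeqNum; // null when nothing is being backed up

    Action onBackupTimer(int timerSeqNum) {
        if (requestToSyncSeqNum == null || requestToSyncSeqNum != timerSeqNum) {
            return Action.IGNORE;            // stale timer: the request was answered or replaced
        }
        if (!hasBackup) {
            requestToSyncSeqNum = null;      // nobody to synchronize with any more
            return Action.EXECUTE_AND_REPLY;
        }
        if (inStateTransfer) {
            return Action.WAIT;              // keep the timer running, but do not resend yet
        }
        return Action.RESEND_TO_BACKUP;      // resend and reset the timer
    }
}
```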
If the Primary receives a new view that contains a new Backup, it sends a state transfer request to the Backup to guarantee
the consistency of the data. Since there may be a backup request still in flight, the Primary stops waiting for the
acknowledgment of that request from the Backup and sends the response to the Client. This is done to
prevent the Primary and Backup states from diverging. Additionally, it does not make sense to wait because the Primary
is about to send the result of the request execution as a part of the state transfer.
To make sure the Backup receives the state, the Primary sets a `PushStateTimer` to repeatedly check whether the acknowledgment
of the state (`StateAck` message) is received. The `PushStateTimer` is reset until the state transfer is completed.
If the Primary is in state transfer mode but the view changes so that there is a new Backup, the Primary sends the state
to the new Backup within the same `PushStateTimer` handler.
Once the Backup receives a new view in which it is the new Backup, it sets the `expectsStateFrom` property to the address of the Primary and
expects the state transfer. It will accept the state from the Primary if:
- `expectsStateFrom` is not null
- the Primary address is equal to the `expectsStateFrom` property
- the view number of the Primary passed along with the state is greater than or equal to the current view number of the Backup
- the state identifier of the Primary is greater than the current state identifier of the Backup
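Taken together, the four conditions form a single predicate. The sketch below is illustrative only: the class, method, and parameter names are hypothetical, with `expectsStateFrom` and the state identifier taken from the description above.

```java
// Hypothetical sketch of the Backup's state-acceptance check; names are illustrative.
final class StateTransferSketch {
    static boolean shouldAcceptState(String expectsStateFrom,
                                     String sender,
                                     int senderViewNum,
                                     int backupViewNum,
                                     int senderStateCounter,
                                     int backupStateCounter) {
        return expectsStateFrom != null                  // a transfer is expected at all
                && expectsStateFrom.equals(sender)       // and it comes from the expected Primary
                && senderViewNum >= backupViewNum        // the Primary's view is not older
                && senderStateCounter > backupStateCounter; // the incoming state is newer
    }
}
```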
To distinguish between the regular and backup requests, the `BackupRequest` message is introduced. It mirrors
the structure of the `Request` message.
The backup request is handled if:
- the Backup is not in the state transfer
- the view number of the Primary passed along with the request is the same as the current view number of the Backup
  - if the Primary view is behind the Backup view, the Backup sends a `ForceViewRequest` message to the Primary
  - if the Backup view is behind the Primary view, the Backup sends a ping to the View Server to get the latest view
- The case in which the Primary becomes unavailable during a state transfer is not supported: the Backup is left without the relevant state but still needs to be assigned as the new Primary.
I worked on this project in advance using the following resources:
- DSLabs. Lab 2: Primary-Backup Service
- Discussion Section Slides from DSLabs
- Chain Replication for Supporting High Throughput and Availability
- Replication in the Harp File System
- Some paragraphs were taken from the report of Project 1 because the corresponding pieces of logic were not changed.
- GitHub Copilot as an assistant in writing the report.