One constraint: we cannot synchronize time across nodes, so we can only make a best effort.
##Solution:

###Changes in raft
- Add a sync command in raft

```go
type SyncCommand struct {
	Time time.Time `json:"time"`
}
```

- Register a sync function in raft

```go
func sync(s Server, now time.Time)
```

Each raft server registers either a sync function or a nil sync function. The raft leader sends a sync command periodically, which triggers the sync function.
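
The flow above can be sketched as follows. This is a minimal illustration, not the raft library's actual API: the toy `Server` type, `RegisterSync`, `Apply`, `recordSync`, and the 10 ms tick interval are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// SyncCommand mirrors the command described above.
type SyncCommand struct {
	Time time.Time `json:"time"`
}

// Server is a hypothetical stand-in for a raft server.
type Server struct {
	syncFn func(s *Server, now time.Time) // may be nil
	Index  uint64                         // index of the last applied entry
	Synced []time.Time                    // times seen by recordSync
}

// RegisterSync installs the sync function; nil is allowed.
func (s *Server) RegisterSync(fn func(s *Server, now time.Time)) {
	s.syncFn = fn
}

// Apply applies a committed SyncCommand at the next log index and
// triggers the registered sync function, if there is one.
func (s *Server) Apply(c SyncCommand) {
	s.Index++
	if s.syncFn != nil {
		s.syncFn(s, c.Time)
	}
}

// recordSync is an example sync function that only records the time.
func recordSync(s *Server, now time.Time) {
	s.Synced = append(s.Synced, now)
}

func main() {
	servers := []*Server{&Server{}, &Server{}}
	servers[0].RegisterSync(recordSync)
	servers[1].RegisterSync(nil)

	// The leader issues one SyncCommand per tick (interval is an assumption).
	// Replication is simulated by applying the command to every server.
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for i := 0; i < 3; i++ {
		now := <-ticker.C
		for _, s := range servers {
			s.Apply(SyncCommand{Time: now})
		}
	}
	fmt.Println(len(servers[0].Synced), len(servers[1].Synced), servers[0].Index)
}
```

Every server advances its index for each sync command, whether or not it registered a sync function, so the command occupies the same log position everywhere.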
###Changes in etcd
- Add a sync function

```go
func deleteExpiredKey(s Server, cutoff time.Time)
```

This function is triggered on receiving a sync command. It removes all keys that have expired as of the cutoff time and sends out notifications, so the expire events happen at the index of the sync command.

- A watcher can receive multiple events at one time, since several watched keys may expire at one sync command index. We send `nil` to terminate a set of expire events.
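
A rough sketch of how such a sync function and the `nil` terminator might fit together is shown below. The `Store`, `Watcher`, and `Event` types and the `unix` helper are hypothetical, and unlike the signature above this version takes the sync command's index as an explicit argument to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Event is a hypothetical expire notification.
type Event struct {
	Key   string
	Index uint64
}

// Watcher is a hypothetical watcher over a fixed set of keys.
type Watcher struct {
	keys map[string]bool
	C    chan *Event
}

// Store is a toy key store that only tracks expire times.
type Store struct {
	expiry   map[string]time.Time
	watchers []*Watcher
}

func NewStore() *Store { return &Store{expiry: map[string]time.Time{}} }

func (s *Store) Set(key string, expireAt time.Time) { s.expiry[key] = expireAt }

// Watch registers a watcher. The buffer holds one event plus one
// terminator per key, enough if each watched key expires once.
func (s *Store) Watch(keys ...string) *Watcher {
	w := &Watcher{keys: map[string]bool{}, C: make(chan *Event, 2*len(keys))}
	for _, k := range keys {
		w.keys[k] = true
	}
	s.watchers = append(s.watchers, w)
	return w
}

// deleteExpiredKey removes keys whose expire time is at or before cutoff
// and notifies watchers, ending each watcher's batch with nil.
func deleteExpiredKey(s *Store, cutoff time.Time, index uint64) []string {
	var expired []string
	for k, t := range s.expiry {
		if !t.After(cutoff) {
			expired = append(expired, k)
		}
	}
	sort.Strings(expired) // same order on every node
	for _, k := range expired {
		delete(s.expiry, k)
	}
	for _, w := range s.watchers {
		sent := false
		for _, k := range expired {
			if w.keys[k] {
				w.C <- &Event{Key: k, Index: index}
				sent = true
			}
		}
		if sent {
			w.C <- nil // terminates this set of expire events
		}
	}
	return expired
}

// unix builds a time from whole seconds, for readable examples.
func unix(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	s := NewStore()
	s.Set("a", unix(10))
	s.Set("b", unix(12))
	s.Set("c", unix(30))
	w := s.Watch("a", "b", "c")
	deleteExpiredKey(s, unix(15), 7)
	for e := <-w.C; e != nil; e = <-w.C {
		fmt.Println(e.Key, "expired at index", e.Index)
	}
}
```

The watcher reads events until it sees `nil`, so it knows when it has the complete set for that sync command index.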
##Guarantee
- All expirations happen at the same index on every node. The events are totally consistent with respect to our logical clock (the raft index).
- Watching from an index is consistent and does not miss any events.
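
To illustrate why both guarantees follow, here is a small sketch under simplifying assumptions: a hypothetical `Node` replays a log of set and sync entries, using only the time carried in each entry and never its local clock. The `Entry`, `Node`, `WatchFrom`, and `sampleLog` names are invented for this example.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Entry is a hypothetical log entry: a set with an expire time,
// or a sync command when Key is empty.
type Entry struct {
	Key  string
	Time time.Time // expire time for a set, cutoff for a sync
}

// Event records which key expired at which log index.
type Event struct {
	Index uint64
	Key   string
}

type Node struct {
	expiry  map[string]time.Time
	history []Event
	index   uint64
}

func NewNode() *Node { return &Node{expiry: map[string]time.Time{}} }

// Apply consumes one log entry. Expiration depends only on the log.
func (n *Node) Apply(e Entry) {
	n.index++
	if e.Key != "" {
		n.expiry[e.Key] = e.Time
		return
	}
	var expired []string
	for k, t := range n.expiry {
		if !t.After(e.Time) {
			expired = append(expired, k)
		}
	}
	sort.Strings(expired)
	for _, k := range expired {
		delete(n.expiry, k)
		n.history = append(n.history, Event{Index: n.index, Key: k})
	}
}

// WatchFrom returns every expire event at or after the given index.
func (n *Node) WatchFrom(index uint64) []Event {
	var out []Event
	for _, ev := range n.history {
		if ev.Index >= index {
			out = append(out, ev)
		}
	}
	return out
}

func sampleLog() []Entry {
	at := func(sec int64) time.Time { return time.Unix(sec, 0) }
	return []Entry{
		{Key: "a", Time: at(10)}, // index 1
		{Key: "b", Time: at(20)}, // index 2
		{Time: at(15)},           // index 3: sync, expires "a"
		{Time: at(25)},           // index 4: sync, expires "b"
	}
}

func main() {
	n1, n2 := NewNode(), NewNode()
	for _, e := range sampleLog() {
		n1.Apply(e)
		n2.Apply(e)
	}
	fmt.Println(n1.WatchFrom(1))
	fmt.Println(n2.WatchFrom(1))
	fmt.Println(n1.WatchFrom(4))
}
```

Because both nodes replay the same entries, their event histories are identical, and a watch starting from any index sees every later expiration.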
##Side effect:
- Expired keys are deleted periodically rather than at their exact expire time. This should not be a problem as long as we send sync commands at millisecond granularity, since TTLs are at second granularity.
- If a follower is disconnected from the leader, its expired keys may not be deleted. To alleviate this we can set a `sync time`. If the follower has not received anything from the leader for a `sync time`, it should clear all current watchers and stop accepting commands (including get and watch).
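
The follower-side check could look roughly like the sketch below. The `Follower` type, `ErrStale`, the 500 ms `defaultSyncTime`, and the `atMillis` helper are assumptions for illustration. For simplicity the check runs only when a command arrives, whereas a real implementation would probably also clear watchers from a timer.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// defaultSyncTime is an assumed value for the sync time.
const defaultSyncTime = 500 * time.Millisecond

// ErrStale is returned while the follower is out of touch with the leader.
var ErrStale = errors.New("no contact from leader within sync time")

// Follower is a hypothetical follower that gates client commands.
type Follower struct {
	syncTime    time.Duration
	lastContact time.Time
	watchers    []chan struct{}
}

func NewFollower(syncTime time.Duration, now time.Time) *Follower {
	return &Follower{syncTime: syncTime, lastContact: now}
}

// HeardFromLeader records any message received from the leader.
func (f *Follower) HeardFromLeader(now time.Time) { f.lastContact = now }

// Watchers reports how many watchers are currently registered.
func (f *Follower) Watchers() int { return len(f.watchers) }

// Handle gates every client command, including get and watch.
func (f *Follower) Handle(cmd string, now time.Time) error {
	if now.Sub(f.lastContact) > f.syncTime {
		// Out of touch: drop all watchers and refuse the command.
		for _, w := range f.watchers {
			close(w)
		}
		f.watchers = nil
		return ErrStale
	}
	if cmd == "watch" {
		f.watchers = append(f.watchers, make(chan struct{}))
	}
	return nil
}

// atMillis builds a time from milliseconds, for readable examples.
func atMillis(ms int64) time.Time {
	return time.Unix(0, ms*int64(time.Millisecond))
}

func main() {
	f := NewFollower(defaultSyncTime, atMillis(0))
	fmt.Println(f.Handle("watch", atMillis(100))) // within sync time
	fmt.Println(f.Handle("get", atMillis(600)))   // leader silent too long
	f.HeardFromLeader(atMillis(700))
	fmt.Println(f.Handle("get", atMillis(800))) // accepted again
}
```

Refusing reads while out of touch trades availability for consistency: a disconnected follower cannot serve keys that the rest of the cluster has already expired.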