Created August 16, 2013 06:50
Suggestion for improving reliability and increasing performance for GET/READ requests on Cassandra
Suggestion for improving reliability and increasing performance for GET/READ requests on Cassandra
====================================================================================================
Each node maintains a list of UUIDs of all hosts to which it has pending Hinted Handoff (HH) data to deliver, whenever possible. Whenever that list changes (e.g. all data has been delivered to one of those nodes), it issues an ApplicationState update, e.g. APPSTATE_PENDING_HH, so that all other nodes in the cluster become aware of it.
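A minimal sketch of that bookkeeping, assuming hypothetical names (`Node`, `hintStored`, `allHintsDelivered`, `publishPendingHH` are illustrative; only APPSTATE_PENDING_HH comes from the text above):

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical sketch: a node tracks the hosts it still owes hints to, and
// announces an APPSTATE_PENDING_HH update whenever that set changes.
struct Node
{
	std::set<std::string> pendingHHTargets; // UUIDs of hosts with undelivered hints
	uint64_t appStateVersion = 0;           // bumped on every published update

	void publishPendingHH()
	{
		// In Cassandra this would go out via gossip as an ApplicationState
		// entry; here we just bump a version number to stand in for it.
		++appStateVersion;
	}

	void hintStored(const std::string &hostId)
	{
		if (pendingHHTargets.insert(hostId).second)
			publishPendingHH(); // set changed: announce
	}

	void allHintsDelivered(const std::string &hostId)
	{
		if (pendingHHTargets.erase(hostId))
			publishPendingHH(); // set changed: announce
	}
};
```

Note that the update is only published when the set actually changes, not on every stored hint, which keeps gossip chatter proportional to membership changes rather than write volume.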
Each node also maintains a UUID => [UUIDs] map, i.e. for each node, the set of nodes that have pending HH content to deliver to it.
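A sketch of that peer-side view, rebuilt from incoming gossip updates (`PendingHHView`, `onAppStateUpdate`, and `hasPendingHHFor` are hypothetical names for illustration):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical sketch: for each node UUID, the set of nodes that still have
// pending HH content to deliver to it, maintained from gossip updates.
struct PendingHHView
{
	// target UUID => set of source UUIDs that owe it hints
	std::map<std::string, std::set<std::string>> owedBy;

	// Apply a gossip update: `source` now owes hints to exactly `targets`.
	void onAppStateUpdate(const std::string &source, const std::set<std::string> &targets)
	{
		// Drop `source` from every target's set, then re-add per the update.
		for (auto &e : owedBy)
			e.second.erase(source);
		for (const auto &t : targets)
			owedBy[t].insert(source);
	}

	// True if any node in the cluster still owes hints to `host`.
	bool hasPendingHHFor(const std::string &host) const
	{
		const auto it = owedBy.find(host);
		return it != owedBy.end() && !it->second.empty();
	}
};
```

With this view, `hasPendingHHFor(host) == false` is the condition that lets a coordinator drop `host` from the read-repair digest set, as in the loop below.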
Effectively, by knowing that a node A doesn't have any pending HH content to deliver to anyone, you can account for that when building the list of nodes to ask for digests for read repair, e.g.:
```cpp
for (uint32_t i = 0; i != liveNodesForKey.size(); )
{
	if (time() > timeSinceGossipInitialized + 5 * 10 && pendingHH[liveNodesForKey[i].hostId] == 0)
	{
		liveNodesForKey.PopByIndex(i);
		continue;
	}
	else
		++i;
}
```
We just make sure we don't take this into account immediately; we give it a few minutes so that gossip chatter has propagated across all nodes, and then drop nodes from the set we would otherwise ask for CF digests, once we are fairly certain no content is missing from them (because there is no pending HH to deliver to them).
This has worked well for us, and I believe it would benefit Cassandra users who would like to set read repair chance to 1.0 and at the same time be fairly certain this won't cause any problems.