@shino
Created July 22, 2015 15:11
Operation steps when claimant node is down in Riak cluster

Create a healthy 3-node cluster. Node dev1 is the claimant at this point.

% dev/dev1/bin/riak-admin member-status
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      50.0%      --      '[email protected]'
valid      25.0%      --      '[email protected]'
valid      25.0%      --      '[email protected]'
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

% dev/dev1/bin/riak-admin ring-status
================================== Claimant ===================================
Claimant:  '[email protected]'
Status:     up
Ring Ready: true

============================== Ownership Handoff ==============================
No pending changes.

============================== Unreachable Nodes ==============================
All nodes are up and reachable

Then kill node dev1 forcefully. member-status on dev2 still lists it as valid.

% DEV1=`ps auw | grep beam | grep riak | grep dev1 | awk '{print $2;}'`; echo $DEV1
64693
% kill -9 $DEV1
% dev/dev2/bin/riak-admin member-status
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      50.0%      --      '[email protected]'
valid      25.0%      --      '[email protected]'
valid      25.0%      --      '[email protected]'
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Check the ring status: the claimant is reported as down.

% dev/dev2/bin/riak-admin ring-status
================================== Claimant ===================================
Claimant:  '[email protected]'
Status:     down
Ring Ready: unknown

============================== Ownership Handoff ==============================
No pending changes.

============================== Unreachable Nodes ==============================
The following nodes are unreachable: ['[email protected]']

WARNING: The cluster state will not converge until all nodes
are up. Once the above nodes come back online, convergence
will continue. If the outages are long-term or permanent, you
can either mark the nodes as down (riak-admin down NODE) or
forcibly remove the nodes from the cluster (riak-admin
force-remove NODE) to allow the remaining nodes to settle.
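
The claimant state shown above can also be read from a script. This is a minimal sketch, assuming the ring-status layout matches the transcript; `claimant_status` is a hypothetical helper name, not part of riak-admin.

```shell
# Hypothetical helper (not part of riak-admin): read `riak-admin ring-status`
# output on stdin and print the value of the first "Status:" line.
claimant_status() {
  awk '$1 == "Status:" { print $2; exit }'
}

# Demo on part of the Claimant section from the transcript above
# (the node name line is left out).
claimant_status <<'EOF'
================================== Claimant ===================================
Status:     down
Ring Ready: unknown
EOF
```

Against a live node this would be run as `dev/dev2/bin/riak-admin ring-status | claimant_status`.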

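As expected, cluster operations that need the claimant fail. The join is staged successfully, but cluster plan fails because it calls the claimant on the dead node.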

% dev/dev4/bin/riak-admin cluster join [email protected]
Success: staged join request for '[email protected]' to '[email protected]'

% dev/dev2/bin/riak-admin cluster plan
RPC to '[email protected]' failed: {'EXIT',
                                 {{nodedown,'[email protected]'},
                                  {gen_server,call,
                                   [{riak_core_claimant,'[email protected]'},
                                    plan,infinity]}}}
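
The failure above could be avoided by checking the claimant before planning. The following is only a sketch under the assumption that ring-status prints a `Status:` line as in the transcript; `plan_if_claimant_up` is a hypothetical wrapper, and the path passed to it is whatever riak-admin script you normally use.

```shell
# Hypothetical wrapper (not part of riak-admin): run `cluster plan` only
# when ring-status reports the claimant as up.
# $1 is the riak-admin command to use, e.g. dev/dev2/bin/riak-admin.
plan_if_claimant_up() {
  admin=$1
  # Take the value of the first "Status:" line from ring-status.
  status=$("$admin" ring-status | awk '$1 == "Status:" { print $2; exit }')
  if [ "$status" = "up" ]; then
    "$admin" cluster plan
  else
    echo "claimant status is '${status:-unknown}'; not running cluster plan" >&2
    return 1
  fi
}

# Usage (path taken from the transcript):
#   plan_if_claimant_up dev/dev2/bin/riak-admin
```

When the claimant is down the wrapper returns a non-zero status, so a calling script can decide whether to mark the node down first.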

Mark node dev1 as down from node dev2. The claimant then changes.

% ./dev/dev2/bin/riak-admin down [email protected]
Success: "[email protected]" marked as down
% dev/dev2/bin/riak-admin ring-status
================================== Claimant ===================================
Claimant:  '[email protected]'
Status:     up
Ring Ready: true

============================== Ownership Handoff ==============================
No pending changes.

============================== Unreachable Nodes ==============================
All nodes are up and reachable
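
To confirm from a script that the claimant has moved, the node name can be pulled out of the ring-status output. This is a sketch only: `claimant_node` is a hypothetical helper, and the node name in the demo (`riak@host2`) is a made-up placeholder, not a name from this cluster.

```shell
# Hypothetical helper (not part of riak-admin): read `riak-admin ring-status`
# output on stdin and print the claimant node name without its quotes.
claimant_node() {
  awk '$1 == "Claimant:" { print $2; exit }' | tr -d "'"
}

# Demo with a placeholder node name (riak@host2 is invented for illustration).
claimant_node <<'EOF'
================================== Claimant ===================================
Claimant:  'riak@host2'
Status:     up
EOF
```

Comparing the value before and after `riak-admin down` shows whether the claimant role moved to another node.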

Retry cluster plan; it succeeds this time.

% dev/dev2/bin/riak-admin cluster plan
=============================== Staged Changes ================================
Action         Details(s)
-------------------------------------------------------------------------------
join           '[email protected]'
-------------------------------------------------------------------------------


NOTE: Applying these changes will result in 1 cluster transition

###############################################################################
                         After cluster transition 1/1
###############################################################################

================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
down       50.0%      --      '[email protected]'
valid      25.0%      --      '[email protected]'
valid      25.0%      --      '[email protected]'
valid       0.0%      --      '[email protected]'
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:1

WARNING: Not all replicas will be on distinct nodes

The commit now goes through. Hooray!

% dev/dev2/bin/riak-admin cluster commit
Cluster changes committed
% dev/dev2/bin/riak-admin member-status
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
down       50.0%      --      '[email protected]'
valid      25.0%      --      '[email protected]'
valid      25.0%      --      '[email protected]'
valid       0.0%      --      '[email protected]'
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:1
% dev/dev2/bin/riak-admin ring-status
================================== Claimant ===================================
Claimant:  '[email protected]'
Status:     up
Ring Ready: true

============================== Ownership Handoff ==============================
No pending changes.

============================== Unreachable Nodes ==============================
All nodes are up and reachable
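
The member-status summary line still reports Down:1 after the commit. A script that wants to track that count could extract it as sketched here; `down_count` is a hypothetical helper, and the sketch assumes the summary line keeps the format shown in the transcript.

```shell
# Hypothetical helper (not part of riak-admin): read `riak-admin member-status`
# output on stdin and print the number after "Down:" in the summary line.
down_count() {
  sed -n 's/^Valid:.*Down:\([0-9][0-9]*\).*$/\1/p'
}

# Demo on the summary line from the transcript above.
echo 'Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:1' | down_count
```

Against a live node this would be run as `dev/dev2/bin/riak-admin member-status | down_count`.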