Skip to content

Instantly share code, notes, and snippets.

@angrycub
Last active September 21, 2024 19:19
Show Gist options
  • Select an option

  • Save angrycub/8363931 to your computer and use it in GitHub Desktop.

Select an option

Save angrycub/8363931 to your computer and use it in GitHub Desktop.
Using KV Repair to Change a Cluster's n-Val

Using KV Repair to Change a Cluster's n-Val or "Sometimes Three's a Crowd"

The shipping default for Riak is to build a cluster that stores three replicas of any value stored in it. While this is awesome for fault tolerance, in some cases where storage is tight you might need to loosen up and store fewer replicas.

"Nothing to it, I just go in and change the default n-val, right? Easy peasy."

Well, not so fast! Coverage queries are going to break and you're going to leave crufty, orphaned replicas in your datafiles, and, let's be honest, you're probably doing this to save space. So what to do?

If you've been using Riak for any time at all, you are probably familiar with the Rolling Upgrade procedure. Using this same pattern, we can proceed through the cluster and safely wipe the data from a node and then repopulate it from replica data. Using Riak KV Repair will enable us to only replicate the data onto the nodes that should be primary owners. This will effectively allow us to ensure that there are two and only two replicas within the cluster. More information on riak_kv:repair can be found at "Running a Repair" in the Basho Documentation.

Since you'd want to beta-test this technique before running it in production, we will set up a Riak development release (devrel) using the instructions in the "Five Minute Install"

Create a cluster from a 4 node devrel

From the dev directory of your built Riak devrel, run:

for i in {1..4}; do dev$i/bin/riak start; done
for i in {2..4}; do dev$i/bin/riak-admin cluster join [email protected] ; done
dev1/bin/riak-admin cluster plan
dev1/bin/riak-admin cluster commit

Load example data into the cluster

wget https://raw.github.com/basho/basho_docs/master/source/data/goog.csv 
wget https://raw.github.com/basho/basho_docs/master/source/data/load_data.erl
chmod +x load_data.erl
./load_data goog.csv

Note: You might need to adjust the port that the load_data.erl script expects Riak at.

Once the data has loaded, you can count the records in the database with the following MapReduce job. I wouldn't run this in production, this is for illustrative purposes and could make your cluster a little sad. At the very least in a cluster with large datafiles, this will have you waiting for quite a while.

[user@localhost etc]$ curl -XPOST localhost:8091/mapred -H 'Content-Type: application/json' -d '{"inputs":"goog","query":[{"reduce":
{"language":"erlang","module":"riak_kv_mapreduce","function":"reduce_count_inputs","arg":{"do_prereduce":true}}}]}' -w '\n'
[1438]
  • Change the default n-value for the cluster by editing the app.config

  • Stop the node

    [user@localhost etc]$ ../bin/riak stop
    ok
    
  • Remove the data from the active backend

    [user@localhost etc]$ rm -rf ../data/bitcask/*
    
  • Start the node

    [user@localhost etc]$ ../bin/riak start
    [user@localhost etc]$ ../bin/riak-admin transfers
    '[email protected]' waiting to handoff 48 partitions
    
    Active Transfers:
    
    
    [user@localhost etc]$ ../bin/riak-admin transfers
    No transfers active
    
    Active Transfers:
    
    
    
  • Run riak_kv:repair

    [user@localhost etc]$ ../bin/riak attach
    Attaching to /tmp//home/user/riak/dev/dev3/erlang.pipe.1 (^D to exit)
    ([email protected])1> {ok, Ring} = riak_core_ring_manager:get_my_ring(),
    ([email protected])1> Node = node(),
    ([email protected])1> Partitions = [P || {P, Node} <- riak_core_ring:all_owners(Ring)],
    ([email protected])1> [riak_kv_vnode:repair(P) || P <- Partitions],
    ([email protected])1> ok.
    

Wait for all of the transfers to settle down. Once they have, you can proceed to the next box.

Once you have completed all of the nodes and the cluster has returned to steady, you can rerun your coverage query test and will receive the correct count consnstently.

[user@localhost etc]$ curl -XPOST localhost:8091/mapred -H 'Content-Type: application/json' -d '{"inputs":"goog","query":[{"reduce":
{"language":"erlang","module":"riak_kv_mapreduce","function":"reduce_count_inputs","arg":{"do_prereduce":true}}}]}' -w '\n'
[1438]

To wrap this up...

COMMANDS:

1:

curl -XPOST localhost:8091/mapred -H 'Content-Type: application/json' -d '{"inputs":"goog","query":[{"reduce":{"language":"erlang","module":"riak_kv_mapreduce","function":"reduce_count_inputs","arg":{"do_prereduce":true}}}]}' -w '\n'

2:

{ok, Ring} = riak_core_ring_manager:get_my_ring(),
Node = node(),
Partitions = [P || {P, Node} <- riak_core_ring:all_owners(Ring)],
[riak_kv_vnode:repair(P) || P <- Partitions],
ok.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment