@jburwell
Last active August 29, 2015 14:04
Metrics Performance Test Plan

As we review replacing Folsom with Exometer for metrics collection within Riak and Riak Core, we must verify that Exometer's performance overhead is less than or equal to Folsom's. We also need to understand how Exometer's overhead changes as load increases, to ensure that it remains constant. To measure the respective overheads, Riak will be configured as follows to isolate metrics collection operations from other Riak work (e.g. GETs, PUTs, DELETEs, and handoffs):

  • Ring size: 32
  • Backend: yessir
  • AAE, Search, Yokozuna, and Strong Consistency disabled

To measure overhead, one instance of Basho Bench (b_b) will execute the metrics_punisher configuration attached to this gist against a Riak cluster. A second instance of b_b will simultaneously execute the stats_query configuration attached to this gist, querying the cluster's stats endpoint. Based on discussions with Russell and MvM, it is critical that these b_b instances run on separate, dedicated hosts, and that the Riak cluster runs on dedicated hosts separate from the b_b instances.

The initial round of testing will use a single-node Riak cluster and compare the following Riak versions:

  • 2.0.0: Riak 2.0.0
  • 2.0.0-nullified: Riak 2.0.0 with the riak_kv_stats:update function modified to return ok rather than call into riak_core stats
  • feuer-exometer2: Riak branch with Exometer integration
  • feuer-exometer2-nullified: Riak branch with Exometer integration and the riak_kv_stats:update function modified to return ok rather than call into riak_core stats

For each of these versions, scenarios with 0, 1, 2, 5, 10, and 50 stats clients will be executed. The nullified instances will provide baseline values for determining the overhead of the respective stats subsystem. It is expected that the performance of the Exometer branch will be practically identical to that of 2.0.0, and that Exometer's overhead will be less than or equal to Folsom's overhead in 2.0.0. During all test runs, message queue lengths will be monitored through etop to ensure that unbounded growth does not occur.
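One way to turn the baseline comparison into a number is to compute the mean throughput of each run and express the instrumented run as a percentage drop from its nullified counterpart. The sketch below assumes each run directory contains a Basho Bench summary.csv with a header row and the columns elapsed, window, total, successful, failed. The function names mean_throughput and overhead_pct are illustrative and are not part of the attached script.

```shell
# Mean successful operations per second for one run.
# Assumes summary.csv columns: elapsed, window, total, successful, failed.
mean_throughput() {
  awk -F',' 'NR > 1 { ops += $4; secs += $2 }
             END { if (secs > 0) printf "%.2f\n", ops / secs; else print "0.00" }' "$1"
}

# Percentage of throughput lost relative to the nullified baseline.
# Usage: overhead_pct BASELINE_SUMMARY INSTRUMENTED_SUMMARY
overhead_pct() {
  base_tp=$(mean_throughput "$1")
  inst_tp=$(mean_throughput "$2")
  awk -v b="$base_tp" -v i="$inst_tp" \
    'BEGIN { if (b > 0) printf "%.2f\n", (b - i) / b * 100; else print "nan" }'
}
```

For example, overhead_pct tests/RUN-nullified-folsom-10/summary.csv tests/RUN-folsom-10/summary.csv would report the share of throughput attributable to Folsom at 10 stats clients, where RUN stands for the session identifier. A single mean hides variance between reporting windows, so comparing latency percentiles as well is probably worthwhile.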

The shell script (perf-test.sh) that implements this process is attached to this gist.

#!/bin/bash
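# The punisher b_b instance runs on the current host; the stats b_b instance
# and the Riak node run on the remote hosts named below.
INITIAL_DIR=$(pwd)
STATS_BENCH_HOST="r2s28"
RIAK_HOST="r2s29"
BENCH_HOME="$HOME/projects/basho_bench"
BENCH_CONF_DIR="$HOME/bench-conf"
RIAK_BASE="/usr/local/libexec/riak-instances"
ERLANG_PATH_CMD="export PATH=~/erlang/r16b02/bin:\$PATH"
SESSION_ID=$(date +%Y%m%d-%H%M%S)
LOG_DIR="$INITIAL_DIR/logs/$SESSION_ID"

# Prompt for the sudo password without echoing it to the terminal.
read -r -s -p "sudo password: " PASSWORD
echo
# The password is piped to sudo on the remote host (-S reads it from stdin,
# -k ignores any cached credentials).
SUDO_PREFIX="echo $PASSWORD | sudo -kS"
mkdir -p "$LOG_DIR"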
for COUNT in 0 1 2 5 10 50; do
  for SCENARIO in "nullified-exometer" "nullified-folsom" "folsom" "exometer"; do
    echo "Testing $COUNT stats clients on the $SCENARIO cluster ..."
    RIAK_HOME="$RIAK_BASE/$SCENARIO/dev/dev1"
    RIAK_BIN_DIR="$RIAK_HOME/bin"
    RIAK_DATA_DIR="$RIAK_HOME/data"
    RUN_NAME="$SESSION_ID-$SCENARIO-$COUNT"

    # Start every run from an empty data directory and a freshly started node.
    echo "Clearing data for Riak cluster $SCENARIO on $RIAK_HOST"
    ssh "$RIAK_HOST" "$SUDO_PREFIX rm -rf $RIAK_DATA_DIR"
    echo "Starting Riak cluster $SCENARIO on $RIAK_HOST"
    ssh "$RIAK_HOST" "$ERLANG_PATH_CMD ; $SUDO_PREFIX killall -9 beam.smp ; cd $RIAK_BIN_DIR ; $SUDO_PREFIX ./riak start ; $SUDO_PREFIX ./riak-admin wait-for-service riak_kv"

    # The zero-client scenario runs the punisher alone, with no stats queries.
    if [ "$COUNT" -ne 0 ]; then
      echo "Starting stats Basho Bench on $STATS_BENCH_HOST ..."
      ssh "$STATS_BENCH_HOST" "$ERLANG_PATH_CMD ; cd $BENCH_HOME ; ./basho_bench $BENCH_CONF_DIR/stats_query-$COUNT.conf" > "$LOG_DIR/stats-$RUN_NAME.log" 2>&1 &
      BENCH_PIDS="$!"
    fi

    echo "Starting punisher bench using $BENCH_HOME/basho_bench on localhost ..."
    cd "$BENCH_HOME"
    ./basho_bench -n "$RUN_NAME" "$BENCH_CONF_DIR/single-node-punisher.conf" > "$LOG_DIR/punisher-$RUN_NAME.log" 2>&1 &
    BENCH_PIDS="$! $BENCH_PIDS"

    # BENCH_PIDS is left unquoted so that each pid is passed as a separate argument.
    echo "Waiting for bench runs (pids: $BENCH_PIDS) to complete"
    wait $BENCH_PIDS
    unset BENCH_PIDS
  done
done
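# Bundle the result directories of this session's punisher runs, which are
# named after the -n run name and therefore start with the session id.
cd "$BENCH_HOME/tests"
RESULTS_TARBALL=~/exometer-performance-results-$SESSION_ID.tar.gz
tar cvzf "$RESULTS_TARBALL" "$SESSION_ID"*
echo "Performance tests complete. Results collected in $RESULTS_TARBALL."
cd "$INITIAL_DIR"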
%% Basho Bench configuration for the punisher run (protocol buffers driver against the Riak host).
{mode, max}.
{duration, 5}.
{concurrent, 200}.
{driver, basho_bench_driver_riakc_pb}.
{key_generator, {int_to_bin_bigendian, {pareto_int, 25000}}}. % 250mil for ~100gig ondisk
{value_generator, {fixed_bin, 500}}.
{value_generator_source_size, 1024}.
{riakc_pb_replies, 1}.
{riakc_pb_ips, [{"r2s29", 10201}]}.
{operations, [{get, 6},{update, 1}]}.
%% Basho Bench configuration for the stats query run (raw HTTP driver, stat operation only).
{mode, {rate, 1}}.
{duration, 60}.
{concurrent, 1}.
{driver, basho_bench_driver_http_raw}.
%% Example syntax (mykeygen_seq is not defined)
%% {key_generator, {function, test, mykeygen_seq, [10000, 10, 10, 100]}}.
{value_generator, {fixed_bin, 10000}}.
%% Example syntax (mysearchgen is not defined)
%% {http_search_generator, {function, test, mysearchgen, []}}.
{http_raw_request_timeout, 600000}.
{http_raw_ips, ["127.0.0.1"]}.
{http_raw_port, 8091}.
{operations, [{stat, 1}]}.
{source_dir, "foo"}.
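The test script looks for a file named stats_query-$COUNT.conf for each non-zero client count. Below is a minimal sketch for producing those files from a single template. It assumes the per-count files differ only in the concurrent setting, which the gist does not confirm, and that the template is saved as stats_query.conf. The function name generate_stats_confs and the template name are hypothetical.

```shell
# Writes stats_query-N.conf for each requested client count by rewriting the
# {concurrent, _}. line of the template; all other lines are copied unchanged.
# Usage: generate_stats_confs TEMPLATE CONF_DIR COUNT...
generate_stats_confs() {
  template=$1
  conf_dir=$2
  shift 2
  for count in "$@"; do
    sed "s/^{concurrent, *[0-9][0-9]*}\./{concurrent, $count}./" "$template" \
      > "$conf_dir/stats_query-$count.conf"
  done
}
```

A call such as generate_stats_confs stats_query.conf "$HOME/bench-conf" 1 2 5 10 50 would cover the non-zero counts the script iterates over. The files would still need to be present on the stats bench host, since the script runs that b_b instance over ssh.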