William Berkeley wdberkeley

@wdberkeley
wdberkeley / ksck_no_tables_checksum.json
Created May 2, 2018 23:20
ksck json no tables checksum
$ bin/kudu cluster ksck -checksum_scan -checksum_snapshot=false -ksck_format=json localhost:7051,localhost:7052,localhost:7053
{
  "master_health": [
    {
      "uuid": "c78e5cede62f48bd9975cc27e0bd4870",
      "address": "localhost:7051",
      "health": "HEALTHY",
      "status": "OK"
    },
    {
wdberkeley / ksck_no_tables.json
Created May 2, 2018 23:19
ksck json no tables
$ bin/kudu cluster ksck -ksck_format=json localhost:7051,localhost:7052,localhost:7053
{
  "master_health": [
    {
      "uuid": "c78e5cede62f48bd9975cc27e0bd4870",
      "address": "localhost:7051",
      "health": "HEALTHY",
      "status": "OK"
    },
    {
wdberkeley / ksck_no_tables.json
Created May 2, 2018 23:18
ksck json no tables
{
  "master_health": [
    {
      "uuid": "c78e5cede62f48bd9975cc27e0bd4870",
      "address": "localhost:7051",
      "health": "HEALTHY",
      "status": "OK"
    },
    {
      "uuid": "7b2c2970d9fa482c8ac7c06546b30323",
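The JSON previews above are cut off, but the visible part shows a top-level "master_health" array whose entries carry "uuid", "address", "health", and "status". A minimal Python sketch of how such a report might be scanned for masters that are not HEALTHY follows. The helper name `unhealthy_masters`, the second sample entry, and its "SOME_OTHER_STATE" value are made up for illustration and are not taken from real ksck output.

```python
import json


def unhealthy_masters(ksck_report):
    """Return (address, health, status) for each master whose health is not HEALTHY.

    Only the fields visible in the truncated previews are used; a missing
    "master_health" key is treated as an empty list.
    """
    return [
        (m.get("address"), m.get("health"), m.get("status"))
        for m in ksck_report.get("master_health", [])
        if m.get("health") != "HEALTHY"
    ]


# The first entry is copied from the preview; the second is hypothetical.
sample = json.loads("""
{
  "master_health": [
    {
      "uuid": "c78e5cede62f48bd9975cc27e0bd4870",
      "address": "localhost:7051",
      "health": "HEALTHY",
      "status": "OK"
    },
    {
      "uuid": "hypothetical-uuid",
      "address": "localhost:7052",
      "health": "SOME_OTHER_STATE",
      "status": "hypothetical error"
    }
  ]
}
""")

print(unhealthy_masters(sample))
```

In practice the report would presumably come from the output of the `kudu cluster ksck -ksck_format=json` command shown above rather than from an inline string.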
wdberkeley / rb_sim.py
Created April 11, 2018 18:41
Rudimentary balancing sim
import argparse
import random
# Number of pool slots per TS.
POOL_SLOTS_PER_TS = 10
# Maximum number of time steps it takes a move to complete.
MAX_MOVE_STEPS = 5
# Event types.
wdberkeley / rb_sim.py
Created April 11, 2018 08:33
Rudimentary balancing simulation
# Return the table skew of 'table'.
# Table skew is (# replicas on the TS with the most replicas) - (# replicas on
# the TS with the fewest replicas).
# As a side effect, this sorts 'table' in place by # of replicas, increasing.
def table_skew(table):
    table.sort()
    return table[-1] - table[0]
# Return the next move that should be done to balance table, encoded as (i, j)
# where i is the index of the TS to move from and j is the index of the TS to
# move to.
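The preview stops before the move-selection function, so its real logic is not visible here. As a rough sketch of what the comment describes, the code below assumes a greedy rule: move one replica from the most loaded TS to the least loaded one until the skew is at most 1. The names `next_move_sketch` and `balance` and the stopping rule are assumptions, not the gist's actual implementation.

```python
def table_skew(table):
    # Sorts 'table' in place (increasing) and returns max - min.
    table.sort()
    return table[-1] - table[0]


def next_move_sketch(table):
    """Hypothetical move picker: return (i, j) meaning "move one replica from
    the TS at index i to the TS at index j", or None if the skew is <= 1.

    table_skew() sorts the list, so afterwards the most loaded TS is at the
    last index and the least loaded TS is at index 0.
    """
    if len(table) < 2 or table_skew(table) <= 1:
        return None
    return (len(table) - 1, 0)


def balance(table):
    """Apply moves until next_move_sketch() returns None; return the move count."""
    moves = 0
    while True:
        move = next_move_sketch(table)
        if move is None:
            return moves
        i, j = move
        table[i] -= 1
        table[j] += 1
        moves += 1


table = [5, 1, 3]  # replicas per tablet server
moves = balance(table)
print(moves, table)
```

Each move lowers the maximum or raises the minimum, so the loop terminates once no TS holds more than one replica above any other.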
wdberkeley / regex_version_example.cc
Created February 22, 2018 19:51
c++11 regex and version stripping
#include <regex>
#include <iostream>
#include <string>
using namespace std;
int main() {
  string input;
  // Escape the dots so they match a literal '.' rather than any character.
  regex integer("([[:digit:]]+\\.[[:digit:]]+\\.[[:digit:]]+)(-.*)?");
  smatch matches;
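The rest of the C++ example is not shown, so how the match is used is unknown. The Python sketch below illustrates the same version-stripping idea under two assumptions: the dots between the numeric components are literal separators, and the whole string must match (comparable to `std::regex_match`). The function name `strip_version_suffix` is hypothetical.

```python
import re

# major.minor.patch, optionally followed by a "-suffix" such as "-SNAPSHOT".
VERSION_RE = re.compile(r"([0-9]+\.[0-9]+\.[0-9]+)(-.*)?")


def strip_version_suffix(version):
    """Return the major.minor.patch part of 'version', or None if it does not match."""
    m = VERSION_RE.fullmatch(version)
    return m.group(1) if m else None


for candidate in ["1.6.0-SNAPSHOT", "1.6.0", "not-a-version"]:
    print(candidate, "->", strip_version_suffix(candidate))
```

Using `fullmatch` rejects strings with leading or trailing junk; `search` would be the looser choice if the version can appear inside a longer string.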
wdberkeley / gist:b9e53ac42dee56f0b4c1b38b470f7197
Last active January 11, 2018 22:58
kudu-spark authn issue repro
1. Disable authentication
--rpc_authentication=disabled
--rpc_encryption=disabled
2. Create a table
> kudu perf loadgen --keep_auto_table --table_num_replicas=3 kudu-master
Using auto-created table 'loadgen_auto_6579615914f7441b808db3a03ae88aae'
wdberkeley / gist:b485604f66de2da7f80a692bc4dcdcc4
Created January 2, 2018 18:20
How to get info about on-disk columns
1. Get a tablet id for a tablet from the table you're interested in.
The easiest way to do this is usually to use the /tablets endpoint of the web UI.
2. Run the following command to find the blocks for this tablet stored on the tablet server:
$ kudu local_replica dump rowset --fs_wal_dir=<wal dir> --fs_data_dirs=<data dirs> -metadata-only <tablet id>
You'll see something like:
Dumping rowset 0
----------------------------------------------------------------------
wdberkeley / scaling.adoc
Created December 5, 2017 19:55
Apache Kudu Scaling Doc

Apache Kudu Scaling

wdberkeley / gist:6283a7a7a0c6aee42e7252f120d03ac4
Created December 5, 2017 19:08
Apache Kudu 1.6.0-RC1 TabletCopyClientTest.TestDownloadBlock Failure
[ RUN ] TabletCopyClientTest.TestDownloadBlock
W1205 10:17:49.985105 1999118336 system_unsync_time.cc:38] NTP support is disabled. Clock error bounds will not be accurate. This configuration is not suitable for distributed clusters.
I1205 10:17:49.985236 1999118336 server_base.cc:229] Could not load existing FS layout: Not found: /private/tmp/kudutest-502/tablet_copy_client-test.TabletCopyClientTest.TestDownloadBlock.1512497869518041-53106/TabletServerTest-fsroot/data-0/instance: No such file or directory (error 2)
I1205 10:17:49.985250 1999118336 server_base.cc:230] Creating new FS layout
I1205 10:17:49.991677 1999118336 fs_manager.cc:567] Generated new instance metadata in path /private/tmp/kudutest-502/tablet_copy_client-test.TabletCopyClientTest.TestDownloadBlock.1512497869518041-53106/TabletServerTest-fsroot/data-0/instance:
uuid: "c0b5ced5c9ca4d75819b8854b477855e"
format_stamp: "Formatted at 2017-12-05 18:17:49 on dhcp-10-16-0-202.pa.cloudera.com"
I1205 10:17:49.991961 1999118336 fs_manager.cc:567]