#!/bin/bash
# Save this to, say, andy-rebalance-experiment.sh
# Run `chmod +x andy-rebalance-experiment.sh` (once).
# Then start it with `./andy-rebalance-experiment.sh`.
set -euxo pipefail
export CLUSTER=andy-rebalance
roachprod create "$CLUSTER" -n 4 --clouds=aws --aws-machine-type-ssd=c5d.4xlarge
>>> roachprod stop: Wed Apr 3 06:15:02 UTC 2019
PID COMMAND
1 /sbin/init HOME=/ init=/sbin/init NETWORK_SKIP_ENSLAVED= recovery= TERM=linux drop_caps= BOOT_IMAGE=/boot/vmlinuz-4.15.0-1026-gcp PATH=/sbin:/usr/sbin:/bin:/usr/bin PWD=/ rootmnt=/root
2 [kthreadd]
3 [kworker/0:0]
4 [kworker/0:0H]
5 [kworker/u8:0]
6 [mm_percpu_wq]
7 [ksoftirqd/0]
8 [rcu_sched]
digraph G {
"Start" -> "Dead Node"
"Start" -> "Unresponsive Node"
"Dead Node" -> "OOM"
"OOM" -> "heap_profiler"
"OOM" -> "goroutine_dump"
"OOM" -> "dmesg"
"OOM" -> "log messages"
"Dead Node" -> "Fatal error"
cr.store.totalbytes 1
1553683040000000000 8520
1553683050000000000 65054
1553683060000000000 93149
...
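The samples above are easier to compare when the timestamps are shown relative to the first sample. The following is a minimal sketch, assuming the first column is a Unix timestamp in nanoseconds and the second column is the metric value; `tsdump_elapsed` is a hypothetical helper name, not part of any existing tool.

```bash
#!/bin/bash
# Hypothetical helper: reads a dump in the format above from stdin and prints
# "<seconds since first sample> <value>" for each sample line.
# Assumption: column 1 is nanoseconds since the Unix epoch.
tsdump_elapsed() {
  local first="" ts value
  while read -r ts value; do
    # Skip the header line, the "..." elision, and anything else non-numeric.
    [[ "$ts" =~ ^[0-9]+$ && "$value" =~ ^[0-9]+$ ]] || continue
    # Remember the first timestamp as the reference point.
    [[ -n "$first" ]] || first="$ts"
    # Bash arithmetic is 64-bit, so nanosecond timestamps fit without rounding.
    printf '%d %s\n' $(( (ts - first) / 1000000000 )) "$value"
  done
}
```

It could be used as `tsdump_elapsed < dump.txt`, where `dump.txt` is a placeholder for wherever the dump was saved. For the three samples shown, the elapsed column would read 0, 10, and 20.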
07:31:21 cluster.go:252: > /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod create teamcity-1194326-election-after-restart -n 3 --gce-machine-type=n1-standard-4 --local-ssd-no-ext4-barrier
Creating cluster teamcity-1194326-election-after-restart with 3 nodes
teamcity-1194326-election-after-restart: [gce] 12h28m4s remaining
teamcity-1194326-election-after-restart-0001 teamcity-1194326-election-after-restart-0001.us-east1-b.cockroach-ephemeral 10.142.0.80 35.237.52.36
teamcity-1194326-election-after-restart-0002 teamcity-1194326-election-after-restart-0002.us-east1-b.cockroach-ephemeral 10.142.0.44 35.196.152.243
teamcity-1194326-election-after-restart-0003 teamcity-1194326-election-after-restart-0003.us-east1-b.cockroach-ephemeral 10.142.0.23 34.73.18.251
Syncing...
failed to update roachprod.crdb.io DNS: Command: gcloud [--project cockroach-shared dns record-sets import -z roachprod --delete-all-existing --zone-file-format /root/.roachprod/dns.bind932780071]
Output: ERROR: (gcloud
(lldb) thread backtrace all
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff75e2d1b2 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fff75ee65fe libsystem_pthread.dylib`_pthread_cond_wait + 775
    frame #2: 0x0000000004060754 cockroach`runtime.pthread_cond_timedwait_relative_np_trampoline + 20
    frame #3: 0x000000000405e180 cockroach`runtime.asmcgocall + 112
    frame #4: 0x000000000404e06b cockroach`runtime.pthread_cond_timedwait_relative_np + 59
    frame #5: 0x000000000402d49d cockroach`runtime.semasleep + 269
    frame #6: 0x000000000400ca5d cockroach`runtime.notetsleep_internal + 269
    frame #7: 0x000000000400ccc1 cockroach`runtime.notetsleepg + 97
Recap: n3 had good stats and crashed because it talked to n2, which has bad stats. n4 is still around and has good stats.
n2 is andy-72:6
n3 is andy-72:3
# n2
~/cockroach debug rocksdb --hex query --db=./cockroach
get 0x0169f70150727261736b00
0x0169F70150727261736B00 ==> 0x120408001000180020002800325B6ACFE8BF03083B101C1A50096C5D564CA220891520A39895FDFFFFFFFFFF0128B2E6FBFFFFFFFFFFFF013086E6ECFEFFFFFFFFFF0138B2E6FBFFFFFFFFFFFF01409DB2A8FEFFFFFFFFFF0148B2E6FBFFFFFFFFFFFF0160B7066805
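When comparing the same key across nodes, it can help to see which bytes of the hex key are readable ASCII. The sketch below renders printable bytes as characters and everything else as `\xNN` escapes; `render_hex_key` is a hypothetical helper name, and the sketch makes no claim about how CockroachDB itself encodes or pretty-prints keys.

```bash
#!/bin/bash
# Hypothetical helper: render a hex key (with or without a leading 0x) so that
# printable ASCII bytes appear as characters and all other bytes as \xNN.
render_hex_key() {
  local hex="${1#0x}" out="" byte code i
  hex="${hex#0X}"
  # Walk the string two hex digits (one byte) at a time.
  for (( i = 0; i + 1 < ${#hex}; i += 2 )); do
    byte="${hex:i:2}"
    code=$(( 16#$byte ))
    if (( code >= 32 && code < 127 )); then
      # %b expands the \xHH escape into the corresponding character.
      out+="$(printf '%b' "\\x$byte")"
    else
      out+="\\x$byte"
    fi
  done
  printf '%s\n' "$out"
}
```

For the key queried above, `render_hex_key 0x0169f70150727261736b00` prints `\x01i\xf7\x01Prrask\x00`. The readable `rrask` tail looks like a key suffix, which would make it easier to confirm that the same key is being read on n2, n3, and n4; treat that reading as an assumption.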
Routine
510: select [0~2 minutes] [Created by internal.go:206 (*internalExecutorImpl).initConnEx({[] [] false})]
Stack
- replica_range_lease.go:911 (*Replica).redirectOnOrAcquireLease.func2({[{824884140400 *} {824636858368 #402} {122128928 #92} {824888890480 *} {824784257152 *} {824835119768 *} {824835119768 *}] [] false})
- replica_range_lease.go:967 (*Replica).redirectOnOrAcquireLease({[{824636858368 #402} {122128928 #92} {824888890480 *} {0 } {0 } {0 } {0 } {0 } {0 } {0 }] [] true})
- replica_read.go:40 (*Replica).executeReadOnlyBatch({[{824636858368 #402} {122128928 #92} {824888890480 *} {1551788088188014000 *} {0 } {4294967297 #4} {1 } {6 } {0 } {824876587776 *}] [] true})
- replica.go:517 (*Replica).sendWithRangeID({[{824636858368 #402} {122128928 #92} {824888890480 *} {6 } {1551788088188014000 *} {0 } {4294967297 #4} {1 } {6 } {0 }] [] true})
- replica.go:462 (*Replica).Send({[{824636858368 #402} {122128928 #92} {824888890432 *} {1551788088188014000 *} {0 } {4294967297 #4} {1 } {6 } {0 } {824876587776
3 runs so far, 0 failures, over 5s
3 runs so far, 0 failures, over 10s
3 runs so far, 0 failures, over 15s
3 runs so far, 0 failures, over 20s
3 runs so far, 0 failures, over 25s
3 runs so far, 0 failures, over 30s
3 runs so far, 0 failures, over 35s
3 runs so far, 0 failures, over 40s
3 runs so far, 0 failures, over 45s
3 runs so far, 0 failures, over 50s
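In the progress lines above the run count stays at 3 while the elapsed time keeps growing, which suggests the runs in flight are not finishing. A minimal sketch for spotting this in a long log follows, assuming every progress line has exactly the shape shown; `stress_summary` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: summarize progress lines read from stdin.
# Prints the latest run count, failure count, and elapsed time, plus the
# elapsed time at which the current run count was first reported.
stress_summary() {
  local re='^([0-9]+) runs so far, ([0-9]+) failures, over ([0-9a-z.]+)'
  local line runs="" fails="" elapsed="" since=""
  while read -r line; do
    # Ignore anything that is not a progress line.
    [[ "$line" =~ $re ]] || continue
    # A changed run count means progress; note when this count first appeared.
    if [[ "${BASH_REMATCH[1]}" != "$runs" ]]; then
      since="${BASH_REMATCH[3]}"
    fi
    runs="${BASH_REMATCH[1]}"
    fails="${BASH_REMATCH[2]}"
    elapsed="${BASH_REMATCH[3]}"
  done
  if [[ -n "$runs" ]]; then
    printf 'runs=%s failures=%s elapsed=%s count_unchanged_since=%s\n' \
      "$runs" "$fails" "$elapsed" "$since"
  fi
}
```

Fed the ten lines above, it would report `runs=3 failures=0 elapsed=50s count_unchanged_since=5s`. A large gap between `elapsed` and `count_unchanged_since` is only a hint that something is stuck, not proof.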
SIGABRT: abort
PC=0x7fff731101b2 m=0 sigcode=0

goroutine 0 [idle]:
runtime.pthread_cond_wait(0x857f580, 0x857f540, 0x0)
	/usr/local/Cellar/go/1.11.1/libexec/src/runtime/sys_darwin.go:302 +0x51
runtime.semasleep(0xffffffffffffffff, 0x857f200)
	/usr/local/Cellar/go/1.11.1/libexec/src/runtime/os_darwin.go:63 +0x85
runtime.notesleep(0x857f340)
	/usr/local/Cellar/go/1.11.1/libexec/src/runtime/lock_sema.go:167 +0xe3