Tobias Grieger tbg
add-nodes 2 voters=(1,2,3)
----
INFO 1 switched to configuration voters=(1)
INFO 1 switched to configuration voters=(1 2)
INFO 1 switched to configuration voters=(1 2 3)
INFO 1 became follower at term 0
INFO newRaft 1 [peers: [1,2,3], term: 0, commit: 2, applied: 2, lastindex: 2, lastterm: 1]
INFO 2 switched to configuration voters=(1)
INFO 2 switched to configuration voters=(1 2)
INFO 2 switched to configuration voters=(1 2 3)
=== RUN Example_node
W190617 21:07:48.618723 1 server/status/runtime.go:310 [n?] Could not parse build timestamp: parsing time "" as "2006/01/02 15:04:05": cannot parse "" as "2006"
W190617 21:07:49.275779 94259 storage/store.go:1532 [n1,s1,r6/1:/Table/{SystemCon…-11}] could not gossip system config: [NotLeaseHolderError] r6: replica (n1,s1):1 not lease holder; lease holder unknown
W190617 21:07:49.335282 94259 storage/store.go:1532 [n1,s1,r6/1:/Table/{SystemCon…-11}] could not gossip system config: [NotLeaseHolderError] r6: replica (n1,s1):1 not lease holder; lease holder unknown
W190617 21:07:49.431817 94259 storage/store.go:1532 [n1,s1,r6/1:/Table/{SystemCon…-11}] could not gossip system config: [NotLeaseHolderError] r6: replica (n1,s1):1 not lease holder; lease holder unknown
W190617 21:07:51.624489 94218 storage/store.go:3740 [n1,s1,r12/1:/Table/1{6-7}] handle raft ready: 0.6s [processed=1]
W190617 21:08:21.631685 94074 storage/closedts/provider/provider.go:152 [ct-closer] unable to move cl
diff --git a/pkg/server/updates.go b/pkg/server/updates.go
index 8275b361a3..c9afaa0304 100644
--- a/pkg/server/updates.go
+++ b/pkg/server/updates.go
@@ -454,19 +454,29 @@ func (s *Server) reportDiagnostics(ctx context.Context) {
return
}
addInfoToURL(ctx, reportingURL, s, report.Node)
- res, err := http.Post(reportingURL.String(), "application/x-protobuf", bytes.NewReader(b))
+
#37250:
F190524 23:53:29.817116 24309 kv/txn_coord_sender.go:913 [n1,client=10.128.0.82:40908,user=root] unexpected txn state: "sql txn" id=ca81044b key=/Table/54/1/804/0 rw=true pri=0.02619792 stat=COMMITTED epo=1 ts=1558742002.436278411,0 orig=1558742002.436278411,0 max=1558742002.936278411,0 wto=false seq=7 int=6
goroutine 24309 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0xc000057b01, 0xc000057b60, 0x557b400, 0x16)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:1020 +0xd4
github.com/cockroachdb/cockroach/pkg/util/log.(*loggingT).outputLogEntry(0x5d0a200, 0xc000000004, 0x557b485, 0x16, 0x391, 0xc0089f8400, 0xfc)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:878 +0x93d
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x3b6b180, 0xc0293c5b30, 0x4, 0x2, 0x3404e91, 0x18, 0xc0476450c0, 0x1, 0x1)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:85 +0x2d8
I190522 00:41:14.287997 1 rand.go:87 Random seed: 2857119843189535667
=== RUN TestChangefeedDataTTL
=== RUN TestChangefeedDataTTL/sinkless
W190522 00:41:14.295254 15 server/status/runtime.go:322 [n?] Could not parse build timestamp: parsing time "" as "2006/01/02 15:04:05": cannot parse "" as "2006"
I190522 00:41:14.304266 15 server/server.go:888 [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I190522 00:41:14.304507 15 base/addr_validation.go:279 [n?] server certificate addresses: IP=127.0.0.1,::1; DNS=localhost,*.local; CN=node
I190522 00:41:14.304543 15 base/addr_validation.go:319 [n?] web UI certificate addresses: IP=127.0.0.1,::1; DNS=localhost,*.local; CN=node
I190522 00:41:14.307296 15 server/config.go:506 [n?] 1 storage engine initialized
I190522 00:41:14.307323 15 server/config.go:509 [n?] RocksDB cache size: 128 MiB
I190522 00:41:14.307334 15 server/config.go:509 [n?] store 0: in-memory, size 0 B
|--n1/r1------|    |--n2/r2--------|
 (x, t1) |-> A                          write received from n1/r1
                     (y, t1) |-> B      write received from n2/r2

                       t1 closed        must continue buffering b/c n1
                                           may see writes for t1
     t1 closed                          can stop buffering (assuming the
                                        watched keyspace is not larger
                                        than the two ranges)
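
The buffering rule in the diagram above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the `frontier` type, its methods, and the integer timestamps are hypothetical and are not taken from the CockroachDB code base. The idea is that a consumer may stop buffering writes at `t1` only once every watched range has closed `t1`, which means tracking the minimum closed timestamp across ranges.

```go
package main

import "fmt"

// frontier tracks, per watched range, the highest timestamp that the range
// has reported as closed. All names are hypothetical, for illustration only.
type frontier struct {
	closed map[string]int64
}

// newFrontier starts every watched range at closed timestamp 0.
func newFrontier(ranges ...string) *frontier {
	f := &frontier{closed: make(map[string]int64)}
	for _, r := range ranges {
		f.closed[r] = 0
	}
	return f
}

// closeAt records that range r has closed timestamp ts.
// Closed timestamps never move backwards.
func (f *frontier) closeAt(r string, ts int64) {
	if ts > f.closed[r] {
		f.closed[r] = ts
	}
}

// resolved returns the minimum closed timestamp across all watched ranges.
// Writes at or below this timestamp no longer need to be buffered.
func (f *frontier) resolved() int64 {
	first := true
	var lowest int64
	for _, ts := range f.closed {
		if first || ts < lowest {
			lowest = ts
			first = false
		}
	}
	return lowest
}

func main() {
	f := newFrontier("n1/r1", "n2/r2")

	// n2/r2 closes t1 first: n1/r1 may still deliver writes at t1.
	f.closeAt("n2/r2", 1)
	fmt.Println(f.resolved() >= 1) // false: must continue buffering

	// n1/r1 closes t1 as well: nothing at or below t1 can arrive anymore.
	f.closeAt("n1/r1", 1)
	fmt.Println(f.resolved() >= 1) // true: can stop buffering
}
```

As in the diagram, this only holds if the watched keyspace is fully covered by the ranges registered with the frontier.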

We start with an example that results in split brain. Note that this example does not apply to the implementation in etcd/raft because of implementation details that we'll discuss later; we will then present a way in which the modified example seems to apply to etcd/raft as well. For now, it is enough to imagine an implementation of Raft that fully follows the spec but happens to allow the history presented below.

The anomaly revolves around the fact that log entries may be committed without this being known to all peers (via the leader-communicated commit index).

I190430 04:44:04.301934 498 storage/store.go:4221 [n1,s1]
** Compaction Stats [default] **
Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
L0 1/0 23.80 MB 0.5 0.0 0.0 0.0 741.5 741.5 0.6 1.0 0.0 79.4 9567 32543 0.294 0 0
L2 15/0 61.50 MB 1.0 971.5 443.1 528.4 949.5 421.0 0.0 2.1 57.1 55.8 17413 9711 1.793 18G 216M
L3 23/0 87.70 MB 1.0 986.9 457.2 529.7 960.4 430.7 262.6 2.1 65.7 63.9 15391 41362 0.372 16G 184M
L4 164/0 909.52 MB 1.0 1798.7 357.6 1441.1 1785.5 344.3 336.6 5.0 65.5 65.0 28114 89925 0.313 32G 127M
L5
n1 is ok, leaseholder n4 has lost stats
andrei-1555535390-11-n8cpu4-geo-0005> I190417 22:21:41.970516 71187 storage/replica_command.go:252 [n5,s5,r869/1:/Table/5{7/1/760/…-9}] initiating a split of this range at key /Table/57/1/762/9/1509 [r871] (manual)
andrei-1555535390-11-n8cpu4-geo-0005> W190417 22:21:44.995099 71339 storage/allocator.go:639 [n5,s5,r871/1:/Table/5{7/1/762/…-9}] simulating RemoveTarget failed: must supply at least one candidate replica to allocator.RemoveTarget()
andrei-1555535390-11-n8cpu4-geo-0005> I190417 22:21:52.847210 71622 storage/replica_command.go:252 [n5,s5,r871/1:/Table/5{7/1/762/…-9}] initiating a split of this range at key /Table/57/1/770/7/339 [r875] (manual)
andrei-1555535390-11-n8cpu4-geo-0005> I190417 22:21:55.313645 71705 storage/store_snapshot.go:775 [n5,s5,r871/1:/Table/57/1/7{62/9/1…-70/7/3…}] sending preemptive snapshot 55b4997c at applied index 17
andrei-1555535390-11-n8cpu4-geo-0005> I190417 22:21:55.314066 71705 storage/store_snapshot.go:818 [n5,s5,r871/1:/Ta
$ ./cockroach debug range-data /mnt/data1/cockroach/auxiliary/checkpoints/2019-04-17T22\:27\:25Z/ --replicated 552
0.000000000,0 /Local/RangeID/552/r/AbortSpan/"9c665b54-5d7c-4444-bebb-db302e5e4687" (0x0169f70228726162632d129c665b545d7c4444bebbdb302e5e4687000100): key:"\001k\022\276\211\367\002\213\216\367\001z\000\001rdsc" timestamp:<wall_time:1555539826208979016 logical:1 > priority:200652
0.000000000,0 /Local/RangeID/552/r/RangeLastGC (0x0169f70228726c67632d00): : EMPTY
0.000000000,0 /Local/RangeID/552/r/RangeAppliedState (0x0169f70228727261736b00): raft_applied_index:55 lease_applied_index:26 range_stats:<last_update_nanos:1555539836245201052 sys_bytes:878 sys_count:7 >
0.000000000,0 /Local/RangeID/552/r/RangeLease (0x0169f7022872726c6c2d00): repl=(n6,s6):2 seq=10 start=1555539826.208979016,0 epo=1 pro=1555539826.208983358,0
0.000000000,0 /Local/RangeID/552/r/RangeTxnSpanGCThreshold (0x0169f70228727473742d00): : EMPTY
0.000000000,0 /Local/Range/Table/54/1/651/6/378/QueueLastProcessed/"consistencyChecker