TraceEvent can cause OOMs in simulation #2218
-
-
Save jzhou77/5247c1dbe820d9fcebb1fd1df825b7d5 to your computer and use it in GitHub Desktop.
Docker on Mac without docker-desktop
brew install docker docker-machine docker-credential-helper docker-compose virtualbox
(make sure ~/.docker/config.json has "credsStore" : "osxkeychain"
in it)
documentation build
ninja docpreview
starts a web server at a local port: e.g.,Serving HTTP on 0.0.0.0 port 14244 (http://0.0.0.0:14244/)
- Since this web server is in
okteto
, need to do port fowarding:
$ ssh -L 14244:localhost:14244 jzhou-dev.okteto
Or
$ kubectl get all
$ kubectl port-forward replicaset.apps/jzhou-dev-6df7457774 14244 14244
- Navigate to
http://localhost:14244/performance.html
.
(1) Install gdbgui using pip install gdbgui
(2) Forward the port used by gdbserver or gdbgui in Okteto environment to your local machine: kubectl port-forward pod/vishesh-dev-6d4f39f78-ngs8f 5000:5000
TraceEvent("TLogInitCommit", logData->logId).log();
wait(ioTimeoutError(self->persistentData->commit(), SERVER_KNOBS->TLOG_MAX_CREATE_DURATION));
217.908010 Role ID=f70414b30d93c824 As=TLog Transition=Begin Origination=Recovered OnWorker=3135ceefe717ee33 SharedTLog=853fab151d4a9085 GP,LR,MS,SS,TL
217.908010 TLogRejoining ID=f70414b30d93c824 ClusterController=50cd94d391dd1f0b DbInfoMasterLifeTime=50cd94d391dd1f0b#4 LastMasterLifeTime=0000000000000000#0 GP,LR,MS,SS,TL
217.908010 TLogStart ID=f70414b30d93c824 RecoveryCount=28 GP,LR,MS,SS,TL
217.908010 TLogInitCommit ID=f70414b30d93c824 GP,LR,MS,SS,TL
227.908010 IoTimeoutError ID=0000000000000000 Error=io_timeout ErrorDescription=A disk IO operation failed to complete in a timely manner ErrorCode=1521 Duration=10 BT=addr2line -e fdbserver.debug -p -C -f -i 0x1d9102b 0x249cfbb 0x249c324 0x2472a61 0x2479a69 0x24c2cac 0x2631811 0x261bb4d 0x13b1f2c 0x360e31b 0x360df73 0x11c6198 0x36dfda6 0x36dfc18 0x1b9d8c5 0x7f719aff6555 Backtrace=addr2line -e fdbserver.debug -p -C -f -i 0x3823549 0x3823801 0x381eb04 0x1d710d3 0x1d70d57 0x387bfb8 0x387bdde 0x11c6198 0x36dfda6 0x36dfc18 0x1b9d8c5 0x7f719aff6555 GP,LR,MS,SS,TL
227.908010 TLogError ID=853fab151d4a9085 Error=io_timeout ErrorDescription=A disk IO operation failed to complete in a timely manner ErrorCode=1521 GP,LR,MS,SS,TL
227.908010 Role ID=f70414b30d93c824 Transition=End As=TLog Reason=Error GP,LR,MS,SS,TL
227.908010 RoleRemove ID=0000000000000000 Address=2.0.1.2:1 Role=TLog NumRoles=7 Value=1 Result=Decremented Role GP,LR,MS,SS,TL
227.908010 Role ID=7fb6adb922290f7e Transition=End As=TLog Reason=Error GP,LR,MS,SS,TL
227.908010 RoleRemove ID=0000000000000000 Address=2.0.1.2:1 Role=TLog NumRoles=6 Value=0 Result=Removed Role GP,LR,MS,SS,TL
Anti-quorum is not used, because if you let a tlog fall behind other tlogs, all of the storage servers that read from that tlog will also fall behind, and then you will get future version errors from clients trying to read from those storage servers