Main thread: https://www.cprogramming.com/debugging/segfaults.html
Steps:
- Run Docker in privileged mode:
sudo nvidia-docker run -it --rm --privileged --net=host -v $PWD:/paddle -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY paddle:dev bash
- Remount the root file system as read-write:
mount -o remount,rw /
- Write core dumps to the current directory, named core.<executable>.<pid>:
bash -c 'echo core.%e.%p > /proc/sys/kernel/core_pattern'
- Install gdb:
apt-get install gdb
- Run the executable that crashes with a core dump:
./op_registry_test
- Look for the core dump file. In my case (the kernel truncates the %e executable name to 15 characters):
core.op_registry_tes.72
- Use gdb to examine the core file:
gdb op_registry_test core.op_registry_tes.72
- Inside gdb:
(gdb) backtrace
#0 0x0000000000d269a0 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1 0x0000000000d26d6c in google::LogMessage::LogMessage(char const*, int) ()
#2 0x0000000000d2501d in paddle::fluid::framework::Accelerator::Accelerator (this=0x7ffffc057ef0, type=0xdc6f1b "CPU")
at /paddle/experimental/fluid/paddle/fluid/framework/accelerator.cc:38
#3 0x0000000000ac4352 in paddle::fluid::framework::OpKernelRegistrarFunctor<paddle::fluid::platform::CPUPlace, false, 0ul, paddle::fluid::framework::OpKernelTest<paddle::fluid::platform::CPUDeviceContext, float> >::operator() (
this=0x7ffffc057ff7, op_type=0xdc6f0c "op_with_kernel", accelerator=0xdc6f1b "CPU")
at /paddle/experimental/fluid/paddle/fluid/framework/op_registry.h:94
#4 0x0000000000ac159e in paddle::fluid::framework::OpKernelRegistrar<paddle::fluid::platform::CPUPlace, paddle::fluid::framework::OpKernelTest<paddle::fluid::platform::CPUDeviceContext, float> >::OpKernelRegistrar (
this=0x124c05c <__op_kernel_registrar_op_with_kernel_CPU__>, op_type=0xdc6f0c "op_with_kernel",
accelerator=0xdc6f1b "CPU") at /paddle/experimental/fluid/paddle/fluid/framework/op_registry.h:116
#5 0x0000000000ab8b11 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
at /paddle/experimental/fluid/paddle/fluid/framework/op_registry_test.cc:245
#6 0x0000000000ab8c72 in _GLOBAL__sub_I_op_registry_test.cc(void) ()
at /paddle/experimental/fluid/paddle/fluid/framework/op_registry_test.cc:247
#7 0x0000000000dc5f5d in __libc_csu_init ()
#8 0x00007f950f80b7bf in __libc_start_main (main=0xab42e0 <main>, argc=1, argv=0x7ffffc0581c8,
init=0xdc5f10 <__libc_csu_init>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffffc0581b8)
at ../csu/libc-start.c:247
#9 0x0000000000ab5f49 in _start ()
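
The in-container steps above can be wrapped in a couple of shell helpers. This is only a sketch: the function names core_name_for and run_and_backtrace are made up for illustration, it assumes the core_pattern has already been set to core.%e.%p as in the steps, and it assumes the shell's core size limit may be 0 (hence the ulimit call, which the original steps did not need).

```shell
# Hypothetical helpers; not part of Paddle or gdb.

# Predict the core file name written under the pattern core.%e.%p.
# The kernel truncates %e (the executable name) to 15 characters,
# which is why op_registry_test shows up as op_registry_tes.
core_name_for() {
  cnf_name="$(basename "$1" | cut -c1-15)"
  printf 'core.%s.%s\n' "$cnf_name" "$2"
}

# Run a crashing binary with core dumps enabled, then print the
# backtrace from the newest matching core file in the current directory.
run_and_backtrace() {
  rab_exe="$1"
  rab_prefix="core.$(basename "$rab_exe" | cut -c1-15)"

  # Assumption: the core size limit may default to 0, so raise it.
  # The subshell keeps the limit change local to this run.
  ( ulimit -c unlimited; "$rab_exe" )

  rab_core="$(ls -t "$rab_prefix".* 2>/dev/null | head -n 1)"
  if [ -z "$rab_core" ]; then
    echo "no core file matching $rab_prefix.* found" >&2
    return 1
  fi

  # -batch exits after running the commands given with -ex.
  gdb -batch -ex backtrace "$rab_exe" "$rab_core"
}
```

Usage, after sourcing the helpers inside the container: run_and_backtrace ./op_registry_test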