Apply the patch provided in this gist and build the installer
# Assuming Fedora, CentOS, or RHEL
# Install golang and rr
sudo dnf install golang rr -y
# Install delve
go install github.com/go-delve/delve/cmd/dlv@latest
# Clone installer
git clone https://github.com/openshift/installer
cd installer
# Apply changes to hack/build.sh and terraform/Makefile
git apply ../compile.patch
# Compile installer
./hack/build.sh
# Install rr (on every machine that will record or replay a trace)
sudo dnf install rr -y
# OR
wget https://github.com/rr-debugger/rr/releases/download/5.6.0/rr-5.6.0-Linux-$(uname -m).rpm
sudo dnf install rr-5.6.0-Linux-$(uname -m).rpm
# Required kernel settings
sudo sysctl kernel.perf_event_paranoid=-1
sudo sysctl kernel.kptr_restrict=0
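A quick way to confirm the setting took effect before recording is to read it back. The sketch below is only an illustration: the helper name `paranoid_ok` is made up here, and the threshold of 1 is an assumption based on rr's usual requirement that `perf_event_paranoid` be 1 or lower (the command above sets it to -1, which satisfies that).

```shell
# Hypothetical helper: succeeds when a perf_event_paranoid value is low
# enough for rr. Assumption: rr accepts values of 1 or lower.
paranoid_ok() {
    case "$1" in
        ''|*[!0-9-]*) return 1 ;;   # empty or not a number: treat as failure
    esac
    [ "$1" -le 1 ]
}

# Read the live value; fall back to an empty string if the file is missing.
current="$(cat /proc/sys/kernel/perf_event_paranoid 2>/dev/null || true)"
if paranoid_ok "$current"; then
    echo "perf_event_paranoid=$current: ok for rr"
else
    echo "perf_event_paranoid=${current:-unknown}: run the sysctl commands above"
fi
```

Note that `sysctl` changes made this way do not survive a reboot, so the check is worth repeating on a freshly booted machine.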
# Run openshift-install with rr record
rr record openshift-install create cluster --log-level debug
# rr pack gets the trace ready for transport
rr pack
tar -cvzf trace.tar.gz ${HOME}/.local/share/rr/
# Extract the trace on the machine that will replay it
tar -xvf trace.tar.gz
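Because the archive above is created from an absolute path, tar stores its members as `home/<user>/.local/share/rr/...` and extracts them relative to the current directory. One possible alternative, sketched below, archives the `rr` directory relative to its parent so it can be unpacked straight into the default trace location on the replay machine. The function names are hypothetical and not part of rr or the installer.

```shell
# Hypothetical helpers; the names are not part of rr or the installer.

# pack_traces SRC_PARENT TARBALL
# Archive the "rr" directory found inside SRC_PARENT using relative paths.
pack_traces() {
    tar -czf "$2" -C "$1" rr
}

# unpack_traces TARBALL DEST_PARENT
# Recreate the "rr" directory inside DEST_PARENT.
unpack_traces() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# On the recording machine (after rr pack):
#   pack_traces "${HOME}/.local/share" trace.tar.gz
# On the replay machine:
#   unpack_traces trace.tar.gz "${HOME}/.local/share"
```

With this layout the trace path passed to `dlv replay` stays the same on both machines, even when the user names differ.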
# Required kernel parameters
sudo sysctl kernel.perf_event_paranoid=-1
sudo sysctl kernel.kptr_restrict=0
# Run delve with rr backend
dlv replay --backend=rr --headless --listen=:2345 --api-version=2 --accept-multiclient ${HOME}/.local/share/rr/openshift-install-0
# Debug from workstation
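If no IDE is at hand, delve's own terminal client can attach to the headless server started above with `dlv connect`. The sketch below assumes the replay machine is reachable as `replay-host` (a placeholder hostname) on port 2345, matching the `--listen=:2345` flag. The helper only prints the command so it can be reviewed first.

```shell
# Placeholder values: replace replay-host with the machine running dlv replay.
DLV_HOST="${DLV_HOST:-replay-host}"
DLV_PORT="${DLV_PORT:-2345}"   # matches --listen=:2345 above

# Hypothetical helper: print the client command instead of running it.
dlv_connect_cmd() {
    echo "dlv connect ${DLV_HOST}:${DLV_PORT}"
}

dlv_connect_cmd
```

Since the server was started with `--accept-multiclient`, it should keep running when a client disconnects, so the same replay can be reattached later.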
- The CPU architectures of the recording and replay machines must be similar.
- If you are using vSphere virtualization, you must enable performance counters. In the guest configuration, navigate to vCPU -> Performance Counters and check "Enable virtualized CPU performance counters".
- rr saves the execution of the installer. All data structures, including authentication and API objects, are available to view. The transported trace should be encrypted.
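One possible way to encrypt the archive before moving it is symmetric encryption with GnuPG, sketched below. This assumes `gpg` is installed on both machines. The function names are hypothetical, and gpg prompts for the passphrase interactively.

```shell
# Hypothetical wrappers around gpg; assumes GnuPG is installed.

# encrypt_trace IN OUT
# Passphrase-protect the tarball with AES256.
encrypt_trace() {
    gpg --symmetric --cipher-algo AES256 --output "$2" "$1"
}

# decrypt_trace IN OUT
# Recover the tarball on the replay machine.
decrypt_trace() {
    gpg --decrypt --output "$2" "$1"
}

# Example:
#   encrypt_trace trace.tar.gz trace.tar.gz.gpg
#   decrypt_trace trace.tar.gz.gpg trace.tar.gz
```

Remember to remove the unencrypted `trace.tar.gz` after encrypting, and share the passphrase over a separate channel from the archive.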
- https://rr-project.org/
- https://github.com/rr-debugger/rr
- https://www.youtube.com/watch?v=sMnw28M-fMg
- https://devconfcz2020a.sched.com/event/YOsC/deterministic-debugging-with-delve
Old How-To
The machines that will run and/or debug the installer must have rr installed.
If you are running Fedora, that can be done as easily as:
sudo dnf install golang rr -y
go install github.com/go-delve/delve/cmd/dlv@latest
Otherwise:
wget https://github.com/rr-debugger/rr/releases/download/5.6.0/rr-5.6.0-Linux-$(uname -m).rpm
sudo dnf install rr-5.6.0-Linux-$(uname -m).rpm
sudo sysctl kernel.perf_event_paranoid=-1
sudo sysctl kernel.kptr_restrict=0
If you are using vSphere virtualization, you must enable performance counters.
In the guest configuration, navigate to vCPU -> Performance Counters and check "Enable virtualized CPU performance counters".
git clone https://github.com/openshift/installer
Open ./hack/build.sh.
Comment out:
LDFLAGS="${LDFLAGS} -s -w"
Add -gcflags "all=-N -l" to go build:
go build ${GOFLAGS} -gcflags "all=-N -l" -ldflags "${LDFLAGS}" -tags "${TAGS}" -o "${OUTPUT}" ./cmd/openshift-install
Rebuild the installer
./hack/build.sh
This doesn't change the terraform provider builds; that would need to be investigated.
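A quick sanity check after rebuilding is to confirm the binary kept its symbols, since `-s -w` strips them. The sketch below classifies the text printed by the `file` utility. The helper name is made up, the binary path in the usage comment is an assumption, and "not stripped" only shows that the symbol table survived, not that the build is fully unoptimized.

```shell
# Hypothetical helper: succeeds if `file` output describes an unstripped binary.
is_unstripped() {
    case "$1" in
        *"not stripped"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage against the rebuilt installer (path assumed to be bin/openshift-install):
#   is_unstripped "$(file bin/openshift-install)" && echo "symbols present"
```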
rr record ../openshift-install create cluster --log-level debug
If the trace is to be used on an alternate machine:
rr pack
tar -cvzf trace.tar.gz ${HOME}/.local/share/rr/
The ./rr directory will take a significant amount of disk space if rr pack is not run.
In my limited experience, it seems CPU models matter, so most likely you will be executing delve remotely.
Disable the firewall (yeah I am lazy)
sudo systemctl stop firewalld
sudo systemctl mask firewalld
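A less drastic alternative to stopping firewalld is to open only the port delve listens on. The sketch prints the `firewall-cmd` invocation for a given port so it can be reviewed before it is run. The helper name is hypothetical, and the rule is runtime-only because `--permanent` is not passed.

```shell
# Hypothetical helper: print the firewalld command that opens one TCP port.
open_port_cmd() {
    echo "sudo firewall-cmd --add-port=${1}/tcp"
}

# 2345 matches the --listen=:2345 flag passed to dlv replay.
open_port_cmd 2345
```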
Start delve in replay mode. Change the directory if the trace is not in the default folder.
dlv replay --backend=rr --headless --listen=:2345 --api-version=2 --accept-multiclient ${HOME}/.local/share/rr/openshift-install-0
Keep this window nearby; the installer will still emit its typical output.
In GoLand, select Add Configuration -> Add New -> Go Remote. Change Host to where delve is running.
Debug as normal...