@korjaa
Last active July 22, 2022 19:28
Notes for Ubuntu 20.04 external GPU setup

grub bootloader changes

Two changes were required in the GRUB bootloader configuration.

Delay

Adding a 15-second boot menu delay to the GRUB configuration works around a Thunderbolt failure at boot.

The delay is set in /etc/default/grub:

$ cat /etc/default/grub
...
GRUB_TIMEOUT_STYLE=menu
GRUB_TIMEOUT=15
...
$ sudo update-grub

Not an ideal solution, but it seems to work around some underlying issue.

The failure looks something like this (transcribed by hand from a screenshot):
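The same edit can be scripted instead of done by hand. The following is a minimal sketch, assuming GNU sed (as shipped with Ubuntu) and simple values without `|` or `&` characters; `set_grub_option` is a hypothetical helper name, not something GRUB provides.

```shell
#!/bin/sh
# set_grub_option FILE KEY VALUE
# Hypothetical helper: sets KEY=VALUE in a GRUB defaults file.
# An existing uncommented assignment is rewritten in place,
# otherwise a new line is appended to the end of the file.
set_grub_option() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        # Replace the active assignment (GNU sed in-place edit).
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        # No active assignment yet, so add one.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (run as root, after backing up /etc/default/grub):
#   set_grub_option /etc/default/grub GRUB_TIMEOUT_STYLE menu
#   set_grub_option /etc/default/grub GRUB_TIMEOUT 15
#   update-grub
```

Commented-out lines such as `#GRUB_TIMEOUT=...` are left alone on purpose, so earlier experiments stay visible in the file.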

Workqueue: thunderbolt0 icm_handle_notification [thunderbolt]
RIP: 0010: switch_find_xdomain+0x11/0x160 [thunderbolt]
Code: aa bb cc dd ee
RSP: ... EFLAGS...
RAX ... RBX ... RCX ...
RDX ... RSI ... RDI ...
RBP ... R08 ... R09 ...
...
Call Trace:
 <TASK>
 tb_xdomain_find_by_link_depth
 icm_fr_device_connected
 icm_handle_notification
 process_one_work
 worker_thread
 ? process_one_work
 kthread
 ? set_kthread_struct
 ret_from_fork
 </TASK>

Kernel boot argument

At some point I added pci=realloc to the kernel command line configuration:

$ cat /etc/default/grub
...
#GRUB_CMDLINE_LINUX_DEFAULT="quiet splash acpi=off"
GRUB_CMDLINE_LINUX_DEFAULT="pci=realloc"
...
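A change to GRUB_CMDLINE_LINUX_DEFAULT only takes effect after `sudo update-grub` and a reboot, so it is worth confirming that the running kernel actually received the argument. A minimal sketch follows; `has_kernel_param` is a hypothetical helper name, and the optional second argument exists only so the check can be pointed at a file other than /proc/cmdline.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE_FILE]
# Hypothetical helper: succeeds if PARAM appears as a whole token
# on the kernel command line (default source: /proc/cmdline).
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each space-separated token on its own line, then require
    # an exact, fixed-string match of the whole token.
    tr -s ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

# Example:
#   if has_kernel_param pci=realloc; then
#       echo "pci=realloc is active"
#   else
#       echo "pci=realloc is missing; was update-grub run?"
#   fi
```

Matching whole tokens avoids false positives, for example `pci` alone does not match `pci=realloc`.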

egpu-switcher runs as a systemd service that checks whether the eGPU is present before Xorg starts. Depending on the result, /etc/X11/xorg.conf is symlinked to the appropriate device configuration, so that Xorg only ever has one display device defined. This improves performance: otherwise Xorg seems to use 10-50% CPU while idling with just Firefox open.

egpu-switcher didn't work out of the box on my spectre machine. On inspection, /etc/X11/xorg.conf.internal turned out to be empty. Changing the file to the following contents helped it boot:
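The detect-then-symlink idea can be sketched in a few lines of shell. This is a simplified illustration, not how egpu-switcher is actually implemented: it assumes the eGPU is the only NVIDIA device in the machine and looks for it in `lspci` output. `pick_xorg_conf` and `link_xorg_conf` are hypothetical helper names, and `EGPU_CONF` is a placeholder for the eGPU configuration file, whose path these notes do not show.

```shell
#!/bin/sh
# pick_xorg_conf PCI_LIST EGPU_CONF INTERNAL_CONF
# Hypothetical helper: prints the config file that should be used.
# Assumption: any NVIDIA entry in the PCI listing is the eGPU.
pick_xorg_conf() {
    pci_list=$1
    egpu_conf=$2
    internal_conf=$3
    if printf '%s\n' "$pci_list" | grep -qi 'nvidia'; then
        printf '%s\n' "$egpu_conf"
    else
        printf '%s\n' "$internal_conf"
    fi
}

# link_xorg_conf TARGET LINK
# Hypothetical helper: replaces LINK with a symlink to TARGET.
link_xorg_conf() {
    ln -sfn "$1" "$2"
}

# Example (run as root before the display manager starts;
# EGPU_CONF must be set to the eGPU config path):
#   conf=$(pick_xorg_conf "$(lspci)" "$EGPU_CONF" /etc/X11/xorg.conf.internal)
#   link_xorg_conf "$conf" /etc/X11/xorg.conf
```

Keeping the decision separate from the symlink step makes the selection logic easy to check without touching /etc/X11.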

$ cat /etc/X11/xorg.conf.internal 
Section "Device"
  Identifier "Intel Graphics"
  Driver "intel"
  Option "TearFree" "true"
EndSection

The content above was originally in /etc/X11/xorg.conf.d/20-intel.conf. Not sure how it got there. That file is now deleted and the configuration only remains in /etc/X11/xorg.conf.internal.

Check the egpu-switcher logs with:

$ journalctl -u egpu.service
-- Reboot --
Jul 22 20:35:32 spectre systemd[1]: Starting EGPU Service...
Jul 22 20:35:32 spectre egpu-switcher[937]: [info] Automatically detecting if egpu is con>
Jul 22 20:35:34 spectre egpu-switcher[937]: [info] EGPU is disconnected.
Jul 22 20:35:34 spectre egpu-switcher[937]: [info] Create symlink /etc/X11/xorg.conf -> />
Jul 22 20:35:34 spectre systemd[1]: egpu.service: Succeeded.
Jul 22 20:35:34 spectre systemd[1]: Finished EGPU Service.
-- Reboot --
Jul 22 20:41:12 spectre systemd[1]: Starting EGPU Service...
Jul 22 20:41:12 spectre egpu-switcher[1100]: [info] Automatically detecting if egpu is co>
Jul 22 20:41:13 spectre egpu-switcher[1100]: [info] EGPU is connected.
Jul 22 20:41:13 spectre egpu-switcher[1100]: [info] Create symlink /etc/X11/xorg.conf -> >
Jul 22 20:41:13 spectre systemd[1]: egpu.service: Succeeded.
Jul 22 20:41:13 spectre systemd[1]: Finished EGPU Service.

nvidia module is blacklisted

If Thunderbolt and PCI devices enumerate correctly but nvidia-smi still fails, the nvidia module may have been blacklisted by prime-select intel or prime-select on-demand. The blacklist configuration lives in the automatically generated file /lib/modprobe.d/blacklist-nvidia.conf.

The blacklisting shows up in the gpu-manager log:

$ cat /var/log/gpu-manager.log
...
Is nvidia blacklisted? yes
...

The blacklisting can be removed with

sudo prime-select nvidia
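The blacklist file can also be checked directly rather than through the gpu-manager log. A minimal sketch, assuming the generated file contains ordinary modprobe `blacklist nvidia...` lines; `nvidia_is_blacklisted` is a hypothetical helper name.

```shell
#!/bin/sh
# nvidia_is_blacklisted [CONF_FILE]
# Hypothetical helper: succeeds if CONF_FILE exists and contains an
# active (uncommented) "blacklist nvidia..." line.
nvidia_is_blacklisted() {
    conf=${1:-/lib/modprobe.d/blacklist-nvidia.conf}
    [ -f "$conf" ] &&
        grep -Eq '^[[:space:]]*blacklist[[:space:]]+nvidia' "$conf"
}

# Example:
#   if nvidia_is_blacklisted; then
#       echo "nvidia is blacklisted; try: sudo prime-select nvidia"
#   fi
```

A missing file counts as "not blacklisted", which should match the state after switching to the nvidia profile.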

DKMS module rebuild

Not 100% sure, but it seems the DKMS modules can get deleted and not recompiled after repeated runs of sudo apt reinstall nvidia-driver-515 and sudo apt purge nvidia-*.

The NVIDIA DKMS modules can be rebuilt with dpkg-reconfigure:

sudo dpkg-reconfigure nvidia-dkms-515
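Whether a rebuild is needed can be judged from `dkms status`. The sketch below is an assumption-laden check, not an official DKMS interface: it assumes each status line starts with the module name and mentions both the kernel release and the word "installed". `nvidia_dkms_installed` is a hypothetical helper name.

```shell
#!/bin/sh
# nvidia_dkms_installed [KERNEL_RELEASE]
# Hypothetical helper: reads `dkms status` output on stdin and succeeds
# if an nvidia module is reported as installed for KERNEL_RELEASE
# (default: the running kernel, from `uname -r`).
nvidia_dkms_installed() {
    kernel=${1:-$(uname -r)}
    # Keep nvidia lines, then lines for this kernel, then test the state.
    grep -i '^nvidia' | grep -F -- "$kernel" | grep -q 'installed'
}

# Example:
#   if ! dkms status | nvidia_dkms_installed; then
#       echo "nvidia DKMS module missing for $(uname -r)"
#       echo "try: sudo dpkg-reconfigure nvidia-dkms-515"
#   fi
```

Checking against the running kernel matters because a module built for an older kernel does not help after a kernel upgrade.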

Kernel 5.18 & "ibt=off"

Possible issue on future kernels (5.18 and later), where booting with ibt=off may be needed. See NVIDIA/open-gpu-kernel-modules#256 (comment).
