@scyto
Last active April 21, 2025 21:39
Thunderbolt Networking Setup

Thunderbolt Networking

This gist is part of this series.

You will need Proxmox kernel 6.2.16-14-pve or higher.

Load Kernel Modules

  • add the thunderbolt and thunderbolt-net kernel modules (this must be done on all nodes - yes, I know it can sometimes work without them, but the thunderbolt-net one has interesting behaviour, so do as I say and add both ;-)
    1. nano /etc/modules and add the modules at the bottom of the file, one on each line (see the example below)
    2. save using ctrl-x, then y, then enter
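For reference, after the edit the bottom of /etc/modules should contain just the two module names:

# /etc/modules: kernel modules to load at boot time, one per line
thunderbolt
thunderbolt-net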

Prepare /etc/network/interfaces

Doing this means we don't have to give each thunderbolt interface a manual IPv6 address and that these addresses stay constant no matter what. Add the following to each node using nano /etc/network/interfaces

If you see any sections called thunderbolt0 or thunderbolt1, delete them at this point.

Create entries to prepopulate the GUI with a reminder

Doing this means we don't have to give each thunderbolt interface a manual IPv6 or IPv4 address and that these addresses stay constant no matter what.

Add the following to each node using nano /etc/network/interfaces; this reminds you not to edit en05 and en06 in the GUI.

This fragment should go between the existing auto lo section and the adapter sections.

iface en05 inet manual
#do not edit in GUI

iface en06 inet manual
#do not edit in GUI

If you see any thunderbolt sections, delete them from the file before you save it.

DO NOT DELETE the source /etc/network/interfaces.d/* line - it will always exist on the latest versions and should be the last or next-to-last line in the /etc/network/interfaces file.
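For orientation, a node's /etc/network/interfaces might end up looking roughly like this (an illustrative sketch only - your physical NIC and bridge stanzas will differ and should be left alone):

auto lo
iface lo inet loopback

iface en05 inet manual
#do not edit in GUI

iface en06 inet manual
#do not edit in GUI

# ... your existing adapter and vmbr bridge sections stay here unchanged ...

source /etc/network/interfaces.d/*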

Rename Thunderbolt Connections

This is needed because Proxmox doesn't recognize the thunderbolt interface names. There are various methods to do this. This method was selected after trial and error because:

  • the thunderboltX naming is not fixed to a port (it seems to be based on the sequence in which you plug the cables in)
  • the MAC address of the interfaces changes with most cable insertion and removal events
  1. use the udevadm monitor command to find your device IDs when you insert and remove each TB4 cable. Yes, you can use other ways to do this; I recommend this one as it is a great way to understand what udev does - the command proved more useful to me than syslog or lspci for troubleshooting thunderbolt issues and behaviours. In my case my two PCI paths are 0000:00:0d.2 and 0000:00:0d.3; if you bought the same hardware this will be the same on all 3 units. Don't assume your PCI device paths will be the same as mine (there is also a quick cross-check after the link files below).

  2. create a link file using nano /etc/systemd/network/00-thunderbolt0.link and enter the following content:

[Match]
Path=pci-0000:00:0d.2
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en05
  3. create a second link file using nano /etc/systemd/network/00-thunderbolt1.link and enter the following content:
[Match]
Path=pci-0000:00:0d.3
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en06
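If you want to double-check the PCI path you found with udevadm monitor, you can also query an existing thunderbolt interface directly (a hedged example - the interface may still be named thunderbolt0/thunderbolt1 until the rename takes effect):

# prints the sysfs device path, which contains the pci 0000:00:0d.x segment
udevadm info -q path /sys/class/net/thunderbolt0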

Set Interfaces to UP on reboots and cable insertions

This section ensures that the interfaces will be brought up at boot or on cable insertion with whatever settings are in /etc/network/interfaces - this shouldn't need to be done; it seems like a bug in the way thunderbolt networking is handled (I assume this is Debian-wide but haven't checked).

Huge thanks to @corvy for figuring out a script that should make this much, much more reliable for most.

  1. create a udev rule to detect for cable insertion using nano /etc/udev/rules.d/10-tb-en.rules with the following content:
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en05", RUN+="/usr/local/bin/pve-en05.sh"
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en06", RUN+="/usr/local/bin/pve-en06.sh"
  2. save the file

  3. create the first script referenced above using nano /usr/local/bin/pve-en05.sh with the following content:

#!/bin/bash

LOGFILE="/tmp/udev-debug.log"
VERBOSE="" # Set this to "-v" for verbose logging
IF="en05"

echo "$(date): pve-$IF.sh triggered by udev" >> "$LOGFILE"

# If multiple interfaces go up at the same time, 
# retry 10 times and break the retry when successful
for i in {1..10}; do
    echo "$(date): Attempt $i to bring up $IF" >> "$LOGFILE"
    /usr/sbin/ifup $VERBOSE $IF >> "$LOGFILE" 2>&1 && {
        echo "$(date): Successfully brought up $IF on attempt $i" >> "$LOGFILE"
        break
    }
  
    echo "$(date): Attempt $i failed, retrying in 3 seconds..." >> "$LOGFILE"
    sleep 3
done

save the file and then

  4. create the second script referenced above using nano /usr/local/bin/pve-en06.sh with the following content:
#!/bin/bash

LOGFILE="/tmp/udev-debug.log"
VERBOSE="" # Set this to "-v" for verbose logging
IF="en06"

echo "$(date): pve-$IF.sh triggered by udev" >> "$LOGFILE"

# If multiple interfaces go up at the same time, 
# retry 10 times and break the retry when successful
for i in {1..10}; do
    echo "$(date): Attempt $i to bring up $IF" >> "$LOGFILE"
    /usr/sbin/ifup $VERBOSE $IF >> "$LOGFILE" 2>&1 && {
        echo "$(date): Successfully brought up $IF on attempt $i" >> "$LOGFILE"
        break
    }
  
    echo "$(date): Attempt $i failed, retrying in 3 seconds..." >> "$LOGFILE"
    sleep 3
done

and save the file

  5. make both scripts executable with chmod +x /usr/local/bin/*.sh
  6. run update-initramfs -u -k all to propagate the new link files into the initramfs
  7. Reboot (restarting networking, init 1 and init 3 are not good enough, so reboot) - see the sanity check below
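After the reboot, a quick sanity check (the log path matches the scripts above):

# both renamed interfaces should be present and UP
ip -br link | grep -E 'en0[56]'
# the udev-triggered scripts log their attempts here
cat /tmp/udev-debug.log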

Enabling IP Connectivity

Proceed to the next gist.

Slow Thunderbolt Performance? Too Many Retries? No traffic? Try this!

verify neighbors can see each other (connectivity troubleshooting)

Install LLDP - this is great to see which nodes can see each other.

  • install lldpctl with apt install lldpd on all 3 nodes
  • execute lldpctl and you should see neighbor info (see the example below)
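For example (lldpctl and lldpcli both come with the lldpd package):

# run on each node once lldpd is installed everywhere
lldpcli show neighbors
# each thunderbolt link (en05/en06) should list the node at the other end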

make sure iommu is enabled (speed troubleshooting)

If you are having speed issues, make sure the following is set on the kernel command line in the /etc/default/grub file: intel_iommu=on iommu=pt. Once set, be sure to run update-grub and reboot.

Everyone's grub command line is different; this is mine. I also have i915 virtualization - if you get this wrong you can break your machine, and if you are not doing that you don't need any i915 entries.

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt" (note: if you have more things on your cmd line DO NOT REMOVE them, just add the two intel ones; it doesn't matter where)
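After editing /etc/default/grub, apply and verify roughly like this (the dmesg check is a common sanity test; exact wording varies by kernel):

update-grub
reboot
# after the reboot, confirm the IOMMU came up
dmesg | grep -i -e DMAR -e IOMMU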

Pinning the Thunderbolt Driver (speed and retries troubleshooting)

identify your P and E cores by running the following

cat /sys/devices/cpu_core/cpus && cat /sys/devices/cpu_atom/cpus

you should get two lines on an Intel system with P and E cores; the first line should be your P cores and the second line your E cores

for example on mine:

root@pve1:/etc/pve# cat /sys/devices/cpu_core/cpus && cat /sys/devices/cpu_atom/cpus
0-7
8-15

create a script to apply the affinity settings every time a thunderbolt interface comes up

  1. make a file at /etc/network/if-up.d/thunderbolt-affinity
  2. add the following to it - make sure to replace echo X-Y with whatever the output above told you your performance cores are - e.g. echo 0-7
#!/bin/bash

# Check if the interface is either en05 or en06
if [ "$IFACE" = "en05" ] || [ "$IFACE" = "en06" ]; then
# Set Thunderbolt affinity to P-cores
    grep thunderbolt /proc/interrupts | cut -d ":" -f1 | xargs -I {} sh -c 'echo X-Y | tee "/proc/irq/{}/smp_affinity_list"'
fi
  3. save the file - done (you can verify the pinning with the snippet below)
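To confirm the pinning took effect after an ifup, you can dump the current affinity of the thunderbolt IRQs (a small check using the same /proc paths as the script):

grep thunderbolt /proc/interrupts | cut -d ":" -f1 | while read irq; do
    echo "IRQ $irq -> $(cat /proc/irq/$irq/smp_affinity_list)"
done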

Extra Debugging for Thunderbolt

dynamic kernel tracing - adds more info to dmesg, doesn't overwhelm dmesg

I have only tried this on 6.8 kernels, so YMMV. If you want more TB messages in dmesg to see why a connection might be failing, here is how to turn on dynamic tracing.

For boot time you will need to add it to the kernel command line by adding thunderbolt.dyndbg=+p to your /etc/default/grub file, running update-grub and rebooting.

To expand the example above:

`GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt thunderbolt.dyndbg=+p"`  

Don't forget to run update-grub after saving the change to the grub file.

For runtime debug you can run the following command (it will revert on next boot), so this can't be used to capture what happens at boot time.

`echo -n 'module thunderbolt =p' > /sys/kernel/debug/dynamic_debug/control`
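Either way, the extra thunderbolt messages land in dmesg; an easy way to watch them live:

dmesg -w | grep -i thunderbolt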

install tbtools

These tools can be used to inspect your thunderbolt system. Note they rely on Rust being installed; you must use the rustup script below and not install Rust via the package manager at this time (9/15/24).

apt install pkg-config libudev-dev git curl
curl https://sh.rustup.rs -sSf | sh
git clone https://github.com/intel/tbtools
restart your ssh session
cd tbtools
cargo install --path .
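The ssh restart is just to pick up ~/.cargo/bin on your PATH (rustup adds it via your shell profile); sourcing the env file works too. Once built, the binaries live in ~/.cargo/bin, for example tblist to enumerate devices (tool name per the tbtools project; check its README):

# alternative to restarting the session
. "$HOME/.cargo/env"
# list Thunderbolt/USB4 devices (assumes ~/.cargo/bin is on PATH)
tblist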
@corvy

corvy commented Apr 3, 2025

Same experience here. Updated this morning, N to overwrite and all works swell.

@scyto
Author

scyto commented Apr 14, 2025

Same experience here. Updated this morning, N to overwrite and all works swell.

same here so long as N or D is selected there should be no issue
next up is moving to a later kernel, i know some have had issues with that.... if those are still happening I am keen to get them logged with the proxmox folks as regressions

@JahMark420

JahMark420 commented Apr 19, 2025

Seems I have this working now, but running into a strange issue - I have 2 MS-01, one is a 13900H and the other is a 12600H
When I have 13900H (10.0.0.81) set up as the server running iperf3, I get lower speeds and a lot of retries:

Connecting to host 10.0.0.81, port 5201
[ 5] local 10.0.0.82 port 34714 connected to 10.0.0.81 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 2.06 GBytes 17.7 Gbits/sec 595 1.75 MBytes
[ 5] 1.00-2.00 sec 1.11 GBytes 9.56 Gbits/sec 324 1.69 MBytes
[ 5] 2.00-3.00 sec 1.92 GBytes 16.5 Gbits/sec 692 1.56 MBytes
[ 5] 3.00-4.00 sec 1.50 GBytes 12.9 Gbits/sec 499 1.50 MBytes
[ 5] 4.00-5.00 sec 1.54 GBytes 13.3 Gbits/sec 475 1.50 MBytes
[ 5] 5.00-6.00 sec 1001 MBytes 8.40 Gbits/sec 299 1.44 MBytes
[ 5] 6.00-7.00 sec 2.04 GBytes 17.5 Gbits/sec 679 1.62 MBytes
[ 5] 7.00-8.00 sec 2.18 GBytes 18.7 Gbits/sec 781 1.81 MBytes
[ 5] 8.00-9.00 sec 1.87 GBytes 16.1 Gbits/sec 604 1.37 MBytes
[ 5] 9.00-10.00 sec 2.05 GBytes 17.6 Gbits/sec 678 2.31 MBytes


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 17.2 GBytes 14.8 Gbits/sec 5626 sender
[ 5] 0.00-10.00 sec 17.2 GBytes 14.8 Gbits/sec receiver

When I have the 12600H (10.0.0.82) set up as the server, I get the correct speeds:

Connecting to host 10.0.0.82, port 5201
[ 5] local 10.0.0.81 port 49150 connected to 10.0.0.82 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 3.07 GBytes 26.4 Gbits/sec 35 3.43 MBytes
[ 5] 1.00-2.00 sec 3.07 GBytes 26.4 Gbits/sec 1 3.43 MBytes
[ 5] 2.00-3.00 sec 2.99 GBytes 25.7 Gbits/sec 5 3.50 MBytes
[ 5] 3.00-4.00 sec 3.05 GBytes 26.2 Gbits/sec 5 3.50 MBytes
[ 5] 4.00-5.00 sec 3.06 GBytes 26.3 Gbits/sec 0 3.50 MBytes
[ 5] 5.00-6.00 sec 3.08 GBytes 26.4 Gbits/sec 2 3.50 MBytes
[ 5] 6.00-7.00 sec 3.07 GBytes 26.3 Gbits/sec 0 3.50 MBytes
[ 5] 7.00-8.00 sec 3.06 GBytes 26.3 Gbits/sec 2 3.50 MBytes
[ 5] 8.00-9.00 sec 3.08 GBytes 26.5 Gbits/sec 2 3.50 MBytes
[ 5] 9.00-10.00 sec 3.05 GBytes 26.2 Gbits/sec 0 3.50 MBytes


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 30.6 GBytes 26.3 Gbits/sec 52 sender
[ 5] 0.00-10.00 sec 30.6 GBytes 26.3 Gbits/sec receiver

I also get similar results with ipv6.

I've tried updating the grub file,
ran update-initramfs -u -k all and did a reboot
Other than the CPU, both systems are identical, I believe they both have the same USB4 as well.

Both are running the same kernel
Kernel Version Linux 6.8.12-9-pve (2025-03-16T19:18Z)

I thought it was a kernel issue, but once I ran the update-initramfs -u -k all on them, speeds from one side worked as expected.

Has anyone run into this issue? Or am I in over my head and limited to what the 12600H can handle? I suspect not, since they have the same USB4

I'm very much a noob when it comes to Linux but have been finding my way around
Any help is appreciated :)

@nickglott

nickglott commented Apr 19, 2025

@JahMark420 Make sure you have IOMMU on each node, that tends to be the biggest issue with speed. The other is pinning TB to only P-cores.

make sure the following is set on the kernel command line in the /etc/default/grub file: intel_iommu=on iommu=pt. Once set, be sure to run update-grub and reboot

everyone's grub command line is different; this is mine because I also have i915 virtualization. If you get this wrong you can break your machine; if you are not doing that you don't need the i915 entries

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt" (note: if you have more things on your cmd line DO NOT REMOVE them, just add the two intel ones; it doesn't matter where)

I am using this below, it is activated on every ifup: /etc/network/if-up.d/thunderbolt-affinity (changing 0-7 to whatever your P-cores are). There is also another method listed somewhere in the comments in one of these gists; @Allistah is the one that posted the other method, which I don't think needs you to define the P-cores

#!/bin/bash

# Check if the interface is either en05 or en06
if [ "$IFACE" = "en05" ] || [ "$IFACE" = "en06" ]; then
# Set Thunderbolt affinity to P-cores
    grep thunderbolt /proc/interrupts | cut -d ":" -f1 | xargs -I {} sh -c 'echo 0-7 | tee "/proc/irq/{}/smp_affinity_list"'
fi

@JahMark420

@JahMark420 Make sure you have IOMMU on each node, that tends to be the biggest issue with speed. The other is pinning TB to only P-cores.

make sure the following is set on the kernel command line in the /etc/default/grub file: intel_iommu=on iommu=pt. Once set, be sure to run update-grub and reboot

everyone's grub command line is different; this is mine because I also have i915 virtualization. If you get this wrong you can break your machine; if you are not doing that you don't need the i915 entries

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt" (note: if you have more things on your cmd line DO NOT REMOVE them, just add the two intel ones; it doesn't matter where)

I am using this below, it is activated on every ifup: /etc/network/if-up.d/thunderbolt-affinity (changing 0-7 to whatever your P-cores are). There is also another method listed somewhere in the comments in one of these gists; @Allistah is the one that posted the other method, which I don't think needs you to define the P-cores

#!/bin/bash

# Check if the interface is either en05 or en06
if [ "$IFACE" = "en05" ] || [ "$IFACE" = "en06" ]; then
# Set Thunderbolt affinity to P-cores
    grep thunderbolt /proc/interrupts | cut -d ":" -f1 | xargs -I {} sh -c 'echo 0-7 | tee "/proc/irq/{}/smp_affinity_list"'
fi

Thanks @nickglott - Yes both nodes have the IOMMU - I followed that part and did the update-grub and rebooted

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
#GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX=""

I didn't think I'd need to define the P-cores since it was working one way, but you could be right.

The grub looks correct or am I missing something?

Thanks for the swift reply :)

@nickglott

@JahMark420 The other method by @Allistah or @contributorr mentioned here.... it has been a while and I can't remember, sorry.

https://gist.github.com/scyto/4c664734535da122f4ab2951b22b2085?permalink_comment_id=5248976#gistcomment-5248976

add this to /etc/rc.local

#!/bin/bash
for id in $(grep 'thunderbolt' /proc/interrupts | awk '{print $1}' | cut -d ':' -f1); do
    echo 0f > /proc/irq/$id/smp_affinity
done

@nickglott

For some reason some people need it and others don't. I know most of us with MS-01's need it; Scyto's NUCs don't - probably something to do with the BIOS or firmware or the way TB is implemented.

I didn't think I'd need to define the P-cores since it was working one way, but you could be right.

The grub looks correct or am I missing something?

Thanks for the swift reply :)

@JahMark420

For some reason some people need it and others don't. I know most of us with MS-01's need it; Scyto's NUCs don't - probably something to do with the BIOS or firmware or the way TB is implemented.

I didn't think I'd need to define the P-cores since it was working one way, but you could be right.
The grub looks correct or am I missing something?
Thanks for the swift reply :)

Thanks @nickglott, seems specifying the P-cores did the trick. Now seeing about 26Gbps across both nodes.

Appreciate the guidance

@scyto
Author

scyto commented Apr 21, 2025

@nickglott @JahMark420 does the affinity script need running every time the interfaces come up or is running in /etc/rc.local good enough?

(i don't have this issue so can't verify)

I hope the latter; I just added it to the main gist above along with other changes in both this and the openfabric IP side of things too

@corvy

corvy commented Apr 21, 2025

I do not run this in rc.local. I have it like this:

root@px0# cat /etc/network/if-up.d/thunderbolt-affinity 
#!/bin/bash

# Check if the interface is either en05 or en06
if [ "$IFACE" = "en05" ] || [ "$IFACE" = "en06" ]; then
# Set Thunderbolt affinity to P-cores
    grep thunderbolt /proc/interrupts | cut -d ":" -f1 | xargs -I {} sh -c 'echo 0-11 | tee "/proc/irq/{}/smp_affinity_list"'
fi

This ensures it gets set every time the if en05 or en06 goes up or down, including cable connect / disconnect. I prefer this over rc.local. Should the device change IRQ then the rc.local approach will fail. Not sure who suggested this approach, maybe it was @nickglott but I cannot remember. At least doing it this way is very robust and would be my suggestion.

@scyto
Author

scyto commented Apr 21, 2025

This ensures it gets set every time the if en05 or en06 goes up or down, including cable connect / disconnect. I prefer this over rc.local. Should the device change IRQ then the rc.local approach will fail. Not sure who suggested this approach, maybe it was @nickglott but I cannot remember. At least doing it this way is very robust and would be my suggestion.

Thanks, I was also contemplating telling folks to add it to the user crontab using the crontab -e command with @daily, but if this needs to be done each time the driver is loaded that's a bust too. I agree your way looks robust - which I think is key.

It's also wild to me that I just don't get the issue... this is between two of my nodes, I have never set affinity, and I would love to understand why the difference occurs....

Connecting to host fc00::81, port 5201
[  5] local fc00::82 port 38314 connected to fc00::81 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.05 GBytes  26.2 Gbits/sec   28   3.06 MBytes       
[  5]   1.00-2.00   sec  3.12 GBytes  26.8 Gbits/sec    3   2.81 MBytes       
[  5]   2.00-3.00   sec  3.09 GBytes  26.6 Gbits/sec   31   3.87 MBytes       
[  5]   3.00-4.00   sec  3.12 GBytes  26.8 Gbits/sec    0   3.87 MBytes       
[  5]   4.00-5.00   sec  3.12 GBytes  26.8 Gbits/sec    8   2.81 MBytes       
[  5]   5.00-6.00   sec  3.10 GBytes  26.7 Gbits/sec    1   3.81 MBytes       
[  5]   6.00-7.00   sec  3.11 GBytes  26.7 Gbits/sec    0   3.81 MBytes       
[  5]   7.00-8.00   sec  3.11 GBytes  26.7 Gbits/sec    0   3.81 MBytes       
[  5]   8.00-9.00   sec  3.09 GBytes  26.6 Gbits/sec    0   3.81 MBytes       
[  5]   9.00-10.00  sec  3.10 GBytes  26.6 Gbits/sec    1   3.81 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  31.0 GBytes  26.6 Gbits/sec   72             sender
[  5]   0.00-10.00  sec  31.0 GBytes  26.6 Gbits/sec                  receiver

out of interest, what is your smp_affinity setting, is it ffff?

root@pve2:~# cat /proc/irq/129/smp_affinity
ffff
root@pve2:~# cat /proc/irq/129/smp_affinity_list
0-15

@nickglott

@corvy @scyto It was brought up in the Proxmox forum so I don't want to take credit. I believe it was @Allistah that recommended it in rc.local so it wasn't run as much. I still have mine run in if-up as I wanted to make sure it applied after any change, like corvy stated. Mine has been rock solid for months.

@scyto
Author

scyto commented Apr 21, 2025

Interestingly (or maybe a false positive), changing my performance governor from powersave to performance reduced the number of retries....

root@pve2:~# iperf3 -c fc00::81
Connecting to host fc00::81, port 5201
[  5] local fc00::82 port 59904 connected to fc00::81 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.09 GBytes  26.5 Gbits/sec    1   3.00 MBytes       
[  5]   1.00-2.00   sec  3.10 GBytes  26.6 Gbits/sec    1   3.00 MBytes       
[  5]   2.00-3.00   sec  3.10 GBytes  26.6 Gbits/sec    1   3.00 MBytes       
[  5]   3.00-4.00   sec  3.10 GBytes  26.6 Gbits/sec    0   3.00 MBytes       
[  5]   4.00-5.00   sec  3.09 GBytes  26.6 Gbits/sec    0   3.00 MBytes       
[  5]   5.00-6.00   sec  3.08 GBytes  26.5 Gbits/sec    0   3.00 MBytes       
[  5]   6.00-7.00   sec  3.08 GBytes  26.5 Gbits/sec    0   3.00 MBytes       
[  5]   7.00-8.00   sec  3.09 GBytes  26.5 Gbits/sec    0   3.00 MBytes       
[  5]   8.00-9.00   sec  3.08 GBytes  26.5 Gbits/sec    0   3.00 MBytes       
[  5]   9.00-10.00  sec  3.07 GBytes  26.4 Gbits/sec    0   3.00 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec    3             sender
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec                  receiver
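For anyone wanting to try the same, the governor can be flipped at runtime (this resets on reboot unless made persistent):

echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor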

@pSyCr0

pSyCr0 commented Apr 21, 2025

Interestingly (or maybe a false positive), changing my performance governor from powersave to performance reduced the number of retries....

root@pve2:~# iperf3 -c fc00::81
Connecting to host fc00::81, port 5201
[  5] local fc00::82 port 59904 connected to fc00::81 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.09 GBytes  26.5 Gbits/sec    1   3.00 MBytes       
[  5]   1.00-2.00   sec  3.10 GBytes  26.6 Gbits/sec    1   3.00 MBytes       
[  5]   2.00-3.00   sec  3.10 GBytes  26.6 Gbits/sec    1   3.00 MBytes       
[  5]   3.00-4.00   sec  3.10 GBytes  26.6 Gbits/sec    0   3.00 MBytes       
[  5]   4.00-5.00   sec  3.09 GBytes  26.6 Gbits/sec    0   3.00 MBytes       
[  5]   5.00-6.00   sec  3.08 GBytes  26.5 Gbits/sec    0   3.00 MBytes       
[  5]   6.00-7.00   sec  3.08 GBytes  26.5 Gbits/sec    0   3.00 MBytes       
[  5]   7.00-8.00   sec  3.09 GBytes  26.5 Gbits/sec    0   3.00 MBytes       
[  5]   8.00-9.00   sec  3.08 GBytes  26.5 Gbits/sec    0   3.00 MBytes       
[  5]   9.00-10.00  sec  3.07 GBytes  26.4 Gbits/sec    0   3.00 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec    3             sender
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec                  receiver

I have tried this on my 3 node NUC13 cluster and don't get these results with either performance or powersave. It always ends with 24-64 retries.

@scyto
Author

scyto commented Apr 21, 2025

What do you think the easiest way is to have people identify their cores?

ChatGPT helped me come up with this... I am sure it is over-engineered, lol.

root@pve1:/etc/cron.d# echo "CPU MAXMHZ TYPE"; lscpu -e=CPU,MAXMHZ | awk 'NR>1{mhz[$1]=$2; if($2>max) max=$2} END{for(c in mhz){type=(mhz[c] >= max - 100) ? "Performance" : "Efficient"; printf "%s %.3f %s\n", c, mhz[c], type}}' | sort -n

it produces:

CPU MAXMHZ TYPE
0 5000.000 Performance
1 5000.000 Performance
2 5000.000 Performance
3 5000.000 Performance
4 5000.000 Performance
5 5000.000 Performance
6 5000.000 Performance
7 5000.000 Performance
8 3700.000 Efficient
9 3700.000 Efficient
10 3700.000 Efficient
11 3700.000 Efficient
12 3700.000 Efficient
13 3700.000 Efficient
14 3700.000 Efficient
15 3700.000 Efficient

@scyto
Author

scyto commented Apr 21, 2025

I have tried this on my 3 node NUC13 cluster and don't get these results with either performance or powersave. It always ends with 24-64 retries.

The server is on 6.14.0-2-pve (the client is on 6.8.12-9-pve - I test opt-in kernels for a week or two before I roll them out across all nodes!)
iperf3 is iperf 3.12 (cJSON 1.7.15) on both nodes
I use very short OWC TB4 certified cables

beyond that it must come down to the H/W implementation - speed is controlled by the DMA controller on the device; I guess the TB firmware version might matter too....

@pSyCr0

pSyCr0 commented Apr 21, 2025

What do you think the easiest way is to have people identify their cores?

ChatGPT helped me come up with this... I am sure it is over-engineered, lol.

root@pve1:/etc/cron.d# echo "CPU MAXMHZ TYPE"; lscpu -e=CPU,MAXMHZ | awk 'NR>1{mhz[$1]=$2; if($2>max) max=$2} END{for(c in mhz){type=(mhz[c] >= max - 100) ? "Performance" : "Efficient"; printf "%s %.3f %s\n", c, mhz[c], type}}' | sort -n

it produces:

CPU MAXMHZ TYPE
0 5000.000 Performance
1 5000.000 Performance
2 5000.000 Performance
3 5000.000 Performance
4 5000.000 Performance
5 5000.000 Performance
6 5000.000 Performance
7 5000.000 Performance
8 3700.000 Efficient
9 3700.000 Efficient
10 3700.000 Efficient
11 3700.000 Efficient
12 3700.000 Efficient
13 3700.000 Efficient
14 3700.000 Efficient
15 3700.000 Efficient

My output is this (i5-1340P):

root@pve2:~# echo "CPU MAXMHZ TYPE"; lscpu -e=CPU,MAXMHZ | awk 'NR>1{mhz[$1]=$2; if($2>max) max=$2} END{for(c in mhz){type=(mhz[c] >= max - 100) ? "Performance" : "Efficient"; printf "%s %.3f %s\n", c, mhz[c], type}}' | sort -n
CPU MAXMHZ TYPE
0 4600.000 Performance
1 4600.000 Performance
2 4600.000 Performance
3 4600.000 Performance
4 4600.000 Performance
5 4600.000 Performance
6 4600.000 Performance
7 4600.000 Performance
8 3400.000 Efficient
9 3400.000 Efficient
10 3400.000 Efficient
11 3400.000 Efficient
12 3400.000 Efficient
13 3400.000 Efficient
14 3400.000 Efficient
15 3400.000 Efficient

@scyto
Author

scyto commented Apr 21, 2025

if someone doesn't give me a better way to do it, looks like this command will work.... thoughts?

@JahMark420

I do not run this in rc.local. I have it like this:

root@px0# cat /etc/network/if-up.d/thunderbolt-affinity 
#!/bin/bash

# Check if the interface is either en05 or en06
if [ "$IFACE" = "en05" ] || [ "$IFACE" = "en06" ]; then
# Set Thunderbolt affinity to P-cores
    grep thunderbolt /proc/interrupts | cut -d ":" -f1 | xargs -I {} sh -c 'echo 0-11 | tee "/proc/irq/{}/smp_affinity_list"'
fi

This ensures it gets set every time the if en05 or en06 goes up or down, including cable connect / disconnect. I prefer this over rc.local. Should the device change IRQ then the rc.local approach will fail. Not sure who suggested this approach, maybe it was @nickglott but I cannot remember. At least doing it this way is very robust and would be my suggestion.

@scyto - This is how I have it set as well - For some reason, it will not work if I use the /etc/rc.local route

@scyto
Author

scyto commented Apr 21, 2025

ok, after pinning the driver I can now get 2 out of 3 runs with 0 errors.... (the other run had 4 errors), though I note the absolute amount transferred is slightly lower

root@pve2:~# iperf3 -c fc00::81
Connecting to host fc00::81, port 5201
[  5] local fc00::82 port 37680 connected to fc00::81 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.07 GBytes  26.4 Gbits/sec    0   2.87 MBytes       
[  5]   1.00-2.00   sec  3.08 GBytes  26.4 Gbits/sec    0   2.87 MBytes       
[  5]   2.00-3.00   sec  3.07 GBytes  26.4 Gbits/sec    0   2.87 MBytes       
[  5]   3.00-4.00   sec  3.07 GBytes  26.4 Gbits/sec    0   2.87 MBytes       
[  5]   4.00-5.00   sec  3.06 GBytes  26.3 Gbits/sec    0   2.87 MBytes       
[  5]   5.00-6.00   sec  3.07 GBytes  26.4 Gbits/sec    0   2.87 MBytes       
[  5]   6.00-7.00   sec  3.06 GBytes  26.3 Gbits/sec    0   2.87 MBytes       
[  5]   7.00-8.00   sec  2.98 GBytes  25.6 Gbits/sec    0   2.87 MBytes       
[  5]   8.00-9.00   sec  3.07 GBytes  26.4 Gbits/sec    0   2.87 MBytes       
[  5]   9.00-10.00  sec  3.05 GBytes  26.2 Gbits/sec    0   2.87 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.6 GBytes  26.3 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  30.6 GBytes  26.3 Gbits/sec                  receiver

technically a little bit slower than when I had no affinity set......

@JahMark420


out of interest, what is your smp_affinity setting, is it ffff?

root@pve2:~# cat /proc/irq/129/smp_affinity
ffff
root@pve2:~# cat /proc/irq/129/smp_affinity_list
0-15

Got the same output except for my 1st box, which has the list set at 0-19 due to different cores on that one.

@scyto
Author

scyto commented Apr 21, 2025

Got the same output except for my 1st box, which has it at the list set at 0-19 due to different cores on that one.

thanks, so it's not whatever the ffff setting does then :-) this issue must be down to H/W differences and how the scheduler works, oh well

@JahMark420

Got the same output except for my 1st box, which has it at the list set at 0-19 due to different cores on that one.

thanks so it's not whatever the ffff setting does then :-) This issue must be down to H/W differences and how the scheduler works, oh well

Yeah, seems so - H/W differences and how the system handles P and E cores now

@nickglott

nickglott commented Apr 21, 2025

i5-12600H MS-01's, works for mine

CPU MAXMHZ TYPE
0 4500.000 Performance
1 4500.000 Performance
2 4500.000 Performance
3 4500.000 Performance
4 4500.000 Performance
5 4500.000 Performance
6 4500.000 Performance
7 4500.000 Performance
8 3300.000 Efficient
9 3300.000 Efficient
10 3300.000 Efficient
11 3300.000 Efficient
12 3300.000 Efficient
13 3300.000 Efficient
14 3300.000 Efficient
15 3300.000 Efficient

@JahMark420

on the MS-01 13900H, they show up as:

CPU MAXMHZ TYPE
0 5200.000 Efficient
1 5200.000 Efficient
2 5200.000 Efficient
3 5200.000 Efficient
4 5400.000 Performance
5 5400.000 Performance
6 5400.000 Performance
7 5400.000 Performance
8 5200.000 Efficient
9 5200.000 Efficient
10 5200.000 Efficient
11 5200.000 Efficient
12 4100.000 Efficient
13 4100.000 Efficient
14 4100.000 Efficient
15 4100.000 Efficient
16 4100.000 Efficient
17 4100.000 Efficient
18 4100.000 Efficient
19 4100.000 Efficient

@nickglott

nickglott commented Apr 21, 2025

@JahMark420 How about if you do this?
P-cores would be this
cat /sys/devices/cpu_core/cpus

E-cores would be this
cat /sys/devices/cpu_atom/cpus

@JahMark420

JahMark420 commented Apr 21, 2025

cat /sys/devices/cpu_atom/cpus

That worked nicely, thanks @nickglott

cat /sys/devices/cpu_core/cpus
0-11
cat /sys/devices/cpu_atom/cpus
12-19

@nickglott

cat /sys/devices/cpu_atom/cpus

That worked nicely Thanks @nickglott

cat /sys/devices/cpu_core/cpus
0-11
cat /sys/devices/cpu_atom/cpus
12-19

@scyto Maybe a better route to go?

@scyto
Author

scyto commented Apr 21, 2025

maybe, I see why the chatgpt command failed - it's using the difference between the MAXMHZ values to determine P vs E

to check: what CPU is it that has two different speeds for its performance cores?
is there any CPU type where the efficiency cores won't show as atom and are lumped into /cpu_core/ ?
