Bandwidth test with eGPU (Thunderbolt 3)
$ nvidia-smi
Thu Jun 8 23:49:42 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1070         Off| 00000000:01:00.0  On |                  N/A |
| N/A   71C    P5                14W / N/A|   1255MiB /  8192MiB |     12%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090         Off| 00000000:3D:00.0 Off |                  Off |
|  0%   41C    P8               27W / 450W|      1MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1280      G   /usr/bin/gnome-shell                        473MiB |
|    0   N/A  N/A      1794      G   /usr/bin/Xwayland                            11MiB |
|    0   N/A  N/A      2417      G   /usr/bin/telegram-desktop                     1MiB |
|    0   N/A  N/A      4380      G   /usr/lib/firefox/firefox                    478MiB |
|    0   N/A  N/A      9649      G   /usr/bin/nautilus                           209MiB |
|    0   N/A  N/A     17455      G   /usr/bin/gnome-control-center                71MiB |
+---------------------------------------------------------------------------------------+
$ sudo boltctl list
● Cooler Master Technology,Inc MasterCase EG200
├─ type: peripheral
├─ name: MasterCase EG200
├─ vendor: Cooler Master Technology,Inc
├─ uuid: 00eaf0b0-222e-8302-ffff-ffffffffffff
├─ generation: Thunderbolt 3
├─ status: authorized
│ ├─ domain: e4030000-0090-8518-a3dc-9b04a244b21e
│ ├─ rx speed: 40 Gb/s = 2 lanes * 20 Gb/s
│ ├─ tx speed: 40 Gb/s = 2 lanes * 20 Gb/s
│ └─ authflags: none
├─ authorized: REDACTED
├─ connected: REDACTED
└─ stored: REDACTED
  ├─ policy: auto
  └─ key: no
OS: Manjaro Linux x86_64
Host: P7xxTM
Kernel: 6.3.5-2-MANJARO
Shell: zsh 5.9
Resolution: 3840x2160
DE: GNOME 44.1
WM: Mutter
Terminal: gnome-terminal
CPU: Intel i5-8400 (6) @ 4.000GHz
GPU: NVIDIA GeForce GTX 1070 Mobile
GPU: NVIDIA GeForce RTX 4090
Memory: 15109MiB / 40057MiB
$ ./nvbandwidth
nvbandwidth Version: v0.2
Built from Git version: 42e94d2
NOTE: This tool reports current measured bandwidth on your system.
Additional system-specific tuning may be required to achieve maximal peak bandwidth.
CUDA Runtime Version: 12010
CUDA Driver Version: 12010
Driver Version: 530.41.03
Device 0: NVIDIA GeForce RTX 4090
Device 1: NVIDIA GeForce GTX 1070
Running host_to_device_memcpy_ce.
memcpy CE CPU(row) -> GPU(column) bandwidth (GB/s)
           0         1
 0      2.39     12.49
SUM host_to_device_memcpy_ce 14.87
Running device_to_host_memcpy_ce.
memcpy CE CPU(row) <- GPU(column) bandwidth (GB/s)
           0         1
 0      2.83     13.11
SUM device_to_host_memcpy_ce 15.93
Running host_to_device_bidirectional_memcpy_ce.
memcpy CE CPU(row) <-> GPU(column) bandwidth (GB/s)
           0         1
 0      1.61     10.74
SUM host_to_device_bidirectional_memcpy_ce 12.35
Running device_to_host_bidirectional_memcpy_ce.
memcpy CE CPU(row) <-> GPU(column) bandwidth (GB/s)
           0         1
 0      1.63     11.08
SUM device_to_host_bidirectional_memcpy_ce 12.70
Waiving device_to_device_memcpy_read_ce.
Waiving device_to_device_memcpy_write_ce.
Waiving device_to_device_bidirectional_memcpy_read_ce.
Waiving device_to_device_bidirectional_memcpy_write_ce.
Running all_to_host_memcpy_ce.
memcpy CE CPU(row) <- GPU(column) bandwidth (GB/s)
           0         1
 0      2.84     12.91
SUM all_to_host_memcpy_ce 15.74
Running all_to_host_bidirectional_memcpy_ce.
memcpy CE CPU(row) <- GPU(column) bandwidth (GB/s)
           0         1
 0      1.63     10.19
SUM all_to_host_bidirectional_memcpy_ce 11.82
Running host_to_all_memcpy_ce.
memcpy CE CPU(row) -> GPU(column) bandwidth (GB/s)
           0         1
 0      2.39     12.46
SUM host_to_all_memcpy_ce 14.84
Running host_to_all_bidirectional_memcpy_ce.
memcpy CE CPU(row) <- GPU(column) bandwidth (GB/s)
           0         1
 0      1.62      9.94
SUM host_to_all_bidirectional_memcpy_ce 11.56
Waiving all_to_one_write_ce.
Waiving all_to_one_read_ce.
Waiving one_to_all_write_ce.
Waiving one_to_all_read_ce.
Running host_to_device_memcpy_sm.
memcpy SM CPU(row) -> GPU(column) bandwidth (GB/s)
           0         1
 0      2.41     11.82
SUM host_to_device_memcpy_sm 14.23
Running device_to_host_memcpy_sm.
memcpy SM CPU(row) <- GPU(column) bandwidth (GB/s)
           0         1
 0      2.85     13.11
SUM device_to_host_memcpy_sm 15.96
Waiving device_to_device_memcpy_read_sm.
Waiving device_to_device_memcpy_write_sm.
Waiving device_to_device_bidirectional_memcpy_read_sm.
Waiving device_to_device_bidirectional_memcpy_write_sm.
Running all_to_host_memcpy_sm.
memcpy SM CPU(row) <- GPU(column) bandwidth (GB/s)
           0         1
 0      2.84     12.52
SUM all_to_host_memcpy_sm 15.36
Running all_to_host_bidirectional_memcpy_sm.
memcpy SM CPU(row) <- GPU(column) bandwidth (GB/s)
           0         1
 0      1.73     12.89
SUM all_to_host_bidirectional_memcpy_sm 14.62
Running host_to_all_memcpy_sm.
memcpy SM CPU(row) -> GPU(column) bandwidth (GB/s)
           0         1
 0      2.39     12.04
SUM host_to_all_memcpy_sm 14.43
Running host_to_all_bidirectional_memcpy_sm.
memcpy SM CPU(row) -> GPU(column) bandwidth (GB/s)
           0         1
 0      1.66     12.04
SUM host_to_all_bidirectional_memcpy_sm 13.69
Waiving all_to_one_write_sm.
Waiving all_to_one_read_sm.
Waiving one_to_all_write_sm.
Waiving one_to_all_read_sm.
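
The comparison these logs are after can be worked through with a little arithmetic. The Python sketch below is only an illustration. It copies the unidirectional figures from the nvbandwidth run and the 40 Gb/s link rate from boltctl. It assumes that column 0 (Device 0, the RTX 4090) is the card in the Thunderbolt enclosure and that column 1 (the GTX 1070) is the laptop's internal GPU. The names LINK_GBIT_PER_S, MEASURED_GB_PER_S and summarize are hypothetical helpers, not part of nvbandwidth or boltctl.

```python
# Raw Thunderbolt link rate reported by boltctl ("40 Gb/s = 2 lanes * 20 Gb/s").
LINK_GBIT_PER_S = 40.0

# Unidirectional results copied from the nvbandwidth output, in GB/s.
# Assumption: column 0 is the RTX 4090 in the eGPU enclosure,
# column 1 is the internal GTX 1070.
MEASURED_GB_PER_S = {
    "host_to_device_memcpy_ce": (2.39, 12.49),
    "device_to_host_memcpy_ce": (2.83, 13.11),
    "host_to_device_memcpy_sm": (2.41, 11.82),
    "device_to_host_memcpy_sm": (2.85, 13.11),
}


def summarize(results, link_gbit_per_s):
    """Relate each eGPU figure to the link rate and to the internal GPU."""
    link_gbyte_per_s = link_gbit_per_s / 8.0  # 8 bits per byte
    rows = []
    for test, (egpu, internal) in results.items():
        rows.append({
            "test": test,
            # Fraction of the raw link rate the eGPU transfer reached.
            "egpu_share_of_link": egpu / link_gbyte_per_s,
            # How many times faster the internal GPU transfer was.
            "internal_over_egpu": internal / egpu,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(MEASURED_GB_PER_S, LINK_GBIT_PER_S):
        print(f"{row['test']:<28} "
              f"{row['egpu_share_of_link']:6.1%} of link, "
              f"internal GPU {row['internal_over_egpu']:.1f}x faster")
```

For host_to_device_memcpy_ce this works out to about 48% of the raw 40 Gb/s link rate, with the internal GPU about 5.2 times faster. The 40 Gb/s figure is the raw Thunderbolt link rate. PCIe traffic tunnelled over Thunderbolt 3 gets less than that, so these shares understate how close the eGPU is to its practical ceiling.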