Non-Uniform Memory Access (NUMA) is a memory design used in multiprocessor systems in which memory access time depends on the memory's location relative to the processor. In a NUMA architecture, a processor accesses its local memory (the memory attached to that processor) faster than remote memory (memory attached to another processor). In other words, it is a way to keep memory access efficient when multiple processors share one motherboard: in a traditional shared-bus design, a processor doing heavy memory work monopolizes the bus and the other processors have to sit idle, so NUMA instead gives each processor its own memory region designated for local access, and each such pairing of processor and local memory is called a NUMA node.
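Whether a machine actually exposes more than one NUMA node can be checked up front; a quick look, assuming lscpu is available (it usually is on desktop Linux):
lscpu | grep -i numa
This prints the number of NUMA nodes and the CPU ranges belonging to each node; on a typical single-socket desktop there is only node 0.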
lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2060 12GB] (rev a1)
01:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)
The first line shows the PCI address of the VGA compatible device, the NVIDIA GeForce card, as 01:00.0. This address differs from system to system, so substitute your own value in the commands below.
If you look in /sys/bus/pci/devices/, you can see the following list:
ls /sys/bus/pci/devices/
0000:00:00.0 0000:00:06.0 0000:00:15.0 0000:00:1c.0 0000:00:1f.3 0000:00:1f.6 0000:02:00.0
0000:00:01.0 0000:00:14.0 0000:00:16.0 0000:00:1d.0 0000:00:1f.4 0000:01:00.0
0000:00:02.0 0000:00:14.2 0000:00:17.0 0000:00:1f.0 0000:00:1f.5 0000:01:00.1
The 01:00.0 address found above is visible here, but with a 0000: prefix (the PCI domain) attached in front.
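As an aside, lspci can print this domain prefix itself with the -D option, which makes it easier to match its output against the sysfs directory names; the command below should show the same two NVIDIA lines as before, each starting with 0000:01:00:
lspci -D | grep -i nvidia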
cat /sys/bus/pci/devices/0000\:01\:00.0/numa_node
-1
A value of -1 means the device is not assigned to any NUMA node; 0 means it belongs to NUMA node 0.
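Both PCI functions of the card (the VGA controller at 01:00.0 and its audio device at 01:00.1) expose this file, so they can be checked in one go with a small loop, assuming the addresses found above:
for f in /sys/bus/pci/devices/0000:01:00.?/numa_node; do echo "$f -> $(cat "$f")"; done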
echo 0 | sudo tee -a /sys/bus/pci/devices/0000\:01\:00.0/numa_node
0
It shows 0, which means the device is now assigned to NUMA node 0!
cat /sys/bus/pci/devices/0000\:01\:00.0/numa_node
0
That's it!
This helps only once: the file reads 0 right after the change, but it is not a persistent solution. After rebooting, TensorFlow still reports "successful NUMA node read from SysFS had negative value (-1), ...":
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2024-05-03 17:14:58.945528: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-03 17:15:00.791206: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-05-03 17:15:02.618108: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-05-03 17:15:03.056970: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-05-03 17:15:03.057654: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
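For what it's worth, the warning does not stop TensorFlow from listing the GPU; a small computation can be run to confirm the device is otherwise usable (a verification snippet along the lines of the TensorFlow install guide, not specific to the NUMA issue):
python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"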
Checking the numa_node value again shows that it has reverted:
cat /sys/bus/pci/devices/0000\:01\:00.0/numa_node
-1
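The only workaround I can think of so far is to re-apply the write on every boot, for example from root's crontab (just a sketch of what I am considering, not a verified fix; the device path has to match the GPU above):
@reboot echo 0 > /sys/bus/pci/devices/0000:01:00.0/numa_node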
Please help me solve this problem.
Thanks