Apparently Linux power management can be fairly aggressive, and CPU downclocking can have a serious effect on PyTorch jobs (even ones running on CUDA).
To check the current frequency-scaling governor of every CPU, run the snippet below. On my machine it shows that all cores are in powersave mode.
```shell
for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  [ -f "$CPUFREQ" ] || continue
  echo "$CPUFREQ" "$(cat "$CPUFREQ")"
done
```

To disable power saving, run:

```shell
sudo cpupower frequency-set -g performance
```

This speeds up the word_language_model PyTorch example (run with the --cuda flag) about 2x (116ms -> 57ms per batch). I'd expect the largest speedups on models with many small kernels that can't saturate the GPU.
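If you want a training script to warn when cores are still downclocked, a small stdlib-only check like the sketch below works. The function names and the `cpufreq_root` parameter are my own (the parameter exists mainly so the logic can be tested against a fake sysfs tree), not part of any library:

```python
import glob
import os


def read_governors(cpufreq_root="/sys/devices/system/cpu"):
    """Return {scaling_governor path: governor} for every CPU exposing cpufreq."""
    pattern = os.path.join(cpufreq_root, "cpu*", "cpufreq", "scaling_governor")
    governors = {}
    for path in glob.glob(pattern):
        with open(path) as f:
            governors[path] = f.read().strip()
    return governors


def warn_if_powersave(cpufreq_root="/sys/devices/system/cpu"):
    """Print a warning if any core is not in 'performance' mode; return the offenders."""
    lazy = [p for p, g in read_governors(cpufreq_root).items() if g != "performance"]
    if lazy:
        print(f"warning: {len(lazy)} CPU(s) not in performance mode")
    return lazy
```

Calling `warn_if_powersave()` at the start of a run makes it harder to accidentally benchmark against a downclocked CPU.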
Note that this may increase power consumption significantly, so you might want to revert to the powersave governor (`sudo cpupower frequency-set -g powersave`) once you finish training (on my machine, idle clock speeds rise from 1.2GHz to 2.4GHz after changing the mode).
References:
- Avoiding CPU Speed Scaling – Running CPU At Full Speed
- ArchLinux wiki
- Kernel docs - CPU governors
- Kernel docs - Intel P-State driver
Alternatively, if cpupower is not available, write the governor directly via sysfs (run as root, since the redirection needs write access to /sys):

```shell
for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  [ -f "$CPUFREQ" ] || continue
  echo -n performance > "$CPUFREQ"
done
```
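The same sysfs write can be done from Python, e.g. at the top of a (root-run) training script. This is my own sketch, not an established API; `set_governor` and its `cpufreq_root` parameter are hypothetical names, and the parameter again allows testing against a fake sysfs layout:

```python
import glob
import os


def set_governor(governor="performance", cpufreq_root="/sys/devices/system/cpu"):
    """Write `governor` into every scaling_governor file under cpufreq_root.

    On a real system this needs write permission on sysfs (i.e. root).
    Returns the list of files that were updated.
    """
    pattern = os.path.join(cpufreq_root, "cpu*", "cpufreq", "scaling_governor")
    updated = []
    for path in glob.glob(pattern):
        with open(path, "w") as f:
            f.write(governor)
        updated.append(path)
    return updated
```

Remember that, like the shell loop above, this does not persist across reboots.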