Apparently, Linux power management can be fairly aggressive, and CPU downclocking can have a serious effect on PyTorch jobs (even those running on CUDA). To check the current governor of every core, run the snippet below. On my machine it shows that all cores are in powersave mode.
```sh
for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  [ -f "$CPUFREQ" ] || continue
  echo "$CPUFREQ" "$(cat "$CPUFREQ")"
done
```
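If you want that check as a pre-flight step in a training launch script, it can be wrapped in a small helper that signals via its exit status. This is just a sketch of my own (the function name is made up), using the same sysfs path as the snippet above:

```sh
# Hypothetical pre-flight helper: returns 1 if any online core
# reports the "powersave" governor, 0 otherwise.
check_governors() {
  bad=0
  for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    [ -f "$CPUFREQ" ] || continue
    if [ "$(cat "$CPUFREQ")" = "powersave" ]; then
      echo "powersave: $CPUFREQ"
      bad=1
    fi
  done
  return $bad
}
```

A launcher could then do `check_governors || echo "warning: CPU is downclocked" >&2` before starting the job.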
To disable power saving, run the following:

```sh
sudo cpupower frequency-set -g performance
```
This speeds up the word_language_model PyTorch example (with the --cuda flag) by 2x (116 ms -> 57 ms per batch). I'd expect the largest speedups on models with lots of small kernels that can't saturate the GPU.
Note that this might increase power consumption significantly, so you might want to revert to the powersave governor once you finish training (on my machine, idle clock speeds rose from 1.2 GHz to 2.4 GHz after switching to performance).
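To see the effect on clock speeds yourself, you can read each core's current frequency from sysfs. A minimal sketch (the function name is my own; sysfs reports kHz, so the value is divided down to MHz):

```sh
# Print each core's current clock in MHz (scaling_cur_freq is in kHz).
print_core_clocks() {
  for F in /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq; do
    [ -f "$F" ] || continue
    echo "$F $(( $(cat "$F") / 1000 )) MHz"
  done
  return 0
}
print_core_clocks
```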
References:
- Avoiding CPU Speed Scaling – Running CPU At Full Speed
- ArchLinux wiki
- Kernel docs - CPU governors
- Kernel docs - Intel P-State driver
If cpupower isn't available, you can write the governor directly through sysfs instead. Note that this must run as root: with plain sudo, the redirection is performed by your unprivileged shell and will fail.

```sh
for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  [ -f "$CPUFREQ" ] || continue
  echo -n performance > "$CPUFREQ"
done
```