Running any intensive job immediately throttles the CPU to 400MHz! I found this:
https://www.reddit.com/r/thinkpad/comments/pvb87e/thinkpad_p14s_gen_2_intel_fix_for_aggressive_cpu/
which sent me to https://forums.lenovo.com/t5/Other-Linux-Discussions/X1C6-T480s-low-cTDP-and-trip-temperature-in-Linux/m-p/4028489?page=40#5069052 which explained there are actual Fn keys for this!
Can you check something for me. Do:
cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
And then try FN+L, FN+M, FN+H with the command above in between each one. These function keys should be switching between low, medium and high mode and you should see the power limit shift with each key press.
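The raw value is in microwatts, which is awkward to compare by eye between key presses. A minimal sketch that prints it in watts; rapl_limit_watts is a made-up helper name, and the path in the usage comment is the one from the command above.

```sh
# Print a RAPL power limit in watts. The kernel exposes it in microwatts.
# rapl_limit_watts is a hypothetical helper name, not an existing tool.
rapl_limit_watts() {
    # $1: path to a constraint_*_power_limit_uw file
    LC_ALL=C awk '{ printf "%.1f W\n", $1 / 1000000 }' "$1"
}

# Usage (run once after each of Fn+L, Fn+M, Fn+H):
# rapl_limit_watts '/sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw'
```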
FN+H seems to have helped significantly!
Need to go through https://www.reddit.com/r/thinkpad/wiki/os/linux/
Try disabling Lenovo Intelligent Thermal Solution Service in the BIOS. No such thing there, seems to be Windows shit.
OK, found this: https://forums.lenovo.com/topic/findpost/1306/5087833/5530806
The good news is that, at least on Linux, the thinkpad_acpi driver allows you to set the fan level and the intel_rapl driver allows you to set a reasonable power budget (the data sheet for my CPU calls for 12 to 28 watts depending on how much performance you want).
I put a bit more background here https://github.com/daniel-kristjansson/smart-fancontrol/blob/main/README.md
The readme has a lot of info, but suggests using thermald (the thermal daemon) instead. Quoting the readme:
Ubuntu actually ships with a daemon that manages the power budget! It's the thermal-daemon mentioned in the alternatives section above. Unfortunately, it disables itself when the thinkpad_acpi driver is loaded. You can make sure it does its job by adding --ignore-cpuid-check to the ExecStart line in the systemd /lib/systemd/system/thermald.service file. This will prevent the catastrophic level of throttling that happens when nothing is managing the power budget.
So:
sudo pacman -S thermald
Then edit /lib/systemd/system/thermald.service and change the ExecStart command to:
ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive
sudo systemctl enable thermald.service
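Editing the packaged unit under /lib/systemd/system works, but a package update can overwrite it. systemd also reads drop-in files from /etc/systemd/system/<unit>.d/, and an empty ExecStart= line in a drop-in clears the packaged command before the new one is set. A sketch of that alternative, which is not what was done in these notes; write_thermald_override is a hypothetical helper name and the flags are the ones from the ExecStart line above.

```sh
# Write a drop-in that replaces thermald's ExecStart without touching the
# packaged unit file. write_thermald_override is a hypothetical helper.
write_thermald_override() {
    # $1: drop-in directory, normally /etc/systemd/system/thermald.service.d
    # remaining arguments: flags to pass to thermald
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    {
        echo '[Service]'
        echo 'ExecStart='    # empty assignment clears the packaged command
        echo "ExecStart=/usr/bin/thermald $*"
    } > "$dir/override.conf"
}

# Usage (as root), followed by the usual reload and restart:
# write_thermald_override /etc/systemd/system/thermald.service.d --systemd --dbus-enable --adaptive
# systemctl daemon-reload
# systemctl restart thermald.service
```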
Holy fuck! I am now in a GMeet, with CPUs at slightly > 1000 MHz! No idea what will happen after restarting, but so far so good!
Not that good. 1200MHz is too slow for meet and anything else. Better than 400MHz, yes, but still way too slow to be usable. However, I just tried
sudo systemctl restart thermald.service
And then:
[root@oregano ~]# sudo systemctl status thermald.service
○ thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
Active: inactive (dead) since Fri 2022-07-22 15:17:35 BST; 4s ago
Duration: 3ms
Process: 3132908 ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive (code=exited, status=0/SUCCESS)
Main PID: 3132908 (code=exited, status=0/SUCCESS)
CPU: 9ms
Jul 22 15:17:35 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 15:17:35 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 15:17:35 oregano thermald[3132908]: 27 CPUID levels; family:model:stepping 0x6:8c:1 (6:140:1)
Jul 22 15:17:35 oregano thermald[3132908]: [/sys/devices/platform/thinkpad_acpi/dytc_lapmode] present: Thermald can't run on this p>
Jul 22 15:17:35 oregano thermald[3132908]: Unsupported cpu model or platform
Jul 22 15:17:35 oregano systemd[1]: thermald.service: Deactivated successfully.
It isn't active, and suddenly my CPUs are at 2500MHz! WTF!?!?!?
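The status output has the explanation in it: thermald sees dytc_lapmode, decides the platform is unsupported, and exits cleanly, which is why systemd reports status=0/SUCCESS for a daemon that is not running. A small sketch for spotting that case in the journal; thermald_exit_reason is a hypothetical helper name, and the strings it looks for are copied from thermald log lines in these notes.

```sh
# Classify thermald journal output read from stdin.
# thermald_exit_reason is a hypothetical helper, not part of thermald.
thermald_exit_reason() {
    log=$(cat)
    case $log in
        *"Unsupported cpu model or platform"*)
            echo "refused to run: platform check failed" ;;
        *"NO RAPL sysfs present"*)
            echo "running without RAPL" ;;
        *)
            echo "no known problem in log" ;;
    esac
}

# Usage:
# journalctl -b -u thermald.service --no-pager | thermald_exit_reason
```

The journal for a whole boot includes every earlier start of the service, so after a few restarts the first matching case may come from an old attempt.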
Trying to open two meet instances, one in chromium one in brave (talking to myself again). Was working perfectly, at 2.5GHz for a while, then I pressed Fn+L and checked that /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw was back down to 6000000 and CPU down to 1.2GHz, but Fn+H doesn't bring it back up. I tried
# echo 35000000 > /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw ; cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
Which worked to bring the value up, but the CPU speed stayed constant and everything is slow again. Reboot and come back.
After reboot, output of systemctl status thermald.service is the same as above (deactivated), and CPUs are now at 400MHz again.
# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
8000000
Try Fn+H: worked, we're back to 1.1GHz and
# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
10000000
Still clearly capped at 1.2 though.
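The clock speeds quoted in these notes can be read straight from sysfs: each core has a cpufreq/scaling_cur_freq file holding its current frequency in kHz. A minimal sketch that prints the fastest core; max_freq_mhz is a hypothetical helper name.

```sh
# Print the highest current core frequency in MHz.
# max_freq_mhz is a hypothetical helper; scaling_cur_freq values are in kHz.
max_freq_mhz() {
    # arguments: one or more scaling_cur_freq files
    cat "$@" | LC_ALL=C awk '$1 > max { max = $1 } END { printf "%d MHz\n", max / 1000 }'
}

# Usage:
# max_freq_mhz /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
```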
Found https://forums.lenovo.com/t5/Linux-Discussion/T480s-low-cTDP-and-trip-temperature-in-Linux/td-p/4028489?page=46 which suggests a different set of parameters in /lib/systemd/system/thermald.service:
If you pass "--ignore-cpuid-check" to thermald, then it should still run on those platforms. Disabling the check currently does not work in adaptive mode. Probably these people don't want "--adaptive" anyway, but it likely makes sense to explicitly remove "--adaptive" and add "--ignore-cpuid-check" at the same time.
Yeah, that makes it work, but no dice: back to capping, at 900MHz this time and Fn+H makes no difference (I also tried toggling with Fn+L and Fn+M and then Fn+H). Governor is set to powersave though:
# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
[root@oregano ~]#
Try changing that:
[root@oregano ~]# echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
[root@oregano ~]# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
[root@oregano ~]# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
10000000
Nope, still slow. We're now back to 1.2GHz which is better but still crap. I also tried forcing 35000000 which is what I had seen in /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw when it was working well, but no dice:
# echo 35000000 > /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw ; cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
35000000
# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
35000000
Showing the right value, but CPUs capped at 1.2GHz. Try changing the thermald service file to have adaptive as well:
ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check --adaptive
Then
systemctl daemon-reload
systemctl restart thermald.service ; systemctl status thermald.service
Loaded but capped. Removed the --ignore-cpuid-check, left --adaptive, restarted and now it doesn't load BUT we're back at decent speeds again, so WTF!? I'm now at >2.5GHz. Do I need to remove thermald completely or something?
# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
35000000
I will try a full shutdown and then turn on again and see where I'm at.
I also re-enabled the intel power whatever service and set it to max performance but I'm now back to 400MHz.
Back to 1200 and:
cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
8000000
Fn+H takes me to
cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw
10000000
And 1000MHz. Thermald is still failing:
○ thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
Active: inactive (dead) since Fri 2022-07-22 16:32:46 BST; 3min 9s ago
Duration: 4ms
Process: 454 ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive (code=exited, status=0/SUCCESS)
Main PID: 454 (code=exited, status=0/SUCCESS)
CPU: 17ms
Jul 22 16:32:46 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 16:32:46 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 16:32:46 oregano thermald[454]: NO RAPL sysfs present
Jul 22 16:32:46 oregano thermald[454]: 27 CPUID levels; family:model:stepping 0x6:8c:1 (6:140:1)
Jul 22 16:32:46 oregano thermald[454]: [/sys/devices/platform/thinkpad_acpi/dytc_lapmode] present: Thermald can't run on this platform
Jul 22 16:32:46 oregano thermald[454]: Unsupported cpu model or platform
Jul 22 16:32:46 oregano systemd[1]: thermald.service: Deactivated successfully.
But governor is back to powersave:
# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
Change it:
# echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
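With eight cores the governor listing gets long. A sketch that collapses it to one line per distinct governor with a count; governor_summary is a hypothetical helper name.

```sh
# Summarise governors as "<count> <governor>", one line per distinct value.
# governor_summary is a hypothetical helper name.
governor_summary() {
    # arguments: one or more scaling_governor files
    cat "$@" | sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage:
# governor_summary /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
```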
Still capped... Try the thermald restart thing again: nope, restarting it made no change. Try re-adding the --ignore-cpuid-check:
ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check --adaptive
Sigh. No, still capped after systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service. Go back to:
ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive
then
systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service
And there we go, this works. I'm now at my full 2.8GHz. And if I remove --adaptive and use --ignore-cpuid-check instead?
ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check
and:
# systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service
● thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
Active: active (running) since Fri 2022-07-22 16:56:59 BST; 7ms ago
Main PID: 24393 (thermald)
Tasks: 2 (limit: 38139)
Memory: 3.3M
CPU: 11ms
CGroup: /system.slice/thermald.service
└─24393 /usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check
Jul 22 16:56:59 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 16:56:59 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 16:56:59 oregano thermald[24393]: sensor id 14 : No temp sysfs for reading raw temp
Jul 22 16:56:59 oregano thermald[24393]: sensor id 14 : No temp sysfs for reading raw temp
Jul 22 16:56:59 oregano thermald[24393]: sensor id 14 : No temp sysfs for reading raw temp
Jul 22 16:56:59 oregano thermald[24393]: Config file /etc/thermald/thermal-conf.xml does not exist
Jul 22 16:56:59 oregano thermald[24393]: Config file /etc/thermald/thermal-conf.xml does not exist
Still good. So how do I automate this? Will try another hard reboot (shutdown, restart) and see what's up.
This time, I saw a "FAILED to start simple and lightweight fan manager" message, thinkfan seems to not have loaded. Indeed, fans are very loud, but still throttled at 1.1-1.2 GHz.
# systemctl status thinkfan.service
× thinkfan.service - simple and lightweight fan control program
Loaded: loaded (/usr/lib/systemd/system/thinkfan.service; enabled; preset: disabled)
Drop-In: /etc/systemd/system/thinkfan.service.d
└─override.conf
Active: failed (Result: exit-code) since Fri 2022-07-22 17:25:58 BST; 2min 42s ago
Process: 463 ExecStart=/usr/bin/thinkfan $THINKFAN_ARGS (code=exited, status=1/FAILURE)
CPU: 18ms
Jul 22 17:25:58 oregano systemd[1]: Starting simple and lightweight fan control program...
Jul 22 17:25:58 oregano thinkfan[463]: ERROR: /etc/thinkfan.conf:11:
name: thinkpad
^
Could not find a hwmon with this name.
Jul 22 17:25:58 oregano systemd[1]: thinkfan.service: Control process exited, code=exited, status=1/FAILURE
Jul 22 17:25:58 oregano systemd[1]: thinkfan.service: Failed with result 'exit-code'.
Jul 22 17:25:58 oregano systemd[1]: Failed to start simple and lightweight fan control program.
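thinkfan's complaint, "Could not find a hwmon with this name", can be checked by hand: every device under /sys/class/hwmon has a name file. A minimal sketch; hwmon_has_name is a hypothetical helper name.

```sh
# Succeed if some hwmon device under the given root reports the given name.
# hwmon_has_name is a hypothetical helper, not part of thinkfan.
hwmon_has_name() {
    # $1: hwmon root, normally /sys/class/hwmon; $2: name to look for
    for f in "$1"/*/name; do
        [ -r "$f" ] || continue
        [ "$(cat "$f")" = "$2" ] && return 0
    done
    return 1
}

# Usage:
# hwmon_has_name /sys/class/hwmon thinkpad && echo found || echo missing
```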
That looks like it's because of this section in /etc/thinkfan.conf:
# Chassis
  - hwmon: /sys/class/hwmon
    name: thinkpad
    indices: [3, 5, 6, 7]
I will comment that out and restart thinkfan. Yep, that worked. Whatevs, still throttled, I'm back to powersave:
# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
Fn+H made no difference to that. So I did:
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
Which worked. Still throttled, but now I have the right governor.
systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service
Still throttled. It looks like I really need to edit the damn config file every time. Try sed:
sed -i 's/ *--ignore-cpuid-check//' /lib/systemd/system/thermald.service; sed -i '/ExecStart=/s/$/ --ignore-cpuid-check/' /lib/systemd/system/thermald.service; systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service
Nope. Fn+H now? Nope. Remove --ignore-cpuid-check with emacs and then reload again? No change: thermald working, still throttled. Put it back? Same. Fn+H again?
Tried some more combinations of editing and eventually I got rid of the throttle again with:
ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive
And thermald off.
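Swapping flags by hand invites typos. A sketch that rewrites the whole ExecStart line in one step, so the result does not depend on what was there before; set_thermald_flags is a hypothetical helper name, and the binary path is the one from the unit file in these notes.

```sh
# Replace the ExecStart line of a unit file with a fresh thermald command.
# set_thermald_flags is a hypothetical helper name.
set_thermald_flags() {
    # $1: unit file; remaining arguments: flags for thermald
    unit=$1
    shift
    tmp=$(mktemp) || return 1
    sed "s|^ExecStart=.*|ExecStart=/usr/bin/thermald $*|" "$unit" > "$tmp" &&
        cat "$tmp" > "$unit"    # write back in place, keeping owner and mode
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (as root):
# set_thermald_flags /lib/systemd/system/thermald.service --systemd --dbus-enable --adaptive
# systemctl daemon-reload; systemctl restart thermald.service
```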
OK, try disabling thermald and rebooting.
# systemctl disable thermald.service
Removed "/etc/systemd/system/multi-user.target.wants/thermald.service".
Removed "/etc/systemd/system/dbus-org.freedesktop.thermald.service".
# shutdown -h now
OK, rebooted straight to 1.2GHz capping now. Governor back to powersave. Changing to performance
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
still capped.
systemctl status thermald.service
○ thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; disabled; prese>
Active: inactive (dead)
As soon as I opened chromium, I got capped to 400MHz. Fn+H raised it to 900MHz.
Re-enable thermald:
# grep ExecStart /lib/systemd/system/thermald.service
ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive
systemctl enable thermald.service
systemctl start thermald.service
Status is back to "[/sys/devices/platform/thinkpad_acpi/dytc_lapmode] present: Thermald can't run", so swap --adaptive for the --ignore-cpuid-check flag:
ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check
then
systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service
Started, but still throttled. Put back to
ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive
Still throttled, thermald not on.
ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check
Nope, still throttled. Played around a bit more and in the end this worked:
ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive --ignore-cpuid-check
With thermald running:
# systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service
● thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
Active: active (running) since Fri 2022-07-22 18:10:52 BST; 6ms ago
Main PID: 16468 (thermald)
Tasks: 2 (limit: 38139)
Memory: 3.2M
CPU: 7ms
CGroup: /system.slice/thermald.service
└─16468 /usr/bin/thermald --systemd --dbus-enable --adaptive --ignore-cpuid-check
Jul 22 18:10:52 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 18:10:52 oregano systemd[1]: Started Thermal Daemon Service.
So enable the service, hard reboot (shutdown, then restart). Back to throttling, bloody hell.
[root@oregano ~]# systemctl status thermald.service
● thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset>
Active: active (running) since Fri 2022-07-22 18:13:15 BST; 1min 17s ago
Main PID: 409 (thermald)
Tasks: 4 (limit: 38139)
Memory: 6.2M
CPU: 135ms
CGroup: /system.slice/thermald.service
└─409 /usr/bin/thermald --systemd --dbus-enable --adaptive --igno>
Jul 22 18:13:15 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 18:13:15 oregano thermald[409]: NO RAPL sysfs present
Jul 22 18:13:15 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 18:13:15 oregano thermald[409]: sensor id 12 : No temp sysfs for readin>
Jul 22 18:13:15 oregano thermald[409]: sensor id 12 : No temp sysfs for readin>
Jul 22 18:13:15 oregano thermald[409]: sensor id 12 : No temp sysfs for readin>
Jul 22 18:13:15 oregano thermald[409]: Polling mode is enabled: 4
NO RAPL sysfs present? What if I just restart it a few times? Yes! Restarting fixed it without modifying config files!
● thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disab>
Active: active (running) since Fri 2022-07-22 18:15:38 BST; 19s ago
Main PID: 4890 (thermald)
Tasks: 4 (limit: 38139)
Memory: 3.6M
CPU: 32ms
CGroup: /system.slice/thermald.service
└─4890 /usr/bin/thermald --systemd --dbus-enable --adaptive --ignore-cpu>
Jul 22 18:15:38 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 18:15:38 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 18:15:39 oregano thermald[4890]: sensor id 14 : No temp sysfs for reading raw >
Jul 22 18:15:39 oregano thermald[4890]: sensor id 14 : No temp sysfs for reading raw >
Jul 22 18:15:39 oregano thermald[4890]: sensor id 14 : No temp sysfs for reading raw >
Jul 22 18:15:39 oregano thermald[4890]: Polling mode is enabled: 4
I'm now above 2GHz even with the powersave governor!
Last time, reboot and then just restart the service and see if that fixes everything again.
YESH!
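The boot-time log says "NO RAPL sysfs present", while the log after a manual restart does not. One possible reading, an assumption rather than something these notes confirm, is that thermald starts before the RAPL powercap directory exists. A sketch that waits for a path to appear before restarting; wait_for_path is a hypothetical helper name, and the path in the usage comment is the parent of the power limit file used throughout these notes, which may not be the one thermald itself checks.

```sh
# Wait until a path exists, giving up after a timeout.
# wait_for_path is a hypothetical helper name.
wait_for_path() {
    # $1: path to wait for; $2: timeout in seconds
    waited=0
    while [ ! -e "$1" ]; do
        [ "$waited" -ge "$2" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage (as root):
# wait_for_path /sys/devices/virtual/powercap/intel-rapl-mmio 60 && systemctl restart thermald.service
```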
OK, try adding a systemd timer to do this a minute after each reboot so I don't need to restart manually:
https://wiki.archlinux.org/title/Systemd/Timers#Timer_units
emacs /etc/systemd/system/thermaldRestart.timer
Add:
[Unit]
Description=Restart the thermald service to make it work (see ~terdon/README.install)
[Timer]
OnBootSec=1min
[Install]
WantedBy=timers.target
Then /etc/systemd/system/thermaldRestart.service:
[Unit]
Description=Restart Thermal Daemon Service
ConditionVirtualization=no
[Service]
Type=oneshot
ExecStart=/sbin/systemctl restart thermald.service
And:
$ systemctl enable thermaldRestart.timer
Created symlink /etc/systemd/system/timers.target.wants/thermaldRestart.timer → /etc/systemd/system/thermaldRestart.timer.
Try rebooting and check if it works. YESH! Works!
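To confirm the timer really fired, the journal for the current boot should show thermald being started twice: once at boot and once by thermaldRestart.service. A sketch; count_thermald_starts is a hypothetical helper name, and the message it counts is the "Started Thermal Daemon Service." line from the logs in these notes.

```sh
# Count how many times systemd reported thermald as started.
# count_thermald_starts is a hypothetical helper name.
count_thermald_starts() {
    # reads journal lines on stdin, prints the number of start messages;
    # grep -c exits non-zero on zero matches, so mask that
    grep -c 'Started Thermal Daemon Service' || true
}

# Usage:
# journalctl -b -u thermald.service --no-pager | count_thermald_starts
# systemctl list-timers --all thermaldRestart.timer
```

A count of 2 a couple of minutes after boot would mean the restart happened; a count of 1 would mean the timer has not fired.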