Lenovo messed up with the X1E and P1 Gen 1 (and maybe later generations) in that the system boots with a thermal limit (aka Tjunction or TjMax) set to 82C (some report 80C). What this means is that regardless of power draw or undervolting settings, when your CPU hits 82C, it will drop the frequency down to the "Configurable TDP-down" frequency, or even lower. It may also limit the system power draw.
First, note that I have already replaced the thermal paste on my P1's CPU and GPU with Noctua NT-H2 thermal compound (affiliate link). This immediately made a very noticeable difference in idle temps: the laptop stayed cool on my lap, and the keyboard no longer got hot to the touch.
For stress testing under Linux, I used the s-tui application to dig into the details for all testing below.
The fix is really two steps:
- Set the Tjunction higher, say, 3C under your CPU's rated Tjunction value.
- Undervolt the CPU, Cache, Uncore, and iGPU to maximize your performance.
Lenovo released a software update that effectively sets the Tjunction back up to 97C. However, this is only for Windows, and there are many posts reporting that Hyper-V negates the setting. I am not sure, but perhaps Lenovo has fixed this with newer drivers since others reported it back in Q1 2018.
For Linux, we are left to fend for ourselves. Therefore, here's how to verify your system is affected, and how to fix it.
There are two different ways to do this.
You can install the msr-tools utility.
sudo apt install msr-tools
sudo modprobe msr
Then, read the offset field (bits 29:24 of MSR 0x1a2) and print it as a decimal number:
$ sudo rdmsr --bitfield 29:24 -d 0x1a2
18
This means your system is set to 18C under your Tjunction max, which for my Xeon E-2176M is 100C. So, that would be 100 - 18, which is 82C max.
The other way is the undervolt utility. Current install instructions are on its GitHub page:
https://github.com/georgewhewell/undervolt
But in short, install it via pip as root (I know, not very Pythonic, but it needs root to access the CPU's MSRs).
sudo pip install undervolt
Now, you can read the Tjunction directly (called temperature target):
$ sudo undervolt --read
temperature target: -18 (82C)
core: 0.0 mV
gpu: 0.0 mV
cache: 0.0 mV
uncore: 0.0 mV
analogio: 0.0 mV
powerlimit: 78.0W (short: 0.00244140625s - enabled) / 45.0W (long: 96.0s - enabled)
As you can see, mine is set to 82C.
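If you would rather not eyeball that output, the check can be scripted. This is a minimal sketch, assuming the "temperature target" line keeps the format shown above; the function names and the 5C default gap are made up for illustration and are not part of undervolt.

```shell
# Extract the throttle temperature (in C) from `undervolt --read` output on stdin.
# Assumes the line format shown above: "temperature target: -18 (82C)".
temp_limit_from_read() {
    sed -n 's/^temperature target: *-\{0,1\}[0-9]* *(\([0-9]*\)C).*/\1/p'
}

# Report whether the limit sits well below TjMax.
# $1 = TjMax in C (from Intel Ark), $2 = limit in C,
# $3 = tolerated gap in C (default 5, an arbitrary illustrative value).
check_temp_limit() {
    tjmax=$1; limit=$2; gap=${3:-5}
    if [ $((tjmax - limit)) -gt "$gap" ]; then
        echo "affected: throttles at ${limit}C, $((tjmax - limit))C below TjMax"
    else
        echo "ok: throttles at ${limit}C"
    fi
}

# Usage on a real machine (needs root):
#   limit=$(sudo undervolt --read | temp_limit_from_read)
#   check_temp_limit 100 "$limit"
```

With the output above and a TjMax of 100C, this would report the system as affected.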
Go look up your CPU on Intel's Ark site and find its Tjunction value. My E-2176M has a max of 100C. You do NOT want to hit this 100C, ever! So we are going to set it to 97C instead, to leave a little headroom, as CPU temps sometimes spike 1C or 2C higher than your target temp while waiting on the fans to ramp up. If you do hit your Tjunction max, your system will shut down out of safety.
Armed with the target temp, mine being 97C, we can use the undervolt utility listed under the Verify section above.
sudo undervolt --temp 97
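The 97 is just the Ark Tjunction value minus the headroom. For a different CPU, the arithmetic can be sketched like this; `temp_target_cmd` is a hypothetical helper that only prints the command, it does not apply anything.

```shell
# Print the undervolt command for a target a few degrees under TjMax.
# $1 = TjMax in C (from Intel Ark), $2 = headroom in C (default 3, as in this post).
temp_target_cmd() {
    tjmax=$1; headroom=${2:-3}
    if [ "$headroom" -lt 1 ]; then
        echo "refusing: keep at least 1C of headroom under TjMax" >&2
        return 1
    fi
    echo "sudo undervolt --temp $((tjmax - headroom))"
}

temp_target_cmd 100    # E-2176M: prints "sudo undervolt --temp 97"
```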
We can check it now:
$ sudo undervolt --read
temperature target: -3 (97C)
core: 0.0 mV
gpu: 0.0 mV
cache: 0.0 mV
uncore: 0.0 mV
analogio: 0.0 mV
powerlimit: 78.0W (short: 0.00244140625s - enabled) / 45.0W (long: 96.0s - enabled)
Now that my CPU ramps up to 97C, I went from 2700MHz to 3400MHz across all cores! However, this is still a far cry from its rated 4.4GHz turbo. And it only lasts about 10 seconds before it throttles quickly down to 1500MHz, then climbs back up to 3400MHz again. The reason is that the CPU is running at full voltage, which runs hot. Intel processors run with more voltage than they need, to account for unstable or inaccurate system voltage regulation.
To address this, I used undervolt to find a safe setting for undervolting.
Here are the settings I found to be stable for the E-2176M:
sudo undervolt --temp 97 --core -150 --cache -150 --gpu -100 --uncore -100
And checking it's all set correctly:
$ sudo undervolt --read
temperature target: -3 (97C)
core: -150.39 mV
gpu: -99.61 mV
cache: -150.39 mV
uncore: -99.61 mV
analogio: 0.0 mV
powerlimit: 78.0W (short: 0.00244140625s - enabled) / 45.0W (long: 96.0s - enabled)
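Note that the values read back (-150.39 and -99.61) are not exactly what was requested. One plausible explanation, which is my assumption and not something stated in this post, is that the offset is stored in whole steps of 1/1.024 mV, so the request is rounded to the nearest step. A small sketch of that arithmetic:

```shell
# Round a requested offset (mV) to the nearest step of 1/1.024 mV and print
# the value that would be read back. The step size is an assumption.
quantized_offset() {
    awk -v mv="$1" 'BEGIN {
        steps = sprintf("%.0f", mv * 1.024)   # nearest whole step
        printf "%.2f\n", steps / 1.024        # back to mV, two decimals
    }'
}

quantized_offset -150
quantized_offset -100
```

Under that assumption the two calls print -150.39 and -99.61, matching the read-back above, so the small differences are nothing to worry about.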
With these settings, I am connected to two Thunderbolt 3 docking stations, three 1080p monitors, and five external USB accessories, with the Brave browser open with about 29 tabs and a couple of terminals on Pop!_OS.
I ran the s-tui stress test for about 3 hours straight, while using the Brave browser to watch YouTube and do some general surfing. Zero issues.
All cores now hover around 3900MHz to 4000MHz and 35W of usage, much closer to that turbo of 4.4GHz. It would still drop after a minute or two, but it only drops to 2200MHz or 2400MHz now, which is much better than the previous low.
Your mileage may vary. Adjust the voltages 20mV at a time.
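To keep track of the 20mV steps, you can print the candidate commands up front and work through them one at a time. This is only a sketch: `undervolt_steps` is a hypothetical helper, stepping core and cache together is my assumption (they are equal in my settings above), and nothing here is applied automatically.

```shell
# Print one candidate command per step, walking core/cache down in fixed steps.
# $1 = step in mV, $2 = deepest offset to try in mV (both given as positive numbers).
# Nothing is applied; run each line by hand and stress test before moving on.
undervolt_steps() {
    step=$1; floor=$2; mv=$step
    while [ "$mv" -le "$floor" ]; do
        echo "sudo undervolt --core -$mv --cache -$mv"
        mv=$((mv + step))
    done
}

undervolt_steps 20 100
```

This prints five commands, from -20 down to -100. If the system freezes or crashes at a step, go back to the last offset that survived a stress test.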
You'll want to read up on Undervolt's GitHub site for how to persist it with a systemd service. While I do use it, and my undervolting remains, my max temp isn't sticking yet across all reboots. It's hit or miss, most likely a race condition with another service on startup. I'll set up the timer as described in the Undervolt instructions later.
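For orientation, here is a rough sketch of what a service plus timer could look like. It is not copied from the Undervolt instructions, so treat the unit contents, the timer intervals, and the /usr/local/bin/undervolt path as assumptions and check them against the GitHub page. The functions only print the unit text.

```shell
# Print a oneshot service that re-applies the settings from this post.
# $1 = path to the undervolt executable (find yours with: command -v undervolt).
undervolt_service() {
    cat <<EOF
[Unit]
Description=Apply undervolt and temperature target

[Service]
Type=oneshot
ExecStart=$1 --temp 97 --core -150 --cache -150 --gpu -100 --uncore -100
EOF
}

# Print a timer that runs the service after boot and then periodically, so a
# late-starting service that resets the temperature target gets overridden.
undervolt_timer() {
    cat <<EOF
[Unit]
Description=Re-apply undervolt settings

[Timer]
OnBootSec=2min
OnUnitActiveSec=30s

[Install]
WantedBy=timers.target
EOF
}

undervolt_service /usr/local/bin/undervolt
```

The idea would be to save the two outputs as /etc/systemd/system/undervolt.service and /etc/systemd/system/undervolt.timer, then run `sudo systemctl enable --now undervolt.timer`. A timer that re-applies the settings periodically is one way to paper over a startup race like the one I'm seeing.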
Enjoy!
@eduncan911 I have already disabled hyperthreading. So this is about different physical cores running at higher temps and throttling as a result. I also find the fans come on sooner than I'd expect with just one moderate application running, like Zoom. The more I read, the more I think that repasting might help. I found a pretty good guide https://imgur.com/a/Blvpjd0 with comments at https://www.reddit.com/r/thinkpad/comments/a14vi2/basic_repasting_guide_for_the_lenovo_extreme_x1/