Asus (formerly Intel) NUCs weren't designed for use as a 24/7 server, though they see enthusiastic use as such. Some users report instability. And while some of that can be bad RAM or a faulty motherboard / CPU, some can also just be configuration issues. And some is plain heat.
The cause can be a little tough to nail down. Here are some good practices to follow for a stable NUC.
Particularly if it's an older NUC, upgrade the BIOS. BIOS files are now hosted by Asus and may be a little hard to find for older models.
- Get your model number / SKU. For example, NUC10i7FNH
- On the ASUS download center, look for the Model search box and enter the model number.
- On the right side of the page the model number will show up, with options underneath. Click on "Drivers and Tools"
- On the next page select "Bios and Firmware"
User DagoDuck says this about NUC11 firmware: "For the TNTGL357 BIOS (NUC 11), if you're below version 0071, you first have to update to 0071 before being able to apply the latest update (newest version atm is 0077).
To obtain the file for version 0071, you have to modify the URL of the download link, as the older version isn't referenced anywhere on the ASUS website."
The current gen 14 Pro BIOS for example is here: https://www.asus.com/displays-desktops/nucs/nuc-mini-pcs/asus-nuc-14-pro/helpdesk_bios?model2Name=ASUS-NUC-14-Pro-Kit
The default power setting for NUCs is "broil". That means the fans run a lot, which can lead to dust buildup inside until airflow is completely blocked.
Cleaning this requires removing the board so the fan assembly underneath is accessible.
You can check heat by using smartctl. Install smartmontools:
sudo apt update && sudo apt install smartmontools
Then run sudo smartctl -x /dev/nvme0n1
and look for the temperature of the drive. You'd like to see it below 50 °C, certainly not above 60 °C.
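The check above can be sketched as a small shell helper. This is a minimal sketch, not a definitive tool: the function name `nvme_temp_status` is hypothetical, and it assumes the smartctl output for an NVMe drive contains a line of the form `Temperature: 38 Celsius`. The 50 and 60 degree thresholds are the ones given above.

```shell
# Hypothetical helper: reads smartctl output on stdin and classifies the
# drive temperature using the thresholds from the text.
# Assumption: the output has a line like "Temperature:   38 Celsius".
nvme_temp_status() {
    # Take the second field of the first line that starts with "Temperature:"
    temp=$(awk '/^Temperature:/ { print $2; exit }')
    if [ -z "$temp" ]; then
        echo "unknown"
        return 2
    fi
    if [ "$temp" -lt 50 ]; then
        echo "$temp ok"        # below 50: where you'd like it to be
    elif [ "$temp" -le 60 ]; then
        echo "$temp warm"      # 50 to 60: worth watching
    else
        echo "$temp hot"       # above 60: too hot
    fi
}
```

Usage, assuming the drive is /dev/nvme0n1 as in the command above: `sudo smartctl -x /dev/nvme0n1 | nvme_temp_status`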
Setting the power level / limit to something less aggressive is generally better for 24/7 running.
First, find the data sheet for your CPU, simply by googling the CPU model. The example in this section uses an Intel Core Ultra 5 125H.
In the data sheet, look for one of:
- Minimum Assured Power
- TDP down
- TDP low
- TDP min
Boot the NUC and press F2 to get into BIOS.
Under "Power", you'll find "Package Power Limit 1", or possibly simply "Power Level 1". Set that to the min value you found, which is specific to the CPU. For this specific CPU, 20W.
Then, set "Package Power Limit 2" to the nearest number that's 1.25x to 1.3x that. In this example, that'd be 26W.
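The arithmetic for "Package Power Limit 2" can be sketched in shell. The function name `pl2_bounds` is hypothetical; it only prints the 1.25x and 1.3x values for a given Power Limit 1, and you still pick the nearest whole number the BIOS accepts.

```shell
# Hypothetical helper: prints 1.25x and 1.3x of the given PL1 value (in watts).
pl2_bounds() {
    awk -v pl1="$1" 'BEGIN { printf "%.2f %.2f\n", pl1 * 1.25, pl1 * 1.3 }'
}
```

For the 20W example, `pl2_bounds 20` prints `25.00 26.00`, which matches the 26W chosen above.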
And that's it for controlling heat.
Some models of NVMe drives can cause the system to lock up when the drive enables power savings. This can be cured by changing the kernel startup parameters.
Open the GRUB configuration:
sudo nano /etc/default/grub
Find the line GRUB_CMDLINE_LINUX_DEFAULT and add the following to it, keeping what's already there:
nvme_core.default_ps_max_latency_us=0 pcie_aspm=off
Save the file with Ctrl-X, confirming with Y and Enter. Then apply the change and reboot:
sudo update-grub
sudo reboot
This keeps the drive from entering power-saving states by itself.
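After the reboot, it can be worth confirming that the parameters actually reached the kernel. The following is a minimal sketch with a hypothetical function name, `has_nvme_params`. It checks a kernel command line string for both parameters; the NVMe one belongs to the `nvme_core` module.

```shell
# Hypothetical helper: succeeds only if the given kernel command line
# contains both parameters as whole, space-separated words.
has_nvme_params() {
    cmdline=" $1 "   # pad with spaces so whole-word matching works at the ends
    case "$cmdline" in
        *" nvme_core.default_ps_max_latency_us=0 "*) ;;
        *) return 1 ;;
    esac
    case "$cmdline" in
        *" pcie_aspm=off "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage: `has_nvme_params "$(cat /proc/cmdline)" && echo "parameters active"`. The value the NVMe driver is using can also be read from /sys/module/nvme_core/parameters/default_ps_max_latency_us.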
There are some reports that the Ubuntu 22.04 kernel can cause the Ethernet driver to lock up.
There are also reports that this was resolved by a BIOS update together with a newer kernel, either by installing the HWE kernel package with
sudo apt install --install-recommends linux-generic-hwe-22.04
or by upgrading to Ubuntu 24.04.
The BIOS update is key: we also have reports of the new kernel alone not resolving the issue.
Some users side-stepped the issue entirely by using an Ethernet USB dongle instead of the built-in Ethernet.
All consumer RAM can fail, and it'll do so silently. The symptoms can range from corrupted data to the NUC "freezing".
To rule out (intermittently) faulty RAM, download Memtest86+, flash it to a USB stick, boot from that USB stick, and run a continuous loop memory test for 5 days (!) or until you see errors, whichever is earlier.
If no errors were seen after 5 days, the RAM is probably fine.
A single run of the memory test, or even 5 runs, is inconclusive. RAM failures can be quite intermittent.