Disclaimer: please read this knowing that I'm not an expert on the things outlined below, but I've made my best attempt.
Linux in general defaults to values and settings aimed at a server workload with security/hardening in mind, and distros usually tweak those a bit towards a more desktop-based workload. I thought I'd list a few things I've found that can be tweaked further depending on your own hardware and personal preferences.
Lenovo Legion 7i 2023:
- 13th Gen Intel(R) Core(TM) i9-13900HX
- NVIDIA GeForce RTX 4080
- 2 x (16GB) Kingston FURY Impact Black KF556S40IB-16
- 2 x 1TB NVMe disks
sysctl is a tool used to modify kernel parameters at runtime. The parameters available are those listed under /proc/sys/ and there are a lot of them.
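For example (just an illustration, these changes don't persist across reboots unless you put them in a config file):
# list all parameters and filter for the dirty cache ones
sysctl -a | grep dirty
# read a single parameter
sysctl vm.dirty_ratio
# set it temporarily (needs root)
sysctl -w vm.dirty_ratio=8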
The biggest performance hit I ran into is with the I/O disk write cache.
Making this cache too small can actually become a bottleneck, especially now that NVMe disks are reaching mind-boggling speeds: stalls happen because not enough data fits in memory to be written out efficiently, and throughput suffers. It's especially noticeable with two NVMe disks; if both are moving data at 5000 MB/s, they start fighting over the cache, each with 100% access to all of it.
Setting it too high causes the obvious higher RAM usage, since file data occupies more RAM if it isn't written out fast enough, and brings a potential risk of data loss, for example with USB sticks. You read in a big file, it all sits in memory and gets flushed out as fast as the USB stick allows, but this happens in the background while your file manager or tool already thinks the operation is done, so if you hot-unplug the stick while this is going on, the data is lost. You can also experience stalls on slow disks: a write operation looks finished but the data is still in the cache, and if something else wants to read it, it has to wait until it's actually flushed out.
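You can watch how much dirty data is currently waiting to be written back, and force a flush before unplugging a removable drive (standard tools, nothing specific to my setup):
# how much data is currently dirty / being written back
grep -E '^(Dirty|Writeback):' /proc/meminfo
# block until all dirty data has been flushed to disk (do this before pulling a USB stick)
sync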
To set the size of this cache/buffer we can set vm.dirty_ratio or its counterpart vm.dirty_bytes. If one of them is set, the other one will automatically read as 0.
ratio is a percentage of the total amount of free and reclaimable memory; when it is exceeded, applications that want to write to the page cache are blocked and have to wait for the kernel background flusher threads to reduce the amount of dirty memory.
bytes is the same thing, only an exact size in bytes instead of a percentage of total RAM.
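A quick way to see which one is in effect on your system (one of the two will read 0):
sysctl vm.dirty_ratio vm.dirty_bytes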
To set the threshold at which data starts being flushed out, we can set vm.dirty_background_ratio or its counterpart vm.dirty_background_bytes. Just as above, if one of them is set, the other one will automatically read as 0.
ratio is a percentage of the total amount of free and reclaimable memory; when the amount of dirty page cache exceeds this threshold, the writeback threads start writing out the data.
A sane rule would be to set vm.dirty_background_ratio or vm.dirty_background_bytes to 1/2 or 1/4 of the total cache size (i.e. of vm.dirty_ratio / vm.dirty_bytes).
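If you prefer fixed sizes over percentages, the same rule expressed with the _bytes variants would look something like this (the numbers are just an example, not my actual config):
# 2 GiB total write cache, start background flushing at 1 GiB (half of it)
vm.dirty_bytes = 2147483648
vm.dirty_background_bytes = 1073741824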
There is also a timer that marks data for writeback even if we never reach the threshold size: vm.dirty_expire_centisecs. The default value seems good enough.
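You can check what your kernel currently uses; it is typically 3000 centiseconds, i.e. data older than 30 seconds gets flushed, but verify on your own system rather than taking my word for it:
sysctl vm.dirty_expire_centisecs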
So here is what I did to set all of this, plus a per-device strict ratio to mitigate the issues that can arise from a too-large cache.
8% of 32 GB is ~2.5 GB, and once 4% of 32 GB (~1.25 GB) has been written into the cache it starts flushing out, or when the default timer expires.
/etc/sysctl.d/99-options.conf
vm.dirty_ratio = 8
vm.dirty_background_ratio = 4
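To apply the file without rebooting (this re-reads all the sysctl config locations, including /etc/sysctl.d/):
sysctl --system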
Now for a more per-device setting, so both my NVMe disks always have room in the cache, and to reduce the likelihood of losing USB stick data.
/etc/udev/rules.d/91-io.rules
# ssds
ACTION=="add|change", KERNEL=="sd[a-z]*|mmcblk[0-9]*|nvme[0-9]*", ATTR{queue/rotational}=="0", RUN+="/usr/bin/write-cache %k 70"
# usb sticks
ACTION=="add|change", KERNEL=="sd[a-z]", ENV{ID_USB_TYPE}=="disk", RUN+="/usr/bin/write-cache %k 5"
/usr/bin/write-cache
#!/bin/bash
# called by the udev rules above: $1 = kernel device name (e.g. nvme0n1), $2 = max_ratio percentage
device=$1
max_ratio=$2
strict_limit=1

if [[ -z "$device" || -z "$max_ratio" ]]; then
    exit 1
fi

# enforce the per-device limit strictly and cap this device's share of the write-back cache
echo "$strict_limit" > "/sys/block/$device/bdi/strict_limit"
echo "$max_ratio" > "/sys/block/$device/bdi/max_ratio"
strict_limit forces per-BDI checks for the share of the given device in the write-back cache even before the global background dirty limit is reached. Turning strict_limit on has no visible effect if max_ratio is equal to 100%. max_ratio limits a particular device to use no more than the given percentage of the write-back cache. This is useful in situations where we want to avoid one device taking all or most of the write-back cache.
So now each of my disks gets ~1.75 GB of the total ~2.5 GB, so neither of them can starve out the entire cache and force the other to stall if such a situation arises, and USB sticks get ~125 MB, so a 1 GB file won't instantly go into RAM and look like it's done.
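To verify what actually got applied, you can read the values back from sysfs (device names are examples from my machine):
grep . /sys/block/nvme0n1/bdi/{strict_limit,max_ratio}
grep . /sys/block/nvme1n1/bdi/{strict_limit,max_ratio}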