@yorickdowne
Last active November 15, 2024 23:14
Great and less great SSDs for Ethereum nodes

Overview

Syncing an Ethereum node depends largely on the latency and IOPS (I/O operations per second) of the storage. Budget SSDs will struggle, and some won't be able to sync at all. For simplicity, this page treats IOPS as a proxy for, and predictor of, latency.

This document aims to snapshot some known good and known bad models.

The drive lists are ordered by interface and then by capacity and alphabetically by vendor name, not by preference. The lists are not exhaustive at all. @mwpastore linked a filterable spreadsheet in comments that has a far greater variety of drives and their characteristics. Filter it by DRAM yes, NAND Type TLC, Form Factor M.2, and desired capacity.

For size, 4TB comes recommended as of mid 2024. The smaller 2TB drive should last an Ethereum full node until early 2025 or thereabouts, with crystal ball uncertainty. The Portal team aim to make 2TB last forever with EIP-4444 by late 2024. Remy wrote a migration guide to 4TB.

At a high level, QLC and DRAMless drives are far slower than "mainstream" SSDs. QLC has lower endurance as well. Any savings will be gone when the drive fails early and needs to be replaced.

Other than a slow SSD model, these are things that can slow IOPS down:

  • Heat. Check with smartctl -x; the SSD should be below 50C so it does not throttle.
  • TRIM not being allowed. This can happen with some hardware RAID controllers, as well as on macOS with non-Apple SSDs
  • ZFS
  • RAID5/6 - write amplification is no joke
  • On SATA, the controller in UEFI/BIOS set to anything other than AHCI. Set it to AHCI for good performance.
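The heat check can be scripted roughly as follows. This is only a sketch: the function names are made up here, and it assumes the NVMe-style `Temperature:    38 Celsius` line in smartctl output. SATA drives report temperature as a SMART attribute and would need different parsing.

```shell
#!/bin/sh
# Print the first "Temperature:" reading (Celsius) found in smartctl output on stdin.
# Assumption: NVMe-style output with a line such as "Temperature:    38 Celsius".
ssd_temp_c() {
    awk '/^Temperature:/ { print $2; exit }'
}

# Classify a whole-number temperature against the 50C guideline from the list above.
temp_status() {
    if [ "$1" -lt 50 ]; then
        echo "ok"
    else
        echo "hot"
    fi
}

# Usage on a real system (the device path is an example):
#   temp=$(sudo smartctl -x /dev/nvme0 | ssd_temp_c)
#   echo "${temp}C: $(temp_status "$temp")"
```

Run it while the drive is under load, since an idle reading says little about throttling.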

If you haven't already, turn off atime on your DB volume; it'll increase SSD lifetime and speed things up a little bit.
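One way to check for and apply noatime is sketched below. The helper name `has_noatime` and the mount point `/mnt/db` are placeholders for illustration, not something this page prescribes; substitute the mount point of your own DB volume.

```shell
#!/bin/sh
# True if a comma-separated mount option string contains the noatime option.
has_noatime() {
    case ",$1," in
        *,noatime,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a real system (/mnt/db is a hypothetical mount point):
#   opts=$(findmnt -no OPTIONS --target /mnt/db)
#   has_noatime "$opts" || sudo mount -o remount,noatime /mnt/db
# To make it permanent, add noatime to that volume's options field in /etc/fstab.
```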

Some users have reported that NUC instability with certain drives can be cured by adding nvme_core.default_ps_max_latency_us=0 pcie_aspm=off to their GRUB_CMDLINE_LINUX_DEFAULT kernel parameters via sudo nano /etc/default/grub and sudo update-grub. This keeps the drive from entering powersave states by itself.
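Kernel parameters only take effect on the next boot, so it can be worth confirming afterwards that they were picked up. A minimal sketch, assuming a stock GRUB setup; the helper name `has_kernel_param` and the `quiet splash` defaults in the comment are illustrative only:

```shell
#!/bin/sh
# Example line in /etc/default/grub (existing values such as "quiet splash" will vary):
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvme_core.default_ps_max_latency_us=0 pcie_aspm=off"

# True if the kernel command line string $1 contains the exact parameter $2.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage after sudo update-grub and a reboot:
#   cmdline=$(cat /proc/cmdline)
#   has_kernel_param "$cmdline" "nvme_core.default_ps_max_latency_us=0" && echo "NVMe powersave latency limit is 0"
#   has_kernel_param "$cmdline" "pcie_aspm=off" && echo "PCIe ASPM is off"
```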

The Good

"Mainstream" and "Performance" drive models that can sync mainnet execution layer clients in a reasonable amount of time.

  • Higher endurance (TBW) than most: Seagate Firecuda 530, WD Red SN700
  • Lowest power draw: SK Hynix P31 Gold - was a great choice for Rock5 B and other low-power devices, but 2TB only

We've started crowd-sourcing some IOPS numbers. If you want to join the fun, run fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75; rm test and give us the read and write IOPS.
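To pull just the two numbers out of a full fio report, a small helper along these lines may be handy. It is a sketch, not part of the benchmark itself: the function name `fio_iops` and the log file name are made up here, and it assumes summary lines shaped like `read: IOPS=99.0k, ...` as in the reports posted in the comments.

```shell
#!/bin/sh
# Print "READ/WRITE" IOPS taken from the fio summary lines on stdin.
# Assumption: fio 3.x text output with "read: IOPS=..." and "write: IOPS=..." lines.
fio_iops() {
    awk '
        /^ *read: IOPS=/  { split($2, a, /[=,]/); r = a[2] }
        /^ *write: IOPS=/ { split($2, a, /[=,]/); w = a[2] }
        END { print r "/" w }
    '
}

# Usage: add --output=fio.log to the fio command above so the report goes to a file, then
#   fio_iops < fio.log
```

The result is in the same read/write form used in the drive lists, for example 99.0k/33.0k.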

If you have room for it and need an excellent heatsink, consider the "Rocket NVMe Heatsink". It is quite tall, however, and may not fit in some miniPC cases.

Hardware

M.2 NVMe "Mainstream" - TLC, DRAM, PCIe 3, 4TB drives

  • Any data center/enterprise NVMe SSD
  • Teamgroup MP34, between 94k/31k and 118k/39k r/w IOPS
  • WD Red SN700, 141k/47k r/w IOPS

M.2 NVMe "Performance" - TLC, DRAM, PCIe 4 or 5, 4TB drives

  • Any data center/enterprise NVMe SSD
  • Acer GM7000 "Predator", 125k/41k r/w IOPS
  • ADATA XPG Gammix S70, 272k/91k r/w IOPS
  • Corsair Force MP600 Pro and variants (but not "MP600 Core XT"), 138k/46k r/w IOPS
  • Crucial T700, 215k/71k r/w IOPS
  • Kingston KC3000, 377k/126k r/w IOPS
  • Kingston Fury Renegade, 211k/70k r/w IOPS
  • Mushkin Redline Vortex (but not LX)
  • Sabrent Rocket 4 Plus, 149k/49k r/w IOPS. @SnoepNFTs reports the Rocket NVMe Heatsink keeps it very cool.
  • Samsung 990 Pro, 124k/41k r/w IOPS - there are reports of 990 Pro rapidly losing health. A firmware update to 1B2QJXD7 is meant to stop the rapid degradation, but won't reverse any that happened on earlier firmware.
  • Seagate Firecuda 530, 218k/73k r/w IOPS
  • Teamgroup MP44, 105k/35k r/w IOPS - caution that this is DRAMless and uses a Host Memory Buffer (HMB), yet appears to perform fine.
  • Transcend 250s, 127k/42k r/w IOPS. @SnoepNFTs reports it gets very hot, you'd want to add a good heatsink to it.
  • WD Black SN850X, 101k/33k r/w IOPS

M.2 NVMe "Mainstream" - TLC, DRAM, PCIe 3, 2TB drives

  • Any data center/enterprise NVMe SSD
  • AData XPG Gammix S11/SX8200 Pro. Several hardware revisions. It's slower than some QLC drives. 68k/22k r/w IOPS
  • AData XPG Gammix S50 Lite
  • HP EX950
  • Mushkin Pilot-E
  • Samsung 970 EVO Plus 2TB, pre-rework (firmware 2B2QEXM7). 140k/46k r/w IOPS
  • Samsung 970 EVO Plus 2TB, post-rework (firmware 3B2QEXM7 or 4B2QEXM7). In testing this syncs just as quickly as the pre-rework drive
  • SK Hynix P31 Gold
  • WD Black SN750 (but not SN750 SE)

M.2 NVMe "Performance" - TLC, DRAM, PCIe 4 or 5, 2TB drives

  • Any data center/enterprise NVMe SSD
  • Crucial P5 Plus
  • Kingston KC2000
  • Samsung 980 Pro (not 980) - a firmware update to 5B2QGXA7 is necessary to keep them from dying, if they are firmware 3B2QGXA7. Samsung's boot Linux is a bit broken; you may want to flash from your own Linux.
  • SK Hynix P41 Platinum / Solidigm P44 Pro, 99k/33k r/w IOPS
  • WD Black SN850

Cloud

  • Any baremetal/dedicated server service
  • AWS i3en.(2)xlarge or is4gen.xlarge
  • AWS gp3 w/ >=10k IOPS provisioned and an m7i/a.xlarge

The Bad

These "Budget" drive models are reportedly too slow to sync (all) mainnet execution layer clients.

Hardware

  • AData S40G/SX8100 4TB, QLC - the 2TB model is TLC and should be fine; 4TB is reportedly too slow
  • Crucial P1, QLC - users report it can't sync Nethermind
  • Crucial P2 and P3 (Plus), QLC and DRAMless - users report it can't sync Nethermind, 27k/9k r/w IOPS
  • Kingston NV1 - probably QLC and DRAMless and thus too slow on 2TB, but could be "anything" as Kingston do not guarantee specific components.
  • Kingston NV2 - like NV1 no guaranteed components
  • WD Green SN350, QLC and DRAMless
  • Anything both QLC and DRAMless will likely not be able to sync at all or not be able to consistently keep up with "chain head"
  • Crucial BX500 SATA, HP S650 SATA, probably most SATA budget drives
  • Samsung 980, DRAMless - unsure, this may belong in "Ugly". If you have one and can say for sure, please come to ethstaker Discord.
  • Samsung T7 USB, even with current firmware

The Ugly

"Budget" drive models that reportedly can sync mainnet execution layer clients, if slowly.

Note that QLC drives usually have a markedly lower TBW than TLC, and will fail earlier.

Hardware

  • Corsair MP400, QLC
  • Inland Professional 3D NAND, QLC
  • Intel 660p, QLC. It's faster than some "mainstream" drives. 98k/33k r/w IOPS
  • Seagate Barracuda Q5, QLC
  • WD Black SN770, DRAMless
  • Samsung 870 QVO SATA, QLC

2.5" SATA "Mainstream" - TLC, DRAM

  • These have been moved to "ugly" because there are user reports that only Nimbus/Geth will now sync on SATA, and even that takes 3 days. It looks like after Dencun, NVMe is squarely the way to go.
  • Any data center/enterprise SATA SSD
  • Crucial MX500 SATA, 46k/15k r/w IOPS
  • Samsung 860 EVO SATA, 55k/18k r/w IOPS
  • Samsung 870 EVO SATA, 63k/20k r/w IOPS
  • WD Blue 3D NAND SATA

Cloud

  • Netcup RS G11 Servers. Impressively fast; but it still depends on your neighbors in the service.
  • Contabo SSD - reportedly able to sync Geth 1.13.0 and Nethermind, if slowly
  • Netcup VPS Servers - reportedly able to sync Geth 1.13.0 and Nethermind, if slowly
  • Contabo NVMe - fast enough but not enough space. 800 GiB is not sufficient.
@tlsol

tlsol commented Mar 25, 2024

Samsung PM863a 3.84TB TLC - SATA
Also very durable and affordable server SSDs

@yorickdowne
Author

"Data center SSD drives will also work well." - absolutely

@laurenzberger

Solidigm P44 Pro 2TB (apparently same as SK Hynix Platinum P41)

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.35
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=397MiB/s,w=132MiB/s][r=102k,w=33.8k IOPS][eta 00m:00s] 
test: (groupid=0, jobs=1): err= 0: pid=21095: Thu Mar 28 14:39:36 2024
  read: IOPS=99.0k, BW=387MiB/s (405MB/s)(113GiB/297984msec)
   bw (  KiB/s): min=369432, max=427240, per=100.00%, avg=396158.20, stdev=7717.86, samples=595
   iops        : min=92358, max=106810, avg=99039.48, stdev=1929.49, samples=595
  write: IOPS=33.0k, BW=129MiB/s (135MB/s)(37.5GiB/297984msec); 0 zone resets
   bw (  KiB/s): min=123927, max=142888, per=100.00%, avg=132031.68, stdev=2768.09, samples=595
   iops        : min=30981, max=35722, avg=33007.83, stdev=692.04, samples=595
  cpu          : usr=11.94%, sys=87.86%, ctx=5183, majf=0, minf=10
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=387MiB/s (405MB/s), 387MiB/s-387MiB/s (405MB/s-405MB/s), io=113GiB (121GB), run=297984-297984msec
  WRITE: bw=129MiB/s (135MB/s), 129MiB/s-129MiB/s (135MB/s-135MB/s), io=37.5GiB (40.3GB), run=297984-297984msec

Disk stats (read/write):
    dm-1: ios=29475801/9827719, merge=0/0, ticks=1779272/215384, in_queue=1994656, util=100.00%, aggrios=29492439/9833359, aggrmerge=0/0, aggrticks=1750608/206020, aggrin_queue=1956628, aggrutil=100.00%
    dm-0: ios=29492439/9833359, merge=0/0, ticks=1750608/206020, in_queue=1956628, util=100.00%, aggrios=29492409/9832110, aggrmerge=30/1266, aggrticks=1496885/97981, aggrin_queue=1594922, aggrutil=100.00%
  nvme0n1: ios=29492409/9832110, merge=30/1266, ticks=1496885/97981, in_queue=1594922, util=100.00%

@laurenzberger

Btw do we know if turning on disk encryption / LUKS under Ubuntu has a negative impact on IOPS?

@Beanow

Beanow commented Mar 29, 2024

Another Samsung SSD 990 Pro 4TB here, pretty much same results as @SnoepNFTs
https://gist.github.com/yorickdowne/f3a3e79a573bf35767cd002cc977b038?permalink_comment_id=4958391#gistcomment-4958391

It came preinstalled with the current latest firmware, 4B2QJXD7.

I might want to look at cooling too under actual staking load, got up to 61 degrees for fio (thermal pad goes to chassis).

With PCIe3 Nuc, booted from the drive

Intel Core i5-8259U, 1x8 GB ram.
Standard setup Ubuntu 22 server, LVM2 - Ext4 root partition.
Testing while booted from the same drive.

$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=528MiB/s,w=175MiB/s][r=135k,w=44.9k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=4836: Fri Mar 29 18:28:36 2024
  read: IOPS=136k, BW=530MiB/s (556MB/s)(113GiB/217450msec)
   bw (  KiB/s): min=530312, max=550488, per=100.00%, avg=542804.63, stdev=3349.45, samples=434
   iops        : min=132578, max=137622, avg=135701.18, stdev=837.40, samples=434
  write: IOPS=45.2k, BW=177MiB/s (185MB/s)(37.5GiB/217450msec); 0 zone resets
   bw (  KiB/s): min=175856, max=184608, per=100.00%, avg=180905.97, stdev=1344.54, samples=434
   iops        : min=43964, max=46152, avg=45226.49, stdev=336.13, samples=434
  cpu          : usr=25.22%, sys=64.23%, ctx=9698025, majf=0, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=530MiB/s (556MB/s), 530MiB/s-530MiB/s (556MB/s-556MB/s), io=113GiB (121GB), run=217450-217450msec
  WRITE: bw=177MiB/s (185MB/s), 177MiB/s-177MiB/s (185MB/s-185MB/s), io=37.5GiB (40.3GB), run=217450-217450msec

Disk stats (read/write):
    dm-0: ios=29484828/9831160, merge=0/0, ticks=1113068/84136, in_queue=1197204, util=100.00%, aggrios=29492392/9837065, aggrmerge=0/347, aggrticks=1129394/104103, aggrin_queue=1233551, aggrutil=99.99%
  nvme0n1: ios=29492392/9837065, merge=0/347, ticks=1129394/104103, in_queue=1233551, util=99.99%

Above, but with nvme_core.default_ps_max_latency_us=0. +3k/+1k
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=541MiB/s,w=178MiB/s][r=138k,w=45.6k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=1520: Fri Mar 29 18:41:21 2024
  read: IOPS=139k, BW=543MiB/s (569MB/s)(113GiB/212288msec)
   bw (  KiB/s): min=541560, max=564745, per=100.00%, avg=556696.58, stdev=4091.32, samples=424
   iops        : min=135390, max=141186, avg=139174.08, stdev=1022.81, samples=424
  write: IOPS=46.3k, BW=181MiB/s (190MB/s)(37.5GiB/212288msec); 0 zone resets
   bw (  KiB/s): min=180120, max=188672, per=100.00%, avg=185536.00, stdev=1475.02, samples=424
   iops        : min=45030, max=47168, avg=46383.90, stdev=368.75, samples=424
  cpu          : usr=25.22%, sys=66.61%, ctx=9314979, majf=1, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=543MiB/s (569MB/s), 543MiB/s-543MiB/s (569MB/s-569MB/s), io=113GiB (121GB), run=212288-212288msec
  WRITE: bw=181MiB/s (190MB/s), 181MiB/s-181MiB/s (190MB/s-190MB/s), io=37.5GiB (40.3GB), run=212288-212288msec

Disk stats (read/write):
    dm-0: ios=29465825/9850565, merge=0/0, ticks=1110740/86064, in_queue=1196804, util=100.00%, aggrios=29492604/9913113, aggrmerge=70/769, aggrticks=1195797/274537, aggrin_queue=1470392, aggrutil=100.00%
  nvme0n1: ios=29492604/9913113, merge=70/769, ticks=1195797/274537, in_queue=1470392, util=100.00%

Update: Running from newer hardware. (+55k/+18k compared to before).
AMD Ryzen 5 7535HS, 32GB DDR5, PCIe4.
Booting from drive, same OS install, without kernel flags.

$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=753MiB/s,w=250MiB/s][r=193k,w=64.1k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=1757: Mon Apr  1 16:17:35 2024
  read: IOPS=191k, BW=744MiB/s (780MB/s)(113GiB/154814msec)
   bw (  KiB/s): min=709368, max=786112, per=100.00%, avg=762292.92, stdev=12028.39, samples=309
   iops        : min=177342, max=196528, avg=190573.26, stdev=3007.09, samples=309
  write: IOPS=63.5k, BW=248MiB/s (260MB/s)(37.5GiB/154814msec); 0 zone resets
   bw (  KiB/s): min=236904, max=262448, per=100.00%, avg=254056.41, stdev=4101.30, samples=309
   iops        : min=59226, max=65612, avg=63514.10, stdev=1025.31, samples=309
  cpu          : usr=11.90%, sys=75.15%, ctx=9206371, majf=1, minf=6
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=744MiB/s (780MB/s), 744MiB/s-744MiB/s (780MB/s-780MB/s), io=113GiB (121GB), run=154814-154814msec
  WRITE: bw=248MiB/s (260MB/s), 248MiB/s-248MiB/s (260MB/s-260MB/s), io=37.5GiB (40.3GB), run=154814-154814msec

Disk stats (read/write):
    dm-0: ios=29448145/9814651, merge=0/0, ticks=1062404/66404, in_queue=1128808, util=99.98%, aggrios=29492582/9829489, aggrmerge=0/54, aggrticks=1120690/77989, aggrin_queue=1198724, aggrutil=99.94%
  nvme0n1: ios=29492582/9829489, merge=0/54, ticks=1120690/77989, in_queue=1198724, util=99.94%

Above but using nvme_core.default_ps_max_latency_us=0. -2k/-0.4k
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=727MiB/s,w=242MiB/s][r=186k,w=62.0k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=2419: Mon Apr  1 16:09:05 2024
  read: IOPS=189k, BW=739MiB/s (775MB/s)(113GiB/155805msec)
   bw (  KiB/s): min=720896, max=777048, per=100.00%, avg=757531.24, stdev=10137.71, samples=311
   iops        : min=180224, max=194262, avg=189382.84, stdev=2534.42, samples=311
  write: IOPS=63.1k, BW=246MiB/s (258MB/s)(37.5GiB/155805msec); 0 zone resets
   bw (  KiB/s): min=238728, max=260384, per=100.00%, avg=252469.48, stdev=3522.78, samples=311
   iops        : min=59682, max=65096, avg=63117.37, stdev=880.69, samples=311
  cpu          : usr=12.71%, sys=74.39%, ctx=9217428, majf=1, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=739MiB/s (775MB/s), 739MiB/s-739MiB/s (775MB/s-775MB/s), io=113GiB (121GB), run=155805-155805msec
  WRITE: bw=246MiB/s (258MB/s), 246MiB/s-246MiB/s (258MB/s-258MB/s), io=37.5GiB (40.3GB), run=155805-155805msec

Disk stats (read/write):
    dm-0: ios=29470271/9822022, merge=0/0, ticks=1095304/69956, in_queue=1165260, util=99.99%, aggrios=29492568/9829469, aggrmerge=0/46, aggrticks=1119861/77973, aggrin_queue=1197872, aggrutil=99.95%
  nvme0n1: ios=29492568/9829469, merge=0/46, ticks=1119861/77973, in_queue=1197872, util=99.95%

Booting from external drive. +7k/+5k

Note this is also a different OS (Fedora desktop vs Ubuntu server).
No nvme_core.default_ps_max_latency_us=0 flag set afaik.

$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.29
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=807MiB/s,w=268MiB/s][r=206k,w=68.5k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=5545: Mon Apr  1 11:20:05 2024
  read: IOPS=206k, BW=805MiB/s (844MB/s)(113GiB/143193msec)
   bw (  KiB/s): min=792296, max=844728, per=100.00%, avg=824277.65, stdev=8860.95, samples=286
   iops        : min=198074, max=211182, avg=206069.44, stdev=2215.25, samples=286
  write: IOPS=68.6k, BW=268MiB/s (281MB/s)(37.5GiB/143193msec); 0 zone resets
   bw (  KiB/s): min=263120, max=283256, per=100.00%, avg=274714.52, stdev=3272.99, samples=286
   iops        : min=65780, max=70814, avg=68678.63, stdev=818.26, samples=286
  cpu          : usr=9.60%, sys=61.20%, ctx=8753572, majf=0, minf=7
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=805MiB/s (844MB/s), 805MiB/s-805MiB/s (844MB/s-844MB/s), io=113GiB (121GB), run=143193-143193msec
  WRITE: bw=268MiB/s (281MB/s), 268MiB/s-268MiB/s (281MB/s-281MB/s), io=37.5GiB (40.3GB), run=143193-143193msec

Disk stats (read/write):
    dm-0: ios=29475893/9823846, merge=0/0, ticks=1120298/73131, in_queue=1193429, util=99.98%, aggrios=29492326/9829358, aggrmerge=0/28, aggrticks=1134762/75903, aggrin_queue=1210697, aggrutil=99.96%
  nvme0n1: ios=29492326/9829358, merge=0/28, ticks=1134762/75903, in_queue=1210697, util=99.96%

@SnoepNFTs

Standard setup Ubuntu 22 server, LVM2 - Ext4 root partition.

Thanks for sharing! Appreciate it! Also funny that the nvme_core.default_ps_max_latency_us=0 also slightly improved your performance. We can maybe add this modest improvement at the top of the .md? @yorickdowne

@Beanow

Beanow commented Apr 1, 2024

@SnoepNFTs likewise! :] interesting to compare.

For the kernel flag, I retested with a newer host (PCIe4 vs PCIe3, more ram, etc.). It's seemingly worse there.
So I'd probably call this within margin of error or situational.

If I understand right the flag is supposed to lock the power management of the drive to always be full power. Under a synthetic load I'd not expect this to really matter, the load should keep it pretty busy to be full power anyway. Maybe it matters in real workload once you're synced and data trickles in more randomly?

Btw, you mentioned changing systems didn't meaningfully change your iops but I got a pretty decent uplift.
Only thing I haven't tried yet is noatime; since this is the root filesystem, I'd need to do partition shuffling for that.

@Beanow

Beanow commented Apr 1, 2024

So here's one more PSA.

I tested Ext4, XFS, F2FS and for Ext4 poked at the noatime, even risky ones like data=writeback,barrier=0,nobh ( ⚠️ don't use those). I noticed one particular factor that caused major slowdowns...
Thermal throttling 😂

All of the settings and fs are within margin of error, not enough to care or worry about.
(The default relatime is more than good enough.)

But once I ran enough benchmarks that the SSD was hitting >70 degrees, I lost about 10k read and 3k write iops.
Which came back after letting it cool to about 40 degrees.

I for one am going to keep using Ext4 with relatime, not bothering with a "tuned data partition" of any kind.
But the SSD temps are definitely going to be on my grafana dash.

@0xSileo

0xSileo commented Apr 1, 2024

Should the other components of the setups be included in reports ?

The Kingston KC3000 perf is wonderful, but I'd love to know about the rest of the config. cc @kaloyan-raev.

@0xSileo

0xSileo commented Apr 7, 2024

Also, has there been any exploration regarding PCIe expansion cards such as the ASUS Hyper M.2 ?

The reasoning being that instead of creating a system with a single 4TB SSD (which usually uses 4 PCIe lanes if M.2), create one with four 1TB SSDs as well as a PCIe expansion card. This would allow for using 16 PCIe lanes, potentially improving IOPS and read/write speeds, for the same price or less.

@yorickdowne
Author

The main thing this will accomplish is quadruple the failure rate and make it more difficult to change the size if ever desired, as CloneZilla doesn’t understand lvm or raid0.

Fast enough is fast enough. On a single drive Geth syncs in 6 hours and Nethermind can attest after 2h. Steady state both are well under 500ms for new blocks.

@6ilcarlos

M.2 Acer Predator Gm7000 4tb Pcie 4.0 2280 Bl.9bwwr.107

I'm still syncing. Hopefully it works well.

test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.36
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=467MiB/s,w=154MiB/s][r=120k,w=39.4k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=28181: Wed May 1 00:40:21 2024
read: IOPS=125k, BW=490MiB/s (514MB/s)(113GiB/235221msec)
bw ( KiB/s): min=332632, max=558096, per=100.00%, avg=501757.57, stdev=43493.30, samples=470
iops : min=83158, max=139524, avg=125439.42, stdev=10873.31, samples=470
write: IOPS=41.8k, BW=163MiB/s (171MB/s)(37.5GiB/235221msec); 0 zone resets
bw ( KiB/s): min=111016, max=187344, per=100.00%, avg=167225.30, stdev=14411.43, samples=470
iops : min=27754, max=46836, avg=41806.32, stdev=3602.86, samples=470
cpu : usr=24.67%, sys=65.27%, ctx=7475608, majf=0, minf=26
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
READ: bw=490MiB/s (514MB/s), 490MiB/s-490MiB/s (514MB/s-514MB/s), io=113GiB (121GB), run=235221-235221msec
WRITE: bw=163MiB/s (171MB/s), 163MiB/s-163MiB/s (171MB/s-171MB/s), io=37.5GiB (40.3GB), run=235221-235221msec

Disk stats (read/write):
nvme0n1: ios=29519460/9866232, sectors=238152720/87094184, merge=4620/7835, ticks=2166481/288104, in_queue=2454704, util=99.99%

@yorickdowne
Author

Thanks! Added the Acer. And restructured the lists, it's time to highlight 4TB drives.

@SnoepNFTs

SnoepNFTs commented May 4, 2024

Tested SSDs:

  • Corsair MP600 Pro XT - 4TB
  • Transcend 250s - 4TB
  • Sabrent Rocket 4 Plus - 4TB

Managed to get the following results:

Corsair MP600 Pro XT
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=526MiB/s,w=175MiB/s][r=135k,w=44.7k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=26866: Fri May  3 20:16:18 2024
  read: IOPS=138k, BW=540MiB/s (566MB/s)(113GiB/213411msec)
   bw (  KiB/s): min=480704, max=584592, per=100.00%, avg=553012.13, stdev=23966.33, samples=426
   iops        : min=120176, max=146148, avg=138253.06, stdev=5991.58, samples=426
  write: IOPS=46.1k, BW=180MiB/s (189MB/s)(37.5GiB/213411msec); 0 zone resets
   bw (  KiB/s): min=159992, max=197048, per=100.00%, avg=184307.70, stdev=8070.43, samples=426
   iops        : min=39998, max=49262, avg=46076.92, stdev=2017.61, samples=426
  cpu          : usr=25.22%, sys=63.92%, ctx=6327805, majf=0, minf=9
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=540MiB/s (566MB/s), 540MiB/s-540MiB/s (566MB/s-566MB/s), io=113GiB (121GB), run=213411-213411msec
  WRITE: bw=180MiB/s (189MB/s), 180MiB/s-180MiB/s (189MB/s-189MB/s), io=37.5GiB (40.3GB), run=213411-213411msec

Disk stats (read/write):
  nvme0n1: ios=29464619/9825328, merge=30/9642, ticks=1864385/201635, in_queue=2066418, util=100.00%

Transcend 250s
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=498MiB/s,w=164MiB/s][r=128k,w=42.1k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=3003: Fri May  3 21:02:14 2024
  read: IOPS=127k, BW=496MiB/s (520MB/s)(113GiB/232211msec)
   bw (  KiB/s): min=491696, max=518656, per=100.00%, avg=508285.19, stdev=3846.90, samples=464
   iops        : min=122924, max=129664, avg=127071.33, stdev=961.71, samples=464
  write: IOPS=42.3k, BW=165MiB/s (173MB/s)(37.5GiB/232211msec); 0 zone resets
   bw (  KiB/s): min=162568, max=173224, per=100.00%, avg=169400.91, stdev=1520.09, samples=464
   iops        : min=40642, max=43306, avg=42350.23, stdev=380.02, samples=464
  cpu          : usr=26.62%, sys=62.23%, ctx=9407203, majf=0, minf=7
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=496MiB/s (520MB/s), 496MiB/s-496MiB/s (520MB/s-520MB/s), io=113GiB (121GB), run=232211-232211msec
  WRITE: bw=165MiB/s (173MB/s), 165MiB/s-165MiB/s (173MB/s-173MB/s), io=37.5GiB (40.3GB), run=232211-232211msec

Disk stats (read/write):
  nvme0n1: ios=29460827/9818880, merge=0/88, ticks=2052514/99620, in_queue=2152149, util=100.00%

Sabrent Rocket 4 Plus
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=0): [f(1)][100.0%][r=572MiB/s,w=189MiB/s][r=146k,w=48.4k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=4205: Fri May  3 22:08:09 2024
  read: IOPS=149k, BW=581MiB/s (609MB/s)(113GiB/198335msec)
   bw (  KiB/s): min=511504, max=605560, per=100.00%, avg=595266.65, stdev=7415.77, samples=395
   iops        : min=127876, max=151392, avg=148816.49, stdev=1853.98, samples=395
  write: IOPS=49.6k, BW=194MiB/s (203MB/s)(37.5GiB/198335msec); 0 zone resets
   bw (  KiB/s): min=170387, max=203016, per=100.00%, avg=198395.15, stdev=2661.96, samples=395
   iops        : min=42596, max=50756, avg=49598.60, stdev=665.53, samples=395
  cpu          : usr=25.12%, sys=63.64%, ctx=6134724, majf=0, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=581MiB/s (609MB/s), 581MiB/s-581MiB/s (609MB/s-609MB/s), io=113GiB (121GB), run=198335-198335msec
  WRITE: bw=194MiB/s (203MB/s), 194MiB/s-194MiB/s (203MB/s-203MB/s), io=37.5GiB (40.3GB), run=198335-198335msec

Disk stats (read/write):
  nvme0n1: ios=29457478/9818010, merge=0/326, ticks=2190413/194778, in_queue=2385235, util=99.99%
 

Additional Notes:

  • Used nvme_core.default_ps_max_latency_us=0 and noatime
  • The Transcend 250s gets surprisingly hot; you would definitely need a good heatsink if you want to use this one
  • The heatsink that you can buy with the Sabrent Rocket 4 Plus is really, really good. It is quite an expensive one but I must say that I was surprised with its quality. If you have a performant SSD, but it is quickly thermally throttling with no room for ventilation, then this is the one I would recommend. From all the SSDs I have tested so far it is without a doubt the best heatsink for keeping your SSD cool.
  • Sabrent Rocket SSD performance remained extremely stable throughout the test, possibly because its very good heatsink eliminates potential thermal issues
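
For anyone who wants to confirm that the two settings from the first note are actually active, here is a minimal sketch. It assumes a Linux host with the usual /proc/mounts and /proc/cmdline layouts; the function names and the sample strings are made up for illustration and were not captured from the machine tested above.

```python
def mount_options(mounts_text: str, mount_point: str) -> set[str]:
    """Return the mount options for mount_point from /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            return set(fields[3].split(","))
    return set()


def has_kernel_param(cmdline_text: str, param: str) -> bool:
    """True if the exact key=value parameter appears on the kernel command line."""
    return param in cmdline_text.split()


# Illustrative samples only (assumptions, not real captures).
sample_mounts = "/dev/nvme0n1p2 / ext4 rw,noatime 0 0\n"
sample_cmdline = "root=/dev/nvme0n1p2 ro nvme_core.default_ps_max_latency_us=0"

print("noatime" in mount_options(sample_mounts, "/"))
print(has_kernel_param(sample_cmdline, "nvme_core.default_ps_max_latency_us=0"))
```

On a real host the two strings would come from `open("/proc/mounts").read()` and `open("/proc/cmdline").read()`, with the mount point of the DB volume in place of `/`.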

@2slow2

2slow2 commented May 9, 2024

Kingston FURY Renegade PCIe 4.0 NVMe M.2 SFYRD/4000G

Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=1496MiB/s,w=497MiB/s][r=383k,w=127k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=100833: Thu May 9 03:39:06 2024
  read: IOPS=373k, BW=1456MiB/s (1527MB/s)(113GiB/79135msec)
   bw (  MiB/s): min= 1147, max= 1535, per=100.00%, avg=1456.99, stdev=74.59, samples=158
   iops        : min=293656, max=393070, avg=372990.41, stdev=19094.55, samples=158
  write: IOPS=124k, BW=485MiB/s (509MB/s)(37.5GiB/79135msec); 0 zone resets
   bw (  KiB/s): min=389008, max=526672, per=100.00%, avg=497239.96, stdev=25459.62, samples=158
   iops        : min=97252, max=131668, avg=124309.99, stdev=6364.90, samples=158
  cpu          : usr=18.17%, sys=68.64%, ctx=3945070, majf=0, minf=7
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=1456MiB/s (1527MB/s), 1456MiB/s-1456MiB/s (1527MB/s-1527MB/s), io=113GiB (121GB), run=79135-79135msec
  WRITE: bw=485MiB/s (509MB/s), 485MiB/s-485MiB/s (509MB/s-509MB/s), io=37.5GiB (40.3GB), run=79135-79135msec

Disk stats (read/write):
  nvme0n1: ios=29443600/9812877, merge=0/257, ticks=2158493/108358, in_queue=2266859, util=99.96%

@kinsi55

kinsi55 commented May 22, 2024

Besu + Nimbus works completely flawlessly for me with a 2TB MX500, consistent 99+% Effectiveness.

@bdermody

bdermody commented Jun 5, 2024

These are mostly internal SSDs! Does anyone know if it is pretty much the same to use a 4TB external drive instead of an internal one? Can someone recommend one? Thank you so much!!

@yorickdowne
Author

People have had success with Thunderbolt to NVMe enclosures, but it’s obviously much, much slower than internal.

There isn’t a really good reason to choose external. We mostly hear it from macOS users but it’s actually far cheaper to buy a used x64 miniPC and put an internal drive in, than try to press an old MacBook into service.

@flnnhuman

KC3000 2TB works perfectly: 99% efficiency for a few weeks, and the temperature is 50°C without any heatsink.

@bdermody

bdermody commented Jun 6, 2024

@flnnhuman @yorickdowne

Thanks, you two!! I am thinking about this! My computer is slowly dying, I think, so I am buying a new powerful one for work! I think I will buy an internal drive, get it installed in my current computer, and use my new one for work! Or I might just try an external one even if it's slower! This is for my own personal interest and learning, so I don't need really good efficiency! :)

I will keep Thunderbolt/NVMe enclosures and the KC3000 in mind...

@sebastiandanconia

Crucial MX500 2TB 3D NAND SATA, P/N CT2000MX500SSD1:

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75; rm test
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.36
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=170MiB/s,w=55.8MiB/s][r=43.5k,w=14.3k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=58013: Tue Aug  6 20:41:22 2024
  read: IOPS=40.1k, BW=157MiB/s (164MB/s)(113GiB/735227msec)
   bw (  KiB/s): min=56120, max=178656, per=100.00%, avg=160549.40, stdev=16317.69, samples=1470
   iops        : min=14030, max=44664, avg=40137.29, stdev=4079.44, samples=1470
  write: IOPS=13.4k, BW=52.2MiB/s (54.8MB/s)(37.5GiB/735227msec); 0 zone resets
   bw (  KiB/s): min=18936, max=59840, per=100.00%, avg=53507.94, stdev=5464.86, samples=1470
   iops        : min= 4734, max=14960, avg=13376.91, stdev=1366.21, samples=1470
  cpu          : usr=11.35%, sys=37.07%, ctx=20912690, majf=0, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=157MiB/s (164MB/s), 157MiB/s-157MiB/s (164MB/s-164MB/s), io=113GiB (121GB), run=735227-735227msec
  WRITE: bw=52.2MiB/s (54.8MB/s), 52.2MiB/s-52.2MiB/s (54.8MB/s-54.8MB/s), io=37.5GiB (40.3GB), run=735227-735227msec

Disk stats (read/write):
    dm-0: ios=29488516/9828823, sectors=235908176/78630008, merge=0/0, ticks=33612141/11641385, in_queue=45253526, util=100.00%, aggrios=29483985/9828644, aggsectors=235938760/78640656, aggrmerge=8354/1510, aggrticks=33412161/11652208, aggrin_queue=45069154, aggrutil=100.00%
  sda: ios=29483985/9828644, sectors=235938760/78640656, merge=8354/1510, ticks=33412161/11652208, in_queue=45069154, util=100.00%
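
For comparing pasted results like these, a small sketch that pulls the headline read/write IOPS out of fio's human-readable output. The regular expression and the function name are my own, and the handling of the `k`/`M` suffixes is an assumption about how fio abbreviates large values.

```python
import re

# Matches fio per-direction summary lines, e.g. "read: IOPS=40.1k, BW=157MiB/s ...".
# Assumption: large values are abbreviated with a "k" or "M" suffix.
_SUMMARY = re.compile(r"^\s*(read|write):\s*IOPS=([0-9.]+)([kM]?)", re.MULTILINE)
_MULTIPLIER = {"": 1, "k": 1_000, "M": 1_000_000}


def parse_fio_iops(output: str) -> dict[str, int]:
    """Return {'read': iops, 'write': iops} from fio's human-readable output."""
    return {
        m.group(1): round(float(m.group(2)) * _MULTIPLIER[m.group(3)])
        for m in _SUMMARY.finditer(output)
    }


# The two summary lines from the MX500 run above.
sample = (
    "  read: IOPS=40.1k, BW=157MiB/s (164MB/s)(113GiB/735227msec)\n"
    "  write: IOPS=13.4k, BW=52.2MiB/s (54.8MB/s)(37.5GiB/735227msec); 0 zone resets\n"
)
print(parse_fio_iops(sample))  # {'read': 40100, 'write': 13400}
```

For anything beyond a quick comparison, running fio with `--output-format=json` and reading the numbers from the JSON is likely more robust than scraping the text.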

@dbeal-eth

4x SAMSUNG MZQL27T6HBLA-00A07 8TB

on LVM RAID0

test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=418MiB/s,w=138MiB/s][r=107k,w=35.2k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=1791624: Fri Sep 20 06:50:44 2024
  read: IOPS=106k, BW=415MiB/s (435MB/s)(113GiB/277427msec)
   bw (  KiB/s): min=412601, max=433600, per=100.00%, avg=425912.77, stdev=2952.66, samples=554
   iops        : min=103150, max=108400, avg=106478.01, stdev=738.21, samples=554
  write: IOPS=35.4k, BW=138MiB/s (145MB/s)(37.5GiB/277427msec); 0 zone resets
   bw (  KiB/s): min=137098, max=145432, per=100.00%, avg=141948.63, stdev=1246.79, samples=554
   iops        : min=34274, max=36358, avg=35486.98, stdev=311.72, samples=554
  cpu          : usr=14.71%, sys=73.15%, ctx=8764196, majf=0, minf=10
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=415MiB/s (435MB/s), 415MiB/s-415MiB/s (435MB/s-435MB/s), io=113GiB (121GB), run=277427-277427msec
  WRITE: bw=138MiB/s (145MB/s), 138MiB/s-138MiB/s (145MB/s-145MB/s), io=37.5GiB (40.3GB), run=277427-277427msec

Disk stats (read/write):
    dm-6: ios=29470675/9822052, merge=0/0, ticks=2172516/114900, in_queue=2287416, util=100.00%, aggrios=7373081/2457323, aggrmerge=0/0, aggrticks=537561/26680, aggrin_queue=564241, aggrutil=100.00%
    dm-4: ios=7374797/2455609, merge=0/0, ticks=536436/26820, in_queue=563256, util=100.00%, aggrios=7375991/2455677, aggrmerge=431/31, aggrticks=538438/28509, aggrin_queue=566947, aggrutil=100.00%
  nvme2n1: ios=7375991/2455677, merge=431/31, ticks=538438/28509, in_queue=566947, util=100.00%
    dm-2: ios=7372459/2457946, merge=0/0, ticks=538468/27240, in_queue=565708, util=100.00%, aggrios=7431982/2460076, aggrmerge=450/55, aggrticks=546460/28955, aggrin_queue=575442, aggrutil=100.00%
  nvme0n1: ios=7431982/2460076, merge=450/55, ticks=546460/28955, in_queue=575442, util=100.00%
    dm-5: ios=7373249/2457155, merge=0/0, ticks=538392/26592, in_queue=564984, util=100.00%, aggrios=7374415/2457222, aggrmerge=421/35, aggrticks=538730/28707, aggrin_queue=567438, aggrutil=100.00%
  nvme3n1: ios=7374415/2457222, merge=421/35, ticks=538730/28707, in_queue=567438, util=100.00%
    dm-3: ios=7371821/2458585, merge=0/0, ticks=536948/26068, in_queue=563016, util=100.00%, aggrios=7372991/2458649, aggrmerge=454/28, aggrticks=538861/28691, aggrin_queue=567552, aggrutil=100.00%
  nvme1n1: ios=7372991/2458649, merge=454/28, ticks=538861/28691, in_queue=567552, util=100.00%

I also have a Samsung 990 Pro w/ Heatsink in my laptop; running the same test gave the same results as SnoepNFTs reported: https://gist.github.com/yorickdowne/f3a3e79a573bf35767cd002cc977b038?permalink_comment_id=4958391#gistcomment-4958391
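
As a side note, the disk stats above can be used to check how evenly the random I/O was spread over the four stripes. A small sketch using the per-device read counts from those stats; the function name is made up for illustration.

```python
def stripe_shares(per_device_ios: dict[str, int]) -> dict[str, float]:
    """Fraction of all I/Os that each device served."""
    total = sum(per_device_ios.values())
    return {dev: ios / total for dev, ios in per_device_ios.items()}


# Read I/O counts per NVMe device, copied from the disk stats above.
reads = {
    "nvme0n1": 7431982,
    "nvme1n1": 7372991,
    "nvme2n1": 7375991,
    "nvme3n1": 7374415,
}

for dev, share in stripe_shares(reads).items():
    print(f"{dev}: {share:.1%}")
```

Each drive ends up with roughly a quarter of the reads, so in this run the stripe set looks evenly loaded.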

@h0m3us3r

Solidigm P44 Pro 2TB

I get quite a bit better results on my P44 Pro 2TB (250k/83k; zero tuning):

$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.36
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=838MiB/s,w=279MiB/s][r=215k,w=71.4k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=12922: Sun Sep 29 08:48:05 2024
  read: IOPS=250k, BW=977MiB/s (1025MB/s)(113GiB/117867msec)
   bw (  KiB/s): min=809536, max=1110872, per=100.00%, avg=1001964.77, stdev=99876.67, samples=235
   iops        : min=202384, max=277718, avg=250491.18, stdev=24969.18, samples=235
  write: IOPS=83.4k, BW=326MiB/s (342MB/s)(37.5GiB/117867msec); 0 zone resets
   bw (  KiB/s): min=271720, max=371000, per=100.00%, avg=333941.07, stdev=33374.23, samples=235
   iops        : min=67930, max=92750, avg=83485.27, stdev=8343.57, samples=235
  cpu          : usr=15.19%, sys=69.35%, ctx=8336094, majf=0, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=977MiB/s (1025MB/s), 977MiB/s-977MiB/s (1025MB/s-1025MB/s), io=113GiB (121GB), run=117867-117867msec
  WRITE: bw=326MiB/s (342MB/s), 326MiB/s-326MiB/s (342MB/s-342MB/s), io=37.5GiB (40.3GB), run=117867-117867msec

Disk stats (read/write):
  nvme0n1: ios=29461506/9818976, sectors=235692056/78552008, merge=0/25, ticks=3013987/57806, in_queue=3071817, util=71.84%
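
The headline IOPS figures can be sanity-checked from the `issued rwts` line and the runtime. A short worked sketch using the numbers from this run; the helper name is made up for illustration.

```python
def iops(issued: int, runtime_msec: int) -> float:
    """Average I/O operations per second over the whole run."""
    return issued / (runtime_msec / 1000)


# From "issued rwts: total=29492326,9829274" and "run=117867-117867msec" above.
reads, writes, runtime_msec = 29_492_326, 9_829_274, 117_867

print(f"read IOPS  ~ {iops(reads, runtime_msec):,.0f}")
print(f"write IOPS ~ {iops(writes, runtime_msec):,.0f}")

# --size=150G with --bs=4k gives the total number of 4 KiB operations,
# and --rwmixread=75 sends about three quarters of them to reads.
total_ops = 150 * 1024**3 // 4096
print(reads + writes == total_ops)
print(f"read share ~ {reads / total_ops:.2%}")
```

This also explains why every run in this thread shows the same `issued rwts` totals: they are fixed by the file size, block size and read/write mix, so only the runtime differs between drives.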

@Saraeutsza

.

@Lexazan

Lexazan commented Oct 21, 2024

ADATA XPG GAMMIX S70 BLADE 4TB (AGAMMIXS70B-4T-CS)
301k / 100k

Got it from Amazon USA in October 2024
Tested on a Framework Laptop 13 (11th Gen Intel), on an empty ext4 drive, booted from USB

The DRAM chip got a bit hot, so a heatsink and a cooler look like a good idea.

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75; rm test
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=1176MiB/s,w=394MiB/s][r=301k,w=101k IOPS][eta 00m:00s] 
test: (groupid=0, jobs=1): err= 0: pid=3884: Mon Oct 21 07:46:53 2024
  read: IOPS=301k, BW=1176MiB/s (1234MB/s)(113GiB/97929msec)
   bw (  MiB/s): min=    1, max= 1307, per=100.00%, avg=1177.29, stdev=110.59, samples=195
   iops        : min=  389, max=334608, avg=301386.99, stdev=28309.82, samples=195
  write: IOPS=100k, BW=392MiB/s (411MB/s)(37.5GiB/97929msec); 0 zone resets
   bw (  KiB/s): min=97520, max=449912, per=100.00%, avg=403870.42, stdev=24508.67, samples=194
   iops        : min=24380, max=112478, avg=100967.57, stdev=6127.17, samples=194
  cpu          : usr=18.97%, sys=65.46%, ctx=8422129, majf=0, minf=6
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=1176MiB/s (1234MB/s), 1176MiB/s-1176MiB/s (1234MB/s-1234MB/s), io=113GiB (121GB), run=97929-97929msec
  WRITE: bw=392MiB/s (411MB/s), 392MiB/s-392MiB/s (411MB/s-411MB/s), io=37.5GiB (40.3GB), run=97929-97929msec

Disk stats (read/write):
  nvme0n1: ios=29458080/9817976, merge=0/34, ticks=1449679/142800, in_queue=1592483, util=99.51%

@SnoepNFTs

Tested SSD:

  • Emtec X400-10 Power pro

Managed to get the following results:

Emtec X400-10 Power pro
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.36
Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][99.8%][r=324MiB/s,w=107MiB/s][r=82.9k,w=27.4k IOPS][eta 00m:01s] 
test: (groupid=0, jobs=1): err= 0: pid=3257: Wed Nov  6 20:46:37 2024
  read: IOPS=60.0k, BW=234MiB/s (246MB/s)(113GiB/491712msec)
   bw (  KiB/s): min=165432, max=366584, per=99.97%, avg=239835.00, stdev=15147.15, samples=982
   iops        : min=41358, max=91646, avg=59958.75, stdev=3786.78, samples=982
  write: IOPS=20.0k, BW=78.1MiB/s (81.9MB/s)(37.5GiB/491712msec); 0 zone resets
   bw (  KiB/s): min=53960, max=120912, per=99.97%, avg=79933.04, stdev=5104.43, samples=982
   iops        : min=13490, max=30228, avg=19983.25, stdev=1276.11, samples=982
  cpu          : usr=18.32%, sys=41.47%, ctx=8479778, majf=0, minf=9
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=234MiB/s (246MB/s), 234MiB/s-234MiB/s (246MB/s-246MB/s), io=113GiB (121GB), run=491712-491712msec
  WRITE: bw=78.1MiB/s (81.9MB/s), 78.1MiB/s-78.1MiB/s (81.9MB/s-81.9MB/s), io=37.5GiB (40.3GB), run=491712-491712msec

Disk stats (read/write):
  nvme0n1: ios=29465216/9820893, sectors=235723592/78573680, merge=0/595, ticks=27622647/606658, in_queue=28233556, util=63.66%

Additional Notes:

  • A non-mainstream drive and therefore a cheap SSD (especially second hand)
  • Gets burning hot under load

@kewlfft

kewlfft commented Nov 10, 2024

Kingston FURY Renegade 4TB, PCIe 3.0

Heat

The temperature under load, as displayed by nvme smart-log /dev/nvme0, was reaching 80°C without a heatsink. I bought a basic low-profile heatsink for $8 (Thermalright M.2), and now the temperature does not exceed 50°C under heavy load. It's worth it to eliminate the chance of damage or throttling.
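
If you want to script this check, here is a minimal sketch that reads the temperature from the JSON form of the same command (`nvme smart-log /dev/nvme0 --output-format=json`). It assumes the JSON has a top-level `temperature` field in Kelvin; the function names, the 50°C default and the sample readings are illustrative.

```python
import json


def temperature_celsius(smart_log_json: str) -> int:
    """Drive temperature in whole degrees Celsius.

    Assumption: the JSON has a top-level "temperature" field in Kelvin.
    """
    kelvin = json.loads(smart_log_json)["temperature"]
    return round(kelvin - 273.15)


def over_limit(temp_c: int, limit_c: int = 50) -> bool:
    """True if the drive is hotter than the chosen limit."""
    return temp_c > limit_c


# Hypothetical readings, roughly matching the before/after figures above.
for sample in ('{"temperature": 353}', '{"temperature": 323}'):
    temp = temperature_celsius(sample)
    print(temp, "over limit" if over_limit(temp) else "ok")
```

Polling this during a sync or a fio run would show whether the drive stays under the limit for the whole run, not just at idle.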

btrfs

I am getting 30k read / 10k write IOPS with btrfs. fio seems to be limited mostly by one core of my Intel Core i5 9500T running at 100%. It could be due to my btrfs setup (defaults; I tested with and without zstd:1 compression, which did not make much difference) or the way fio measures performance.

ext4

With ext4, I am now getting 180k read / 60k write (x6 compared to btrfs).

Additional Notes

Phoronix also benchmarked btrfs as significantly slower than f2fs, ext4 and xfs in some of their Aug '24 tests. This is not necessarily a fundamental btrfs issue; it may simply require more tuning.

Conclusion

Just to say that the filesystem and its configuration matter, not only noatime.
