Great and less great SSDs for Ethereum nodes

Overview

Syncing an Ethereum node depends heavily on the latency and IOPS (I/O operations per second) of the storage. Budget SSDs will struggle to an extent, and some won't be able to sync at all. IOPS can be used as a rough proxy for, or predictor of, latency; measuring latency directly is arguably better.

This document aims to snapshot some known good and known bad models.

The drive lists are ordered by interface, then by capacity, and then alphabetically by vendor name, not by preference. The lists are far from exhaustive. @mwpastore linked a filterable spreadsheet in the comments that has a far greater variety of drives and their characteristics. Filter it by DRAM yes, NAND Type TLC, Form Factor M.2, and the desired capacity.

For size, 4TB is a very conservative choice. The smaller 2TB drive should last an Ethereum full node until at least sometime in 2026, with pre-merge history expiry scheduled for May 1st, 2025. The Portal team aims to make 2TB last forever via EIP-4444. Remy wrote a migration guide to 4TB.

At a high level, QLC and DRAMless drives are far slower than "mainstream" SSDs. QLC also has lower endurance. Any savings will be gone when the drive fails early and needs to be replaced.

Other than a slow SSD model, these are things that can slow IOPS down:

  • Heat. Check with smartctl -x; the SSD should stay below 50°C so it does not throttle. A quick check is sketched after this list.
  • TRIM not being allowed. This can happen with some hardware RAID controllers, as well as on macOS with non-Apple SSDs.
  • ZFS, BTRFS, or any other copy-on-write (CoW) file system.
  • RAID5/6 - write amplification is no joke.
  • On SATA, the controller in UEFI/BIOS set to anything other than AHCI. Set it to AHCI for good performance.
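To catch heat throttling and inactive TRIM quickly, a minimal check might look like the following; /dev/nvme0n1 is an assumed device name (substitute yours from lsblk), and the fstrim.timer check assumes a systemd-based distro.

```bash
# Read drive temperature from SMART data; sustained values above ~50°C suggest throttling
sudo smartctl -x /dev/nvme0n1 | grep -i temperature

# Check that periodic TRIM is scheduled (most systemd distros run it weekly via this timer)
systemctl status fstrim.timer
```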

If you haven't already, do turn off atime on your DB volume, it'll increase SSD lifetime and speed things up a little bit.
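As a sketch, assuming an ext4 volume mounted at /var/lib/ethereum (the UUID, filesystem, and mount point are examples; adjust them to your setup):

```bash
# /etc/fstab - add noatime to the DB volume's mount options (UUID and path are examples)
# UUID=xxxx-xxxx  /var/lib/ethereum  ext4  defaults,noatime  0  2

# Apply without rebooting, then verify the active mount options:
sudo mount -o remount,noatime /var/lib/ethereum
findmnt /var/lib/ethereum
```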

Some users have reported that NUC instability with certain drives can be cured by adding nvme_core.default_ps_max_latency_us=0 pcie_aspm=off to their GRUB_CMDLINE_LINUX_DEFAULT kernel parameters via sudo nano /etc/default/grub and sudo update-grub. This keeps the drive from entering powersave states by itself.
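A sketch of the resulting GRUB config, assuming a stock Ubuntu default of "quiet splash"; keep whatever parameters your system already has and append the two new ones:

```bash
# In /etc/default/grub, append the two parameters to the existing line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvme_core.default_ps_max_latency_us=0 pcie_aspm=off"

# Regenerate the GRUB config and reboot for the change to take effect:
sudo update-grub
sudo reboot
```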

The Good

"Mainstream" and "Performance" drive models that can sync mainnet execution layer clients in a reasonable amount of time.

  • Higher endurance (TBW) than most: Seagate Firecuda 530, WD Red SN700
  • Lowest power draw: SK Hynix P31 Gold - is a great choice for Rock5 B and other low-power devices, but 2TB only

We've started crowd-sourcing some IOPS numbers. If you want to join the fun, run fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75; rm test and give us the read and write IOPS. During this test, once it's within maybe a minute of finishing, also run sudo ioping -D -c 30 /dev/<ssd-device>, where lsblk shows the <ssd-device> - often nvme0n1.
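For copy-paste convenience, the same commands as a block. The fio run writes a 150G test file in the current directory, so run it from a directory on the SSD under test; nvme0n1 is an assumed device name.

```bash
# 4k random read/write, 75% reads, queue depth 64 - report the read and write IOPS
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test \
    --filename=test --bs=4k --iodepth=64 --size=150G --readwrite=randrw --rwmixread=75
rm test

# In a second terminal, roughly a minute before fio finishes, measure latency directly
sudo ioping -D -c 30 /dev/nvme0n1
```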

Geth is reported to benefit from latency of around 300us (microseconds) or lower. Not all clients are created equal: some need lower latency, and some will be OK with higher.

For reference, AWS gp3 with 16k IOPS provisioned measured ~425/615/1300us min/avg/max latency while Geth was state-healing. A Samsung data center NVMe measured 25/75/150us min/avg/max latency with Geth in steady state, and the same hardware measured 25/80/265us min/avg/max latency during fio.

Another good measurement is the newPayloadVx quantile in the execution layer client's metrics during normal operation: you'd want the 95th or even 99th percentile to be below 500ms.
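As a starting point for finding the relevant series, and assuming Geth with --metrics enabled on its default metrics port 6060; exact metric names vary by client and version, so the grep pattern is only a guess:

```bash
# List exported Prometheus series mentioning newPayload (names vary by client/version)
curl -s http://localhost:6060/debug/metrics/prometheus | grep -i newpayload
```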

If you have room for it and need an excellent heatsink, consider the "Rocket NVMe Heatsink". It is quite tall, however, and may not fit in some miniPC cases.

Hardware

M.2 NVMe "Mainstream" - TLC, DRAM, PCIe 3, 4TB drives

  • Any data center/enterprise NVMe SSD
  • Teamgroup MP34, between 94k/31k and 118k/39k r/w IOPS
  • WD Red SN700, 141k/47k r/w IOPS

M.2 NVMe "Performance" - TLC, DRAM, PCIe 4 or 5, 4TB drives

  • Any data center/enterprise NVMe SSD
  • Acer GM7000 "Predator", 125k/41k r/w IOPS
  • ADATA XPG Gammix S70, 272k/91k r/w IOPS
  • Corsair Force MP600 Pro and variants (but not "MP600 Core XT"), 138k/46k r/w IOPS
  • Crucial T700, 215k/71k r/w IOPS
  • Kingston KC3000, 377k/126k r/w IOPS
  • Kingston Fury Renegade, 211k/70k r/w IOPS
  • Mushkin Redline Vortex (but not LX)
  • Sabrent Rocket 4 Plus, 149k/49k r/w IOPS. @SnoepNFTs reports the Rocket NVMe Heatsink keeps it very cool.
  • Samsung 990 Pro, 124k/41k r/w IOPS - there are reports of 990 Pro rapidly losing health. A firmware update to 1B2QJXD7 is meant to stop the rapid degradation, but won't reverse any that happened on earlier firmware.
  • Seagate Firecuda 530, 218k/73k r/w IOPS
  • Teamgroup MP44 (but not MP44L or MP44Q), 105k/35k r/w IOPS - caution that this is DRAMless and uses a Host Memory Buffer (HMB), yet appears to perform fine.
  • Transcend 250s, 127k/42k r/w IOPS. @SnoepNFTs reports it gets very hot, you'd want to add a good heatsink to it.
  • WD Black SN850X, 101k/33k r/w IOPS

M.2 NVMe "Mainstream" - TLC, DRAM, PCIe 3, 2TB drives

  • Any data center/enterprise NVMe SSD
  • AData XPG Gammix S11/SX8200 Pro. Several hardware revisions. It's slower than some QLC drives. 68k/22k r/w IOPS
  • AData XPG Gammix S50 Lite
  • HP EX950
  • Mushkin Pilot-E
  • Samsung 970 EVO Plus 2TB, pre-rework (firmware 2B2QEXM7). 140k/46k r/w IOPS
  • Samsung 970 EVO Plus 2TB, post-rework (firmware 3B2QEXM7 or 4B2QEXM7). In testing this syncs just as quickly as the pre-rework drive
  • SK Hynix P31 Gold
  • WD Black SN750 (but not SN750 SE)

M.2 NVMe "Performance" - TLC, DRAM, PCIe 4 or 5, 2TB drives

  • Any data center/enterprise NVMe SSD
  • Crucial P5 Plus
  • Kingston KC2000
  • Samsung 980 Pro (not 980) - a firmware update to 5B2QGXA7 is necessary to keep them from dying, if they are on firmware 3B2QGXA7. Samsung's bootable update image is a bit broken; you may want to flash from your own Linux install.
  • SK Hynix P41 Platinum / Solidigm P44 Pro, 99k/33k r/w IOPS
  • WD Black SN850

Cloud

  • Any baremetal/dedicated server service
  • AWS i3en.(2)xlarge or is4gen.xlarge

The Bad

These "Budget" drive models are reportedly too slow to sync (all) mainnet execution layer clients.

Hardware

  • AData S40G/SX8100 4TB, QLC - the 2TB model is TLC and should be fine; 4TB is reportedly too slow
  • Crucial P1, QLC - users report it can't sync Nethermind
  • Crucial P2 and P3 (Plus), QLC and DRAMless - users report it can't sync Nethermind, 27k/9k r/w IOPS
  • Kingston NV1 - probably QLC and DRAMless and thus too slow on 2TB, but could be "anything" as Kingston do not guarantee specific components.
  • Kingston NV2 - like NV1 no guaranteed components
  • WD Green SN350, QLC and DRAMless
  • Anything that is both QLC and DRAMless will likely not be able to sync at all, or will not consistently keep up with the chain head
  • Crucial BX500 SATA, HP S650 SATA, probably most SATA budget drives
  • Samsung 980, DRAMless - unsure, this may belong in "Ugly". If you have one and can say for sure, please come to ethstaker Discord.
  • Samsung T7 USB, even with current firmware

The Ugly

"Budget" drive models that reportedly can sync mainnet execution layer clients, if slowly.

Note that QLC drives usually have a markedly lower TBW than TLC, and will fail earlier.

Hardware

  • Corsair MP400, QLC
  • Inland Professional 3D NAND, QLC
  • Intel 660p, QLC. It's faster than some "mainstream" drives. 98k/33k r/w IOPS
  • Seagate Barracuda Q5, QLC
  • WD Black SN770, DRAMless
  • Samsung 870 QVO SATA, QLC

2.5" SATA "Mainstream" - TLC, DRAM

  • These have been moved to "ugly" because there are user reports that only Nimbus/Geth will now sync on SATA, and even that takes 3 days. It looks like after Dencun, NVMe is squarely the way to go.
  • Any data center/enterprise SATA SSD
  • Crucial MX500 SATA, 46k/15k r/w IOPS
  • Samsung 860 EVO SATA, 55k/18k r/w IOPS
  • Samsung 870 EVO SATA, 63k/20k r/w IOPS
  • WD Blue 3D NAND SATA

Cloud

  • AWS gp3 with 16k IOPS provisioned and an m7i/a.xlarge - works, but latency is roughly 2x higher than hoped for
  • Netcup RS G11 Servers. Impressively fast, but performance still depends on your neighbors on the shared host.
  • Contabo SSD - reportedly able to sync Geth 1.13.0 and Nethermind, if slowly
  • Netcup VPS Servers - reportedly able to sync Geth 1.13.0 and Nethermind, if slowly
  • Contabo NVMe - fast enough but not enough space. 800 GiB is not sufficient.
Comments

@0xthedance:
I've gotten bad performance with the ORICO O7000 7000MB/s PCIe 4.0 M.2 NVMe SSD:

Starting 1 process
test: Laying out IO file (1 file / 153600MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=92.8MiB/s,w=31.0MiB/s][r=23.8k,w=7940 IOPS][eta 00m:01s]
test: (groupid=0, jobs=1): err= 0: pid=771210: Fri Feb 21 23:27:03 2025
  read: IOPS=8574, BW=33.5MiB/s (35.1MB/s)(113GiB/3439539msec)
   bw (  KiB/s): min=  240, max=167528, per=100.00%, avg=34326.06, stdev=33899.66, samples=6871
   iops        : min=   60, max=41882, avg=8581.38, stdev=8474.95, samples=6871
  write: IOPS=2857, BW=11.2MiB/s (11.7MB/s)(37.5GiB/3439539msec); 0 zone resets
   bw (  KiB/s): min=   64, max=55552, per=100.00%, avg=11440.04, stdev=11298.15, samples=6871
   iops        : min=   16, max=13888, avg=2859.87, stdev=2824.55, samples=6871
  cpu          : usr=2.14%, sys=11.72%, ctx=5283064, majf=0, minf=9
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=29492326,9829274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=33.5MiB/s (35.1MB/s), 33.5MiB/s-33.5MiB/s (35.1MB/s-35.1MB/s), io=113GiB (121GB), run=3439539-3439539msec
  WRITE: bw=11.2MiB/s (11.7MB/s), 11.2MiB/s-11.2MiB/s (11.7MB/s-11.7MB/s), io=37.5GiB (40.3GB), run=3439539-3439539msec

Disk stats (read/write):
    dm-0: ios=29486898/9831710, sectors=235895784/78649312, merge=0/0, ticks=30241793/179121489, in_queue=209363282, util=100.00%, aggrios=29492363/9832622, aggsectors=235939504/78664832, aggrmerge=0/1028, aggrticks=30182892/179187256, aggrin_queue=209380926, aggrutil=89.88%
  nvme0n1: ios=29492363/9832622, sectors=235939504/78664832, merge=0/1028, ticks=30182892/179187256, in_queue=209380926, util=89.88%

@galaxy-tbehrens:

@0xthedance Sounds truly atrocious and shows the performance hit that QLC/DRAMless can impart.
