Please refer to ZFS 2-3x slower than EXT4 to see how ZFS defaults + Postgres defaults severely underperform EXT4 defaults + Postgres defaults (and also to know more about the system on which these benchmarks were performed). This page documents how to tune ZFS + Postgres to give better performance for the tpcb-like benchmark.
Please do not copy these settings blindly because I am myself not clear on why/how these settings had the impact they did. For example, I cannot explain why full_page_writes=off
independently did not give that much boost, nor did an optimized PG configuration. However, putting both of them together gave a 2-4x boost compared to baseline numbers.
You'd want to compare the numbers on the first row and the last row to get a sense of the maximum performance gains that I could achieve.
+------------------------+-------+--------+--------+--------+--------+--------+--------+
| client | 1 | 4 | 8 | 12 | 24 | 48 | 96 |
+------------------------+-------+--------+--------+--------+--------+--------+--------+
| EXT4 defaults | 719 | 1816 | 2835 | 3576 | 4531 | 5485 | 6876 |
| cold (warm) | (763) | (1982) | (3286) | (3889) | (5071) | (5604) | (6540) |
+------------------------+-------+--------+--------+--------+--------+--------+--------+
| ZFS benchmarks |
+--------------------------------------------------------------------------------------+
| PG defaults | 302 | 808 | 1138 | 1287 | 1619 | 2575 | 4040 |
| cold (warm) | (309) | (836) | (1140) | (1307) | (1651) | (2621) | (4301) |
+------------------------+-------+--------+--------+--------+--------+--------+--------+
| Optimized PG conf [1] | 327 | 1365 | 2291 | | | | |
| full_page_write=on | (334) | (1331) | (2049) | | | | |
| cold (warm) | | | | | | | |
+------------------------+-------+--------+--------+--------+--------+--------+--------+
| PG defaults | | 878 | | 1334 | | | |
| + kernel tuning [2] | | (882) | | (1363) | | | |
| cold (warm) | | | | | | | |
+------------------------+-------+--------+--------+--------+--------+--------+--------+
| PG defaults | | 957 | | 1372 | 2130 | | |
| + full_page_writes=off | | (959) | | (1399) | (2154) | | |
| + kernel tuning [2] | | | | | | | |
| cold (warm) | | | | | | | |
+------------------------+-------+--------+--------+--------+--------+--------+--------+
| Optimized PG conf [1] | | 2335 | | 4751 | 6969 | 7627 | 8429 |
| + full_page_writes=off | | (2488) | | (5072) | (5667) | (6485) | (7720) |
| + kernel tuning [2] | | | | | | | |
| cold (warm) | | | | | | | |
+------------------------+-------+--------+--------+--------+--------+--------+--------+
zpool create -f -m /firstzfs firstzfs -o ashift=12 /dev/nvme1n1p2
zfs create firstzfs/postgres \
-o mountpoint=/firstzfs/postgres \
-o atime=off \
-o canmount=on \
-o compression=lz4 \
-o quota=100G \
-o recordsize=8k \
-o logbias=throughput
- Give more memory to PG for
shared_buffers
- Tweak the IO parameters for SSDs
max_connections = 100
shared_buffers = 16GB
effective_cache_size = 48GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 41943kB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 8
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
To be honest, I don't completely understand what this really does, but apparently it has a big positive impact for Postgres + ZFS.
full_page_writes=off
- Set CPU governor to
performance
. I noticed that multiple machines had this set topowersave
instead, which keeps reducing the CPU speed in periods of inactivity, and increases the CPU speed, when required, but only after a slight delay. In an old benchmark (not reported above), I think I saw an improvement of ~5% due to this tweak alone.
cpufreq-set -g performance
- Change drive read-ahead to
4096
. Warning: This is recommended in the bookPostgeSQL 10 - High Performance
, but I am not sure how relevant this is for ZFS, because ZFS has about 10+ (really!) tunable parameters for prefetching of data.
blockdev --setra 4096 /dev/sda
- Change swapiness. Caveat: I haven't even configured swap on the benchmarking system, but I did this anyway, for sake of completeness. Again, another recommendation from
PostgreSQL 10 - High Performance
echo 0 > /proc/sys/vm/swappiness