Skip to content

Instantly share code, notes, and snippets.

@ryogrid
Last active April 6, 2025 23:58
Show Gist options
  • Save ryogrid/90d405110615fc19313a58faec5fe73d to your computer and use it in GitHub Desktop.
Save ryogrid/90d405110615fc19313a58faec5fe73d to your computer and use it in GitHub Desktop.
postgresのio_uringを有効にした時のパフォーマンスを見てみる
注意:まだベータにもなっていない段階のコードベースで、設定も精査せず、ベンチマーククライアントも同一マシンに置いて、とりあえず測定をしてみた、というだけのものなので、その程度のものであり、結果であるというところは、ご承知おき下さい。
●前提情報
■実施日
2025/4/5
■測定環境(自作デスクトップ)
AMD Ryzen 7 5700X 8-Core Processor 4.50 GHz
Memory 64GB
WD_Black SN770 NVMe WDS100T3X0E 1TB M.2 PCI-Express Gen4
(マザボはGen3までのサポート...)
Windows 11 Pro x64 24H2
WSL2
Ubuntu 22.04
- liburingのdebパッケージが20.04ではインストールできても postgres のビルドが失敗したため、 do-release-upgrade コマンドで 22.04 にアップグレードした
ryo@DESKTOP-IOASPN6:~$ uname -a
Linux DESKTOP-IOASPN6 5.15.167.4-microsoft-standard-WSL2 #1 SMP Tue Nov 5 00:21:55 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
ryo@DESKTOP-IOASPN6:~/$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
ryo@DESKTOP-IOASPN6:~/$ sudo apt-cache show liburing2
Package: liburing2
Status: install ok installed
Priority: optional
Section: libs
Installed-Size: 96
Maintainer: Ubuntu Developers <[email protected]>
Architecture: amd64
Multi-Arch: same
Source: liburing
Version: 2.5-1build1
Depends: libc6 (>= 2.4)
Description-en: Linux kernel io_uring access library - shared library
This library provides helpers to setup and teardown io_uring instances,
and also a simplified interface for applications that do not need (or
want) to deal with the full kernel side implementation.
.
This package contains the shared library.
Description-md5: c13d0621ef06a82a3a071dc62bd6e7a3
Homepage: https://github.com/axboe/liburing
Original-Maintainer: Guillem Jover <[email protected]>
ryo@DESKTOP-IOASPN6:~/$ sudo apt-cache show liburing-dev
Package: liburing-dev
Status: install ok installed
Priority: optional
Section: libdevel
Installed-Size: 479
Maintainer: Ubuntu Developers <[email protected]>
Architecture: amd64
Multi-Arch: same
Source: liburing
Version: 2.5-1build1
Depends: liburing2 (= 2.5-1build1), libc6-dev | libc-dev, linux-libc-dev (>= 5.1)
Description-en: Linux kernel io_uring access library - development files
This library provides helpers to setup and teardown io_uring instances,
and also a simplified interface for applications that do not need (or
want) to deal with the full kernel side implementation.
.
This package contains the static library and the header files.
Description-md5: 920173f8b04b2b4fddbbff6162ef6c52
Homepage: https://github.com/axboe/liburing
Original-Maintainer: Guillem Jover <[email protected]>
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ gcc --version
gcc (Ubuntu 11.4.0-2ubuntu1~20.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ g++ --version
g++ (Ubuntu 11.4.0-2ubuntu1~20.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
■測定に利用したコードベース
以下のコミットがHEADである時点のmasterのコードベースを用いた。
commit 5db3bf7391d77ae86bc9b5f580141e022803b744 (HEAD -> master, origin/master, origin/HEAD)
Author: Andrew Dunstan <[email protected]>
Date: Sat Apr 5 08:00:24 2025 -0400
Clean up from commit 1495eff7bdb
Fix some comments, and remove the hacky way of quoting database names in
favor of appendStringLiteralConn.
■ビルドと測定時のコマンドライン他
./configure --with-liburing --prefix=/home/ryo/work/postgres/local_install
max_connection は500に変更
create database benchdb;
pgbench -i -s 2000 benchdb -U postgres
pgbench -c 300 -j 16 -T 600 benchdb -U postgres
スケールファクター 2000
300クライアントが全力でトランザクション投げる(1つ終わったら次)状態を模擬
pgbench自体は16スレッドで動作
測定時間10分
ワークロードはTPC-B(相当のもの)
- 金融機関での入出金をモデルしたSELECT、INSERT、UPDATEから成るショートトランザクション
- writeヘビー
■測定方法
pgbenchとpostgresを同一マシンのWSL2環境で実行
■(ほぼ)デフォルト設定で測定
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -i -s 2000 benchdb -U postgres
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
vacuuming...
creating primary keys...
done in 283.03 s (drop tables 0.00 s, create tables 0.00 s, client-side generate 212.85 s, vacuum 0.66 s, primary keys 69.51 s)
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -c 300 -j 16 -T 600 benchdb -U postgres
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 2000
query mode: simple
number of clients: 300
number of threads: 16
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 4518782
number of failed transactions: 0 (0.000%)
latency average = 39.829 ms
initial connection time = 157.999 ms
tps = 7532.117944 (without initial connection time)
●測定結果
■io_uringを有効化して測定
設定されていなかった io_method を io_uring に変更。
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -i -s 2000 benchdb -U postgres
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...
creating primary keys...
done in 371.86 s (drop tables 3.75 s, create tables 0.01 s, client-side generate 273.86 s, vacuum 0.48 s, primary keys 93.76 s).
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -c 300 -j 16 -T 600 benchdb -U postgres
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 2000
query mode: simple
number of clients: 300
number of threads: 16
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 4380907
number of failed transactions: 0 (0.000%)
latency average = 41.076 ms
initial connection time = 378.667 ms
tps = 7303.471992 (without initial connection time)
■io_method指定を無しの状態に戻して3回目の測定
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -i -s 2000 benchdb -U postgres
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...
creating primary keys...
done in 335.85 s (drop tables 3.77 s, create tables 0.01 s, client-side generate 250.83 s, vacuum 0.53 s, primary keys 80.72 s).
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -c 300 -j 16 -T 600 benchdb -U postgres
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 2000
query mode: simple
number of clients: 300
number of threads: 16
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 4267862
number of failed transactions: 0 (0.000%)
latency average = 42.187 ms
initial connection time = 133.108 ms
tps = 7111.139640 (without initial connection time)
■io_methodの指定無しのまま checkpoint_flush_after = 0、checkpoint_timeout = 1d、 max_wal_size = 1000GBとしてチェックポイント止める
注: 初期化時は設定変更前だったのでチェックポイントが起きている
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -i -s 2000 benchdb -U postgres
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...
creating primary keys...
done in 385.32 s (drop tables 3.79 s, create tables 0.01 s, client-side generate 292.35 s, vacuum 0.61 s, primary keys 88.56 s).
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -c 300 -j 16 -T 600 benchdb -U postgres
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 2000
query mode: simple
number of clients: 300
number of threads: 16
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 5117739
number of failed transactions: 0 (0.000%)
latency average = 35.309 ms
initial connection time = 155.324 ms
tps = 8496.444458 (without initial connection time)
■チェックポイントを止めたまま io_method を io_uring へ
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -i -s 2000 benchdb -U postgres
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...
creating primary keys...
done in 436.67 s (drop tables 4.33 s, create tables 0.02 s, client-side generate 335.94 s, vacuum 0.54 s, primary keys 95.84 s).
ryo@DESKTOP-IOASPN6:~/work/postgres/local_install$ bin/pgbench -c 300 -j 16 -T 600 benchdb -U postgres
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 2000
query mode: simple
number of clients: 300
number of threads: 16
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 6802246
number of failed transactions: 0 (0.000%)
latency average = 26.702 ms
initial connection time = 366.445 ms
tps = 11235.286581 (without initial connection time)
# ***最後の測定の時のもの***
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
# name = value
#
# (The "=" is optional.) Whitespace may be used. Comments are introduced with
# "#" anywhere on a line. The complete list of parameter names and allowed
# values can be found in the PostgreSQL documentation.
#
# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal. If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, run "pg_ctl reload", or execute
# "SELECT pg_reload_conf()". Some parameters, which are marked below,
# require a server shutdown and restart to take effect.
#
# Any parameter can also be given as a command-line option to the server, e.g.,
# "postgres -c log_connections=all". Some parameters can be changed at run time
# with the "SET" SQL command.
#
# Memory units: B = bytes Time units: us = microseconds
# kB = kilobytes ms = milliseconds
# MB = megabytes s = seconds
# GB = gigabytes min = minutes
# TB = terabytes h = hours
# d = days
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------
# The default values of these variables are driven from the -D command-line
# option or PGDATA environment variable, represented here as ConfigDir.
#data_directory = 'ConfigDir' # use data in another directory
# (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file
# (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
# (change requires restart)
# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '' # write an extra PID file
# (change requires restart)
#------------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#------------------------------------------------------------------------------
# - Connection Settings -
#listen_addresses = 'localhost' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
#port = 5432 # (change requires restart)
max_connections = 500 # (change requires restart)
#reserved_connections = 0 # (change requires restart)
#superuser_reserved_connections = 3 # (change requires restart)
#unix_socket_directories = '/tmp' # comma-separated list of directories
# (change requires restart)
#unix_socket_group = '' # (change requires restart)
#unix_socket_permissions = 0777 # begin with 0 to use octal notation
# (change requires restart)
#bonjour = off # advertise server via Bonjour
# (change requires restart)
#bonjour_name = '' # defaults to the computer name
# (change requires restart)
# - TCP settings -
# see "man tcp" for details
#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0 # TCP_KEEPCNT;
# 0 selects the system default
#tcp_user_timeout = 0 # TCP_USER_TIMEOUT, in milliseconds;
# 0 selects the system default
#client_connection_check_interval = 0 # time between checks for client
# disconnection while running queries;
# 0 for never
# - Authentication -
#authentication_timeout = 1min # 1s-600s
#password_encryption = scram-sha-256 # scram-sha-256 or md5
#scram_iterations = 4096
#md5_password_warnings = on
# GSSAPI using Kerberos
#krb_server_keyfile = 'FILE:${sysconfdir}/krb5.keytab'
#krb_caseins_users = off
#gss_accept_delegation = off
# - SSL -
#ssl = off
#ssl_ca_file = ''
#ssl_cert_file = 'server.crt'
#ssl_crl_file = ''
#ssl_crl_dir = ''
#ssl_key_file = 'server.key'
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed TLSv1.2 ciphers
#ssl_tls13_ciphers = '' # allowed TLSv1.3 cipher suites, blank for default
#ssl_prefer_server_ciphers = on
#ssl_groups = 'X25519:prime256v1'
#ssl_min_protocol_version = 'TLSv1.2'
#ssl_max_protocol_version = ''
#ssl_dh_params_file = ''
#ssl_passphrase_command = ''
#ssl_passphrase_command_supports_reload = off
# OAuth
#oauth_validator_libraries = '' # comma-separated list of trusted validator modules
#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------
# - Memory -
shared_buffers = 128MB # min 128kB
# (change requires restart)
#huge_pages = try # on, off, or try
# (change requires restart)
#huge_page_size = 0 # zero for system default
# (change requires restart)
#temp_buffers = 8MB # min 800kB
#max_prepared_transactions = 0 # zero disables the feature
# (change requires restart)
# Caution: it is not advisable to set max_prepared_transactions nonzero unless
# you actively intend to use prepared transactions.
#work_mem = 4MB # min 64kB
#hash_mem_multiplier = 2.0 # 1-1000.0 multiplier on hash table work_mem
#maintenance_work_mem = 64MB # min 64kB
#autovacuum_work_mem = -1 # min 64kB, or -1 to use maintenance_work_mem
#logical_decoding_work_mem = 64MB # min 64kB
#max_stack_depth = 2MB # min 100kB
#shared_memory_type = mmap # the default is the first option
# supported by the operating system:
# mmap
# sysv
# windows
# (change requires restart)
dynamic_shared_memory_type = posix # the default is usually the first option
# supported by the operating system:
# posix
# sysv
# windows
# mmap
# (change requires restart)
#min_dynamic_shared_memory = 0MB # (change requires restart)
#vacuum_buffer_usage_limit = 2MB # size of vacuum and analyze buffer access strategy ring;
# 0 to disable vacuum buffer access strategy;
# range 128kB to 16GB
# SLRU buffers (change requires restart)
#commit_timestamp_buffers = 0 # memory for pg_commit_ts (0 = auto)
#multixact_offset_buffers = 16 # memory for pg_multixact/offsets
#multixact_member_buffers = 32 # memory for pg_multixact/members
#notify_buffers = 16 # memory for pg_notify
#serializable_buffers = 32 # memory for pg_serial
#subtransaction_buffers = 0 # memory for pg_subtrans (0 = auto)
#transaction_buffers = 0 # memory for pg_xact (0 = auto)
# - Disk -
#temp_file_limit = -1 # limits per-process temp file space
# in kilobytes, or -1 for no limit
#max_notify_queue_pages = 1048576 # limits the number of SLRU pages allocated
# for NOTIFY / LISTEN queue
# - Kernel Resources -
#max_files_per_process = 1000 # min 64
# (change requires restart)
# - Background Writer -
#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100 # max buffers written/round, 0 disables
#bgwriter_lru_multiplier = 2.0 # 0-10.0 multiplier on buffers scanned/round
#bgwriter_flush_after = 512kB # measured in pages, 0 disables
# - I/O -
#backend_flush_after = 0 # measured in pages, 0 disables
#effective_io_concurrency = 16 # 1-1000; 0 disables issuing multiple simultaneous IO requests
#maintenance_io_concurrency = 16 # 1-1000; same as effective_io_concurrency
#io_max_combine_limit = 128kB # usually 1-128 blocks (depends on OS)
# (change requires restart)
#io_combine_limit = 128kB # usually 1-128 blocks (depends on OS)
#io_method = worker # worker, io_uring, sync
# (change requires restart)
io_method = io_uring
#io_max_concurrency = -1 # Max number of IOs that one process
# can execute simultaneously
# -1 sets based on shared_buffers
# (change requires restart)
#io_workers = 3 # 1-32;
# - Worker Processes -
#max_worker_processes = 8 # (change requires restart)
#max_parallel_workers_per_gather = 2 # limited by max_parallel_workers
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
#parallel_leader_participation = on
#------------------------------------------------------------------------------
# WRITE-AHEAD LOG
#------------------------------------------------------------------------------
# - Settings -
#wal_level = replica # minimal, replica, or logical
# (change requires restart)
#fsync = on # flush data to disk for crash safety
# (turning this off can cause
# unrecoverable data corruption)
#synchronous_commit = on # synchronization level;
# off, local, remote_write, remote_apply, or on
#wal_sync_method = fsync # the default is the first option
# supported by the operating system:
# open_datasync
# fdatasync (default on Linux and FreeBSD)
# fsync
# fsync_writethrough
# open_sync
#full_page_writes = on # recover from partial page writes
#wal_log_hints = off # also do full page writes of non-critical updates
# (change requires restart)
#wal_compression = off # enables compression of full-page writes;
# off, pglz, lz4, zstd, or on
#wal_init_zero = on # zero-fill new WAL files
#wal_recycle = on # recycle WAL files
#wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
# (change requires restart)
#wal_writer_delay = 200ms # 1-10000 milliseconds
#wal_writer_flush_after = 1MB # measured in pages, 0 disables
#wal_skip_threshold = 2MB
#commit_delay = 0 # range 0-100000, in microseconds
#commit_siblings = 5 # range 1-1000
# - Checkpoints -
#checkpoint_timeout = 5min # range 30s-1d
checkpoint_timeout = 1d
#checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0
#checkpoint_flush_after = 256kB # measured in pages, 0 disables
checkpoint_flush_after = 0
#checkpoint_warning = 30s # 0 disables
#max_wal_size = 1GB
max_wal_size = 1000GB
min_wal_size = 80MB
# - Prefetching during recovery -
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
# - Archiving -
#archive_mode = off # enables archiving; off, on, or always
# (change requires restart)
#archive_library = '' # library to use to archive a WAL file
# (empty string indicates archive_command should
# be used)
#archive_command = '' # command to use to archive a WAL file
# placeholders: %p = path of file to archive
# %f = file name only
# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0 # force a WAL file switch after this
# number of seconds; 0 disables
# - Archive Recovery -
# These are only used in recovery mode.
#restore_command = '' # command to use to restore an archived WAL file
# placeholders: %p = path of file to restore
# %f = file name only
# e.g. 'cp /mnt/server/archivedir/%f %p'
#archive_cleanup_command = '' # command to execute at every restartpoint
#recovery_end_command = '' # command to execute at completion of recovery
# - Recovery Target -
# Set these only when performing a targeted recovery.
#recovery_target = '' # 'immediate' to end recovery as soon as a
# consistent state is reached
# (change requires restart)
#recovery_target_name = '' # the named restore point to which recovery will proceed
# (change requires restart)
#recovery_target_time = '' # the time stamp up to which recovery will proceed
# (change requires restart)
#recovery_target_xid = '' # the transaction ID up to which recovery will proceed
# (change requires restart)
#recovery_target_lsn = '' # the WAL LSN up to which recovery will proceed
# (change requires restart)
#recovery_target_inclusive = on # Specifies whether to stop:
# just after the specified recovery target (on)
# just before the recovery target (off)
# (change requires restart)
#recovery_target_timeline = 'latest' # 'current', 'latest', or timeline ID
# (change requires restart)
#recovery_target_action = 'pause' # 'pause', 'promote', 'shutdown'
# (change requires restart)
# - WAL Summarization -
#summarize_wal = off # run WAL summarizer process?
#wal_summary_keep_time = '10d' # when to remove old summary files, 0 = never
#------------------------------------------------------------------------------
# REPLICATION
#------------------------------------------------------------------------------
# - Sending Servers -
# Set these on the primary and on any standby that will send replication data.
#max_wal_senders = 10 # max number of walsender processes
# (change requires restart)
#max_replication_slots = 10 # max number of replication slots
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
# - Primary Server -
# These settings are ignored on a standby server.
#synchronous_standby_names = '' # standby servers that provide sync rep
# method to choose sync standbys, number of sync standbys,
# and comma-separated list of application_name
# from standby(s); '*' = all
#synchronized_standby_slots = '' # streaming replication standby server slot
# names that logical walsender processes will wait for
# - Standby Servers -
# These settings are ignored on a primary server.
#primary_conninfo = '' # connection string to sending server
#primary_slot_name = '' # replication slot on sending server
#hot_standby = on # "off" disallows queries during recovery
# (change requires restart)
#max_standby_archive_delay = 30s # max delay before canceling queries
# when reading WAL from archive;
# -1 allows indefinite delay
#max_standby_streaming_delay = 30s # max delay before canceling queries
# when reading streaming WAL;
# -1 allows indefinite delay
#wal_receiver_create_temp_slot = off # create temp slot if primary_slot_name
# is not set
#wal_receiver_status_interval = 10s # send replies at least this often
# 0 disables
#hot_standby_feedback = off # send info from standby to prevent
# query conflicts
#wal_receiver_timeout = 60s # time that receiver waits for
# communication from primary
# in milliseconds; 0 disables
#wal_retrieve_retry_interval = 5s # time to wait before retrying to
# retrieve WAL after a failed attempt
#recovery_min_apply_delay = 0 # minimum delay for applying changes during recovery
#sync_replication_slots = off # enables slot synchronization on the physical standby from the primary
# - Subscribers -
# These settings are ignored on a publisher.
#max_active_replication_origins = 10 # max number of active replication origins
# (change requires restart)
#max_logical_replication_workers = 4 # taken from max_worker_processes
# (change requires restart)
#max_sync_workers_per_subscription = 2 # taken from max_logical_replication_workers
#max_parallel_apply_workers_per_subscription = 2 # taken from max_logical_replication_workers
#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------
# - Planner Method Configuration -
#enable_async_append = on
#enable_bitmapscan = on
#enable_gathermerge = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_incremental_sort = on
#enable_indexscan = on
#enable_indexonlyscan = on
#enable_material = on
#enable_memoize = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_parallel_append = on
#enable_parallel_hash = on
#enable_partition_pruning = on
#enable_partitionwise_join = off
#enable_partitionwise_aggregate = off
#enable_presorted_aggregate = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on
#enable_group_by_reordering = on
#enable_distinct_reordering = on
# - Planner Cost Constants -
#seq_page_cost = 1.0 # measured on an arbitrary scale
#random_page_cost = 4.0 # same scale as above
#cpu_tuple_cost = 0.01 # same scale as above
#cpu_index_tuple_cost = 0.005 # same scale as above
#cpu_operator_cost = 0.0025 # same scale as above
#parallel_setup_cost = 1000.0 # same scale as above
#parallel_tuple_cost = 0.1 # same scale as above
#min_parallel_table_scan_size = 8MB
#min_parallel_index_scan_size = 512kB
#effective_cache_size = 4GB
#jit_above_cost = 100000 # perform JIT compilation if available
# and query more expensive than this;
# -1 disables
#jit_inline_above_cost = 500000 # inline small functions if query is
# more expensive than this; -1 disables
#jit_optimize_above_cost = 500000 # use expensive JIT optimizations if
# query is more expensive than this;
# -1 disables
# - Genetic Query Optimizer -
#geqo = on
#geqo_threshold = 12
#geqo_effort = 5 # range 1-10
#geqo_pool_size = 0 # selects default based on effort
#geqo_generations = 0 # selects default based on effort
#geqo_selection_bias = 2.0 # range 1.5-2.0
#geqo_seed = 0.0 # range 0.0-1.0
# - Other Planner Options -
#default_statistics_target = 100 # range 1-10000
#constraint_exclusion = partition # on, off, or partition
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
#from_collapse_limit = 8
#jit = on # allow JIT compilation
#join_collapse_limit = 8 # 1 disables collapsing of explicit
# JOIN clauses
#plan_cache_mode = auto # auto, force_generic_plan or
# force_custom_plan
#recursive_worktable_factor = 10.0 # range 0.001-1000000
#------------------------------------------------------------------------------
# REPORTING AND LOGGING
#------------------------------------------------------------------------------
# - Where to Log -
#log_destination = 'stderr' # Valid values are combinations of
# stderr, csvlog, jsonlog, syslog, and
# eventlog, depending on platform.
# csvlog and jsonlog require
# logging_collector to be on.
# This is used when logging to stderr:
#logging_collector = off # Enable capturing of stderr, jsonlog,
# and csvlog into log files. Required
# to be on for csvlogs and jsonlogs.
# (change requires restart)
# These are only used if logging_collector is on:
#log_directory = 'log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_file_mode = 0600 # creation mode for log files,
# begin with 0 to use octal notation
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 disables.
#log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 disables.
#log_truncate_on_rotation = off # If on, an existing log file with the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
#syslog_sequence_numbers = on
#syslog_split_messages = on
# This is only relevant when logging to eventlog (Windows):
# (change requires restart)
#event_source = 'PostgreSQL'
# - When to Log -
#log_min_messages = warning # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic
#log_min_error_statement = error # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic (effectively off)
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this number
# of milliseconds
#log_min_duration_sample = -1 # -1 is disabled, 0 logs a sample of statements
# and their durations, > 0 logs only a sample of
# statements running at least this number
# of milliseconds;
# sample fraction is determined by log_statement_sample_rate
#log_statement_sample_rate = 1.0 # fraction of logged statements exceeding
# log_min_duration_sample to be logged;
# 1.0 logs all such statements, 0.0 never logs
#log_transaction_sample_rate = 0.0 # fraction of transactions whose statements
# are logged regardless of their duration; 1.0 logs all
# statements from all transactions, 0.0 never logs
#log_startup_progress_interval = 10s # Time between progress updates for
# long-running startup operations.
# 0 disables the feature, > 0 indicates
# the interval in milliseconds.
# - What to Log -
#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_autovacuum_min_duration = 10min # log autovacuum activity;
# -1 disables, 0 logs all actions and
# their durations, > 0 logs only
# actions running at least this number
# of milliseconds.
#log_checkpoints = on
#log_connections = '' # log aspects of connection setup
# options include receipt, authentication, authorization,
# setup_durations, and all to log all of these aspects
#log_disconnections = off
#log_duration = off # log statement duration
#log_error_verbosity = default # terse, default, or verbose messages
#log_hostname = off
#log_line_prefix = '%m [%p] ' # special values:
# %a = application name
# %u = user name
# %d = database name
# %r = remote host and port
# %h = remote host
# %b = backend type
# %p = process ID
# %P = process ID of parallel group leader
# %t = timestamp without milliseconds
# %m = timestamp with milliseconds
# %n = timestamp with milliseconds (as a Unix epoch)
# %Q = query ID (0 if none or not computed)
# %i = command tag
# %e = SQL state
# %c = session ID
# %l = session line number
# %s = session start timestamp
# %v = virtual transaction ID
# %x = transaction ID (0 if none)
# %q = stop here in non-session
# processes
# %% = '%'
# e.g. '<%u%%%d> '
#log_lock_waits = off # log lock waits >= deadlock_timeout
#log_lock_failure = off # log lock failures
#log_recovery_conflict_waits = off # log standby recovery conflict waits
# >= deadlock_timeout
#log_parameter_max_length = -1 # when logging statements, limit logged
# bind-parameter values to N bytes;
# -1 means print in full, 0 disables
#log_parameter_max_length_on_error = 0 # when logging an error, limit logged
# bind-parameter values to N bytes;
# -1 means print in full, 0 disables
#log_statement = 'none' # none, ddl, mod, all
#log_replication_commands = off
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
# -1 disables, 0 logs all temp files
log_timezone = 'Asia/Tokyo'
# - Process Title -
#cluster_name = '' # added to process titles if nonempty
# (change requires restart)
#update_process_title = on
#------------------------------------------------------------------------------
# STATISTICS
#------------------------------------------------------------------------------
# - Cumulative Query and Index Statistics -
#track_activities = on
#track_activity_query_size = 1024 # (change requires restart)
#track_counts = on
#track_cost_delay_timing = off
#track_io_timing = off
#track_wal_io_timing = off
#track_functions = none # none, pl, all
#stats_fetch_consistency = cache # cache, none, snapshot
# - Monitoring -
#compute_query_id = auto
#log_statement_stats = off
#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#------------------------------------------------------------------------------
# VACUUMING
#------------------------------------------------------------------------------
# - Automatic Vacuuming -
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
#autovacuum_vacuum_insert_threshold = 1000 # min number of row inserts
# before vacuum; -1 disables insert
# vacuums
#autovacuum_analyze_threshold = 50 # min number of row updates before
# analyze
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
#autovacuum_vacuum_insert_scale_factor = 0.2 # fraction of unfrozen pages
# before insert vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
#autovacuum_vacuum_max_threshold = 100000000 # max number of row updates
# before vacuum; -1 disables max
# threshold
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
# (change requires restart)
#autovacuum_multixact_freeze_max_age = 400000000 # maximum multixact age
# before forced vacuum
# (change requires restart)
#autovacuum_vacuum_cost_delay = 2ms # default vacuum cost delay for
# autovacuum, in milliseconds;
# -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
# autovacuum, -1 means use
# vacuum_cost_limit
# - Cost-Based Vacuum Delay -
#vacuum_cost_delay = 0 # 0-100 milliseconds (0 disables)
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 2 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 1-10000 credits
# - Default Behavior -
#vacuum_truncate = on # enable truncation after vacuum
# - Freezing -
#vacuum_freeze_table_age = 150000000
#vacuum_freeze_min_age = 50000000
#vacuum_failsafe_age = 1600000000
#vacuum_multixact_freeze_table_age = 150000000
#vacuum_multixact_freeze_min_age = 5000000
#vacuum_multixact_failsafe_age = 1600000000
#vacuum_max_eager_freeze_failure_rate = 0.03 # 0 disables eager scanning
#------------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#------------------------------------------------------------------------------
# - Statement Behavior -
#client_min_messages = notice # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# log
# notice
# warning
# error
#search_path = '"$user", public' # schema names
#row_security = on
#default_table_access_method = 'heap'
#default_tablespace = '' # a tablespace name, '' uses the default
#default_toast_compression = 'pglz' # 'pglz' or 'lz4'
#temp_tablespaces = '' # a list of tablespace names, '' uses
# only default tablespace
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#default_transaction_deferrable = off
#session_replication_role = 'origin'
#statement_timeout = 0 # in milliseconds, 0 is disabled
#transaction_timeout = 0 # in milliseconds, 0 is disabled
#lock_timeout = 0 # in milliseconds, 0 is disabled
#idle_in_transaction_session_timeout = 0 # in milliseconds, 0 is disabled
#idle_session_timeout = 0 # in milliseconds, 0 is disabled
#bytea_output = 'hex' # hex, escape
#xmlbinary = 'base64'
#xmloption = 'content'
#gin_pending_list_limit = 4MB
#createrole_self_grant = '' # set and/or inherit
#event_triggers = on
# - Locale and Formatting -
datestyle = 'iso, mdy'
#intervalstyle = 'postgres'
timezone = 'Asia/Tokyo'
#timezone_abbreviations = 'Default' # Select the set of available time zone
# abbreviations. Currently, there are
# Default
# Australia (historical usage)
# India
# You can create your own file in
# share/timezonesets/.
#extra_float_digits = 1 # min -15, max 3; any value >0 actually
# selects precise output mode
#client_encoding = sql_ascii # actually, defaults to database
# encoding
# These settings are initialized by initdb, but they can be changed.
lc_messages = 'C.UTF-8' # locale for system error message
# strings
lc_monetary = 'C.UTF-8' # locale for monetary formatting
lc_numeric = 'C.UTF-8' # locale for number formatting
lc_time = 'C.UTF-8' # locale for time formatting
#icu_validation_level = warning # report ICU locale validation
# errors at the given level
# default configuration for text search
default_text_search_config = 'pg_catalog.english'
# - Shared Library Preloading -
#local_preload_libraries = ''
#session_preload_libraries = ''
#shared_preload_libraries = '' # (change requires restart)
#jit_provider = 'llvmjit' # JIT library to use
# - Other Defaults -
#dynamic_library_path = '$libdir'
#extension_control_path = '$system'
#gin_fuzzy_search_limit = 0
#------------------------------------------------------------------------------
# LOCK MANAGEMENT
#------------------------------------------------------------------------------
#deadlock_timeout = 1s
#max_locks_per_transaction = 64 # min 10
# (change requires restart)
#max_pred_locks_per_transaction = 64 # min 10
# (change requires restart)
#max_pred_locks_per_relation = -2 # negative values mean
# (max_pred_locks_per_transaction
# / -max_pred_locks_per_relation) - 1
#max_pred_locks_per_page = 2 # min 0
#------------------------------------------------------------------------------
# VERSION AND PLATFORM COMPATIBILITY
#------------------------------------------------------------------------------
# - Previous PostgreSQL Versions -
#array_nulls = on
#backslash_quote = safe_encoding # on, off, or safe_encoding
#escape_string_warning = on
#lo_compat_privileges = off
#quote_all_identifiers = off
#standard_conforming_strings = on
#synchronize_seqscans = on
# - Other Platforms and Clients -
#transform_null_equals = off
#allow_alter_system = on
#------------------------------------------------------------------------------
# ERROR HANDLING
#------------------------------------------------------------------------------
#exit_on_error = off # terminate session on any error?
#restart_after_crash = on # reinitialize after backend crash?
#data_sync_retry = off # retry or panic on failure to fsync
# data?
# (change requires restart)
#recovery_init_sync_method = fsync # fsync, syncfs (Linux 5.8+)
#------------------------------------------------------------------------------
# CONFIG FILE INCLUDES
#------------------------------------------------------------------------------
# These options allow settings to be loaded from files other than the
# default postgresql.conf. Note that these are directives, not variable
# assignments, so they can usefully be given more than once.
#include_dir = '...' # include files ending in '.conf' from
# a directory, e.g., 'conf.d'
#include_if_exists = '...' # include file only if it exists
#include = '...' # include file
#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------
# Add settings for extensions here
@ryogrid
Copy link
Author

ryogrid commented Apr 5, 2025

今回の測定結果(のみ)について整理すると、

  • io_uringによる非同期I/Oの効用があるケースの存在は確認できた
    • 要注意点
      • WSL2は仮想化環境であり、ベアメタル環境と比べるとその多寡はどうあれシステムコールにに追加のオーバヘッドが存在する
      • 上述のオーバヘッドが無視できない程度に大きく、結果としてシステムコール呼び出しが減るio_uringを利用した非同期I/Oの効用が大きくなった可能性はゼロではない
  • ほぼ同様の条件でも、チェックポインティングが頻繁に起きるようであるとio_uringの効用は見られないようであった
    • チェックポインティングもストレージI/Oバウンドな処理のはずなので、多発してしまったらしてしまったで効用が見られそうに思うが、そうはなっていない
    • ここについては、(私の現状のpostgresやio_uringに関する知識では)パッと思いつくうまい説明が思いつかない
      • (postgresのログを追うといった調査はしていない)
  • 効用が見られたケースでのストレージI/O、メモリ使用量、CPU使用率をグラフィカルなツールで眺めていての情報および簡単な考察(推測含)
    • ベンチマーク内でアクセス、また作成するデータは、確実ではないもののメモリ上に載っていたようであった
    • CPU使用率はしばらく100%近くに張り付いて、その期間の半分強程度の間、使用率が0%に近いあたりに落ちていき、また100%近くに戻っていく"谷" が見られた。また、その2つの期間の間の遷移は測定中継続的に見られた
      • チェックポインティングは止めていた(つもり)なので、"谷" ができたのはVACUUMが要因ではないかと推測しているが、それを裏付ける情報はない
        • 張り付いている期間と谷の期間でストレージI/Oに変化があるかまでは照らし合わせて見ていなかった・・・
    • ストレージI/Oはほとんどがwriteのようであった
      • WALの書き出しが中心であったと推測しているが裏付ける情報は特にない

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment