Skip to content

Instantly share code, notes, and snippets.

@ca4ti
Forked from LeonStoldt/conclusion.md
Created September 23, 2024 19:24
Show Gist options
  • Save ca4ti/ec300e6adc9298f380ca2d1bd7c099da to your computer and use it in GitHub Desktop.
Save ca4ti/ec300e6adc9298f380ca2d1bd7c099da to your computer and use it in GitHub Desktop.
Postgres Optimization and Performance Tuning

Postgres Optimization and Performance Tuning

based on the following article: An Introduction to PostgreSQL Performance Tuning and Optimization

Resource Usage

  • huge pages: Enabling on Linux will give a performance boost to PostgreSQL (VMPeak / HugePageSize = 4310)
  • max_connections: optimal number is roughly 4 times the number of CPU cores. The recommended number is the GREATEST(4 x CPU cores, 100). Beyond this number, a connection pooler such as pgbouncer should be used. avoid setting max_connections too high as it will increase the size of various data structures in Postgres which can result in CPU cycles being wasted
  • shared_buffers: variable by workload - starting Point: LEAST(RAM/2, 10GB).
  • work_mem: ((Total RAM - shared_buffers)/(16 x CPU cores))
  • maintenance_work_mem: 1GB good start
  • effective_io_concurrency: SSD = 200, HDD = Number of Disks

Write Ahead Log

  • wal_compression: compresses a full-page image written to WAL -> Set this parameter to 'on' as most database servers are likely to be bottlenecked on I/O rather than CPU
  • **wal_log_hints **required for pg_rewind -> set it "on"
  • wal_buffers: 16MB each WAL segment by default, Larger buffer sizes have been observed to have a potentially very positive effect on performance -> 64MB recommended
  • checkpoint_timeout: Longer timeouts reduce overall WAL volume but make crash recovery take longer. Minimum: 15 min -> RPO of business requirements dictates what this should be
  • checkpoint_completion_target: 0.9 recommended value
  • max_wal_size: recommended is half to two-thirds of the available disk space where the WAL is located
  • archive_mode: should be 'on'
  • archive_command: a valid archive_command is required if archive_mode is on

Query Tuning:

  • random_page_cost: SSD -> 1.1; HDD -> default should be fine
  • effective_cache_size: value = shared_buffers + Linux buffer cache (refers to caches in main memory, not CPU cache)
  • cpu_tuple_cost: default = 0.01; should be increased to 0.03

Reporting and Logging:

  • logging_collector: if log_destination includes stderr or csvlog -> on
  • log_directory: If logging_collector is on -> location outside the data directory
  • log_checkpoints: should be on
  • log_line_prefix: Suggested value: '%m [%p-%l] %u@%d app=%a '
  • log_lock_waits: should be on
  • log_statement: set to 'ddl'
  • log_temp_files: set to 0
  • timed_statistics (EPAS): When set to on, timing data is collected. Set this parameter to on.

Autovacuum:

  • log_autovacuum_min_duration: suggested: 0 - tune by Monitoring autovacuum activity
  • autovacuum_max_workers: default = 3; suggested = 5
  • autovacuum_vacuum_cost_limit: recommended = 3000

Client Connection Defaults:

  • idle_in_transaction_session_timeout: timer will terminate sessions that remain idle in a transaction for too long (if application can handle it, suggested: 10 minutes)
  • lc_messages: Set this to 'C' to avoid translation
  • shared_preload_libraries: Adding pg_stat_statements is low overhead and high value (recommended but optional)  

Finding slow queries:

Take a look at...

  • the log_min_duration_statement parameter (indicates how long a query must run before it is sent to the log file); and
  • the pg_stat_statements module and extension (module does is record every single (completed) query that the server executes)
  • your queries by using EXPLAIN to analyze them and rewrite slow queries to be more efficient
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment