Percona PostgreSQL Cluster Operations: A Public Guide to Management, Minor Upgrades, Performance Tuning, and Troubleshooting
- 2. ETCD Cluster Node Information
- 4. Upgrade Process Steps (Rolling Upgrade Node by Node)
  - 4.1. Remove Previous Version of Percona pg_tde (If Applicable)
  - 4.2. Update Percona Release Package
  - 4.3. Install Required Dependencies and Enable Repositories
  - 4.4. Configure Percona Distribution for PostgreSQL 17 Repository
  - 4.5. Stop Patroni Service on the Current Node
  - 4.6. Install/Upgrade Percona Distribution for PostgreSQL 17 Packages
  - 4.7. Start Patroni Service on the Current Node
  - 4.9. Repeat Steps for Other Replicas
  - 4.10. Perform Patroni Switchover (Before Upgrading the Original Primary)
  - 4.11. Upgrade the Original Primary Node (Now a Replica)
  - 4.12. Configure pg_tde in PostgreSQL Configuration (If Not Already Done)
  - 4.13. Perform Patroni Cluster-Wide Restart to Apply Configuration Changes
  - 4.14. Optional: Perform a Patroni Switchover for Verification
- 5. Performance Tuning Considerations
- 6. Post-Upgrade Verification
- 7. Handling Potential Issues
- 8. Understanding patronictl list Output
LinuxMalaysia / Harisfazillah Blog https://linuxmalaysia.harisfazillah.info/2025/07/percona-postgresql-cluster-operations.html
This comprehensive guide details the procedures for managing and performing a minor version upgrade of a Percona Distribution for PostgreSQL cluster, which is orchestrated by Patroni and relies on etcd for distributed consensus. This document also includes an explanation of key diagnostic outputs for cluster monitoring. All sensitive information, such as specific IP addresses, hostnames, usernames, and cluster names, has been anonymised to protect privacy for public reference.
Important Note: This document outlines steps specifically for a minor version upgrade within the PostgreSQL 17 series (e.g., from 17.2 to 17.4). Major version upgrades (e.g., from PostgreSQL 16 to 17) typically involve different procedures, often utilising pg_upgrade or logical replication, which are not covered here.
The following nodes comprise the Percona Distribution for PostgreSQL cluster managed by Patroni:
Node Name | IP Address | Role |
---|---|---|
pg-node-1 | 192.168.1.101 | Leader |
pg-node-2 | 192.168.1.102 | Replica |
pg-node-3 | 192.168.1.103 | Replica |
Note: The roles (Leader/Replica) are dynamic and managed by Patroni based on cluster state and health.
The Patroni instances in this PostgreSQL cluster rely on a 5-node etcd cluster for distributed consensus, leader election, and state storage. The etcd cluster nodes are:
IP Address | Hostname | Short Name |
---|---|---|
192.168.1.201 | etcd-node-1 | etcd-node-1 |
192.168.1.101 | pg-node-1 | pg-node-1 |
192.168.1.102 | pg-node-2 | pg-node-2 |
192.168.1.103 | pg-node-3 | pg-node-3 |
192.168.1.202 | etcd-node-2 | etcd-node-2 |
Note: Some nodes may serve multiple roles (e.g., hosting both PostgreSQL and etcd members) for resource efficiency or specific architectural designs.
Before commencing any upgrade or significant maintenance, ensure the following critical prerequisites are met:
- Validated Backup: A recent, successful, and validated backup of the entire database cluster is paramount. Verify that the backup can be restored accurately to a test environment. This provides an essential rollback point in case of unforeseen issues during the upgrade process.
- Cluster Health: Confirm the Patroni cluster is operating optimally. All nodes (primary and replicas) should be running, healthy, and replicating correctly. Use patronictl list to check the PostgreSQL cluster state and verify the health of the etcd cluster separately (a combined health-check sketch follows this list).
- System Resources: Ensure adequate disk space is available on all nodes for the package upgrade, any temporary files, and future operational growth.
- Network Connectivity: Verify robust network connectivity between all cluster nodes (PostgreSQL and etcd members) and any external systems (e.g., application servers, monitoring systems, backup targets) that interact with the cluster. Pay close attention to all required ports.
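Before proceeding, these checks can be scripted. The sketch below is a minimal example, assuming the Patroni configuration path used throughout this guide, the anonymised etcd member addresses from the tables above (client port 2379), and a data directory under /var/lib/pgsql; adjust every value to your environment.

```bash
# 1. PostgreSQL/Patroni cluster state: all members should be running/streaming.
patronictl -c /etc/patroni/postgresql.yml list

# 2. etcd cluster health: every endpoint should report "is healthy".
#    (Endpoints below are the anonymised example addresses from the tables above.)
ETCDCTL_API=3 etcdctl \
  --endpoints=192.168.1.201:2379,192.168.1.101:2379,192.168.1.102:2379,192.168.1.103:2379,192.168.1.202:2379 \
  endpoint health

# 3. Free disk space on the PostgreSQL data volume (path is an assumption).
df -h /var/lib/pgsql
```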
Perform the following steps on each node in the cluster, adhering to a rolling upgrade approach. It is crucial to start with replica nodes and only upgrade the primary after a controlled switchover, to minimise service disruption.
dnf remove percona-pg_tde_16 |
---|
- Explanation: If the Percona Transparent Data Encryption (TDE) plugin was installed for a previous major PostgreSQL version (e.g., PostgreSQL 16), its version-specific package (percona-pg_tde_16) must be removed prior to upgrading the core PostgreSQL server binaries to version 17. This action prevents potential compatibility conflicts and ensures a clean installation for the new PostgreSQL 17 libraries. A compatible version of pg_tde for PostgreSQL 17 will be installed as part of the subsequent package upgrade. Refer to the Percona TDE installation documentation for further details. https://docs.percona.com/pg-tde/yum.html\#install-percona-release
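If you are unsure whether the old package is present, a quick check before removal avoids errors on nodes that never had pg_tde installed. This is a small sketch using the package name from this guide; verify the exact name on your system.

```bash
# Check whether any pg_tde package is installed on this node.
rpm -qa | grep -i pg_tde || echo "no pg_tde package installed"

# Remove the PostgreSQL 16 build only if it is actually present.
if rpm -q percona-pg_tde_16 >/dev/null 2>&1; then
    dnf remove -y percona-pg_tde_16
fi
```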
dnf update percona-release |
---|
- Explanation: The percona-release package manages the configuration of Percona's software repositories on your system. Updating this package ensures that your system's package manager (dnf) has access to the latest repository definitions and GPG keys required to fetch packages for Percona Distribution for PostgreSQL 17. This is a necessary prerequisite to make the new version's packages available for installation or upgrade. This step aligns with the minor upgrade procedure outlined by Percona.
```
dnf install dnf-plugins-core -y
dnf config-manager --set-enabled crb
dnf install perl-IPC-Run -y
```
- Explanation:
- dnf install dnf-plugins-core -y: Installs essential plugins for DNF, including the config-manager plugin used in the subsequent step. The -y flag confirms the installation automatically.
- dnf config-manager --set-enabled crb: Enables the CodeReady Linux Builder (CRB) repository. This repository provides supplementary packages, including developer tools and libraries, which may be dependencies for Percona software components like Patroni or certain PostgreSQL extensions not found in the standard distribution repositories.
- dnf install perl-IPC-Run -y: Installs the perl-IPC-Run Perl module. This module is often a dependency for various system administration utilities or scripts, potentially including those used by Patroni or other components of the Percona stack for managing inter-process communication.
dnf install epel-release -y |
---|
- Explanation: Installs the repository configuration for Extra Packages for Enterprise Linux (EPEL). EPEL is a widely-used repository providing additional software packages for Enterprise Linux distributions. Patroni and some of its dependencies are frequently sourced from or rely on libraries found in EPEL. The -y flag confirms the installation automatically.
```
dnf install dnf-plugins-core -y
dnf config-manager --set-enabled crb
```
- Explanation: These commands re-confirm the installation of DNF plugins and the enablement of the CRB repository. While potentially redundant if the previous steps were successful, they ensure these prerequisites are firmly in place before configuring the specific Percona repository.
dnf -y install curl |
---|
- Explanation: Installs the curl utility. curl is a versatile command-line tool used for transferring data via URLs. It is commonly used for downloading files, interacting with REST APIs (such as those provided by Patroni or etcd), and verifying network connectivity, making it a valuable tool for system administration and troubleshooting in a distributed environment. These dependency installations are standard steps often required before installing Percona Distribution for PostgreSQL 17. https://docs.percona.com/postgresql/17/installing.html
percona-release setup ppg17 |
---|
- Explanation: This command, provided by the percona-release package, configures your system's package manager to use the official Percona repositories specifically for Percona Distribution for PostgreSQL 17. It automatically disables any older Percona PostgreSQL repository configurations to ensure that package operations target the desired version. This is a key step in the Percona PostgreSQL 17 installation process. https://docs.percona.com/postgresql/17/installing.html
- Output:

```
[root@pg-node-3 ~]# percona-release setup ppg17
* Disabling all Percona Repositories
* Enabling the Percona Distribution for PostgreSQL 17 repository
<*> All done!
```

- This output confirms that the repository setup was successful. All previous Percona repositories were disabled, and the repository for Percona Distribution for PostgreSQL 17 (ppg17) was enabled.
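Optionally, confirm which Percona repositories are now enabled. The exact repository IDs printed by dnf can vary between percona-release versions, so the grep pattern below is only an assumption.

```bash
# List enabled repositories; the PostgreSQL 17 entries should be present
# and older Percona PostgreSQL repositories absent.
dnf repolist enabled | grep -i ppg
```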
systemctl stop patroni |
---|
- Explanation: Patroni is the high-availability agent responsible for managing the PostgreSQL instances. To safely perform package upgrades on a node, the Patroni service must be stopped on that specific node before modifying the PostgreSQL binaries or related software. This action prevents Patroni from attempting to manage or restart the database instance while it is in an inconsistent state during the upgrade process.
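A quick sanity check after stopping the service can confirm that the node is really down and that the remaining members still have a leader; the sketch below assumes you can run patronictl from another node with the same configuration path.

```bash
# On the node being upgraded: Patroni should now be inactive.
systemctl is-active patroni || echo "patroni is stopped on $(hostname)"

# From any other cluster node: the remaining members should still show a Leader.
patronictl -c /etc/patroni/postgresql.yml list
```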
```
dnf install percona-ppg-server17
dnf upgrade
```
- Explanation:
- dnf install percona-ppg-server17: This command ensures the meta-package for Percona Distribution for PostgreSQL 17 is installed or updated, resolving dependencies against the new repository.
- dnf upgrade: This command upgrades all installed packages on the system for which newer versions are available, including the core PostgreSQL 17 server binaries, client libraries, contrib modules, Patroni, etcd, pgBackRest, and Percona-specific extensions. This brings the software on the current node to version 17.4.
- Output (Combined Example):
```
[root@pg-node-3 ~]# dnf install percona-ppg-server17
... (output showing upgrade of percona-ppg-server17 17.2 -> 17.4) ...
Complete!
[root@pg-node-3 ~]# dnf upgrade
... (output showing upgrade of various packages including postgresql17-server, patroni, etcd etc. to their 17.4 compatible versions) ...
Upgraded:
  elrepo-release-9.5-2.el9.elrepo.noarch        etcd-3.5.18-1.el9.x86_64                         percona-patroni-1:4.0.5-1.el9.x86_64
  percona-pg-telemetry17-1:1.1.0-1.el9.x86_64   percona-pg_repack17-1:1.5.2-1.el9.x86_64         percona-pg_stat_monitor17-1:2.1.1-1.el9.x86_64
  percona-pgaudit17-1:17.1-1.el9.x86_64         percona-pgbackrest-1:2.54.2-1.el9.x86_64         percona-postgresql-client-common-1:267-1.el9.noarch
  percona-postgresql-common-1:267-1.el9.noarch  percona-postgresql17-1:17.4-1.el9.x86_64         percona-postgresql17-contrib-1:17.4-1.el9.x86_64
  percona-postgresql17-libs-1:17.4-1.el9.x86_64 percona-postgresql17-server-1:17.4-1.el9.x86_64
Complete!
```
- This output confirms that the core PostgreSQL 17 components and related software like Patroni and etcd were successfully upgraded to their respective latest versions compatible with Percona Distribution for PostgreSQL 17.4.
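To double-check the result on this node before restarting Patroni, the package versions and the server binary can be queried directly. The /usr/pgsql-17 path is the usual install location for these packages but may differ on your system.

```bash
# Confirm the key packages on this node are at the expected build.
rpm -q percona-postgresql17-server percona-patroni

# The server binary reports its version as well (path is an assumption).
/usr/pgsql-17/bin/postgres --version
```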
systemctl start patroni |
---|
- Explanation: After successfully upgrading the packages on the current node, restart the Patroni service. Patroni will initiate the start of the PostgreSQL instance on this node, and it should automatically attempt to rejoin the cluster, typically as a streaming replica. Starting Patroni immediately after the package upgrade on a node is crucial for maintaining the cluster's high availability during the rolling upgrade process.
patronictl -c /etc/patroni/postgresql.yml list |
---|
- Explanation: Use patronictl list to verify that the node you just upgraded and restarted Patroni on has successfully rejoined the cluster and is in a healthy state (e.g., streaming for a replica). Ensure replication lag is minimal before proceeding to the next node.
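Patroni's REST API offers another view of the same information. The sketch below assumes the default REST API port 8008 and that python3 is available for pretty-printing; replace the address with the node you just upgraded.

```bash
# Query the upgraded node's Patroni REST API for its role, state and timeline.
curl -s http://192.168.1.103:8008/patroni | python3 -m json.tool
```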
Repeat steps 4.5 through 4.8 on each remaining replica node in the cluster, one node at a time.
- Explanation: Continue the process of stopping Patroni, upgrading packages, starting Patroni, and verifying health for all other replica nodes in the cluster. Always ensure a node is healthy and streaming before moving to the next replica.
### Perform a switchover to make an upgraded replica the new primary
patronictl -c /etc/patroni/postgresql.yml switchover --candidate <name_of_an_upgraded_replica> my-pg-cluster |
---|
- Explanation: Once all replica nodes have been successfully upgraded and are healthy, perform a controlled switchover using patronictl. Choose one of the already upgraded replicas to become the new primary. This step is vital because it allows you to upgrade the original primary node with minimal downtime for write operations, as the cluster will continue serving traffic via the new primary during the original primary's upgrade window. Replace <name_of_an_upgraded_replica> with the actual name of one of the upgraded replica nodes (e.g., pg-node-2 or pg-node-3).
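If you prefer to script a safety gate before the switchover, something like the following can help. It assumes jq is installed, that your patronictl version supports JSON output (-f json), and that the JSON keys mirror the table column names (Role, State, Lag in MB); the candidate name is a placeholder. Verify these assumptions against your Patroni version before relying on it.

```bash
# Proceed with the switchover only if every replica is streaming with zero lag.
patronictl -c /etc/patroni/postgresql.yml list -f json \
  | jq -e 'all(.[] | select(.Role == "Replica");
               .State == "streaming" and ((."Lag in MB" // 0) <= 0))' \
  && patronictl -c /etc/patroni/postgresql.yml switchover \
       --candidate pg-node-2 my-pg-cluster   # pg-node-2 is a placeholder candidate
```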
```
# Identify the original primary node (which is now a replica after the switchover)
systemctl stop patroni    # Stop Patroni on the original primary node
# Repeat step 4.6 (dnf install / dnf upgrade) on this node
# Repeat step 4.7 (systemctl start patroni) on this node
# Repeat step 4.8 (patronictl list) to verify it rejoins as a healthy replica
```
- Explanation: Now that the original primary node is serving as a replica, repeat the package upgrade process (steps 4.6 and 4.7) on it. Stop Patroni, perform the dnf install/upgrade, and then start Patroni. Verify using patronictl list that this node successfully rejoins the cluster as a healthy replica.
```bash
# Connect to the CURRENT primary node (the one promoted in step 4.10)
sudo -u postgres /bin/bash   # Switch to the postgres user
psql                         # Connect to PostgreSQL
```
```sql
ALTER SYSTEM SET shared_preload_libraries = 'pg_tde';
```
```bash
exit   # Exit psql
exit   # Exit the postgres user shell
```
- Explanation: This step uses ALTER SYSTEM to add pg_tde to shared_preload_libraries so that the extension is loaded at server start. ALTER SYSTEM writes the setting to postgresql.auto.conf, which PostgreSQL reads in addition to postgresql.conf. It must be executed while connected to the current primary node of the cluster (the one promoted in step 4.10). The change requires a full PostgreSQL server restart to take effect, which is performed in the next step.
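To confirm the change is registered and waiting for a restart, pg_settings exposes a pending_restart flag. A minimal check, run as the postgres user on the primary:

```bash
sudo -u postgres psql -c \
  "SELECT name, setting, pending_restart
     FROM pg_settings
    WHERE name = 'shared_preload_libraries';"
# If pending_restart still shows "f", a configuration reload may be needed first:
#   sudo -u postgres psql -c "SELECT pg_reload_conf();"
```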
patronictl -c /etc/patroni/postgresql.yml restart my-pg-cluster |
---|
- Explanation: This command uses patronictl to initiate a controlled, rolling restart of the entire cluster named my-pg-cluster. This restart is necessary specifically to apply any configuration changes (like shared_preload_libraries) that only take effect when the PostgreSQL server process starts. Patroni will handle the restart sequence, typically restarting replicas first and then the primary, ensuring minimal impact on availability.
- Output:
```
[root@pg-node-1 ~]# patronictl -c /etc/patroni/postgresql.yml list
+ Cluster: my-pg-cluster (7465686895147752418) -------+-----------+-----+-----------+
| Member    | Host          | Role    | State     | TL  | Lag in MB |
+-----------+---------------+---------+-----------+-----+-----------+
| pg-node-1 | 192.168.1.101 | Leader  | running   | 236 |           |
| pg-node-2 | 192.168.1.102 | Replica | streaming | 236 |         0 |
| pg-node-3 | 192.168.1.103 | Replica | streaming | 236 |         0 |
+-----------+---------------+---------+-----------+-----+-----------+
When should the restart take place (e.g. 2025-04-21T05:34)  [now]:
Are you sure you want to restart members pg-node-1, pg-node-2, pg-node-3? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member pg-node-1
Success: restart on member pg-node-2
Success: restart on member pg-node-3
```
- The output shows the cluster state before the restart request, prompts for confirmation, and reports successful initiation of the rolling restart on each member.
```
[root@pg-node-1 ~]# patronictl -c /etc/patroni/postgresql.yml list
+ Cluster: my-pg-cluster (7465686895147752418) -------+-----------+-----+-----------+
| Member    | Host          | Role    | State     | TL  | Lag in MB |
+-----------+---------------+---------+-----------+-----+-----------+
| pg-node-1 | 192.168.1.101 | Leader  | running   | 236 |           |
| pg-node-2 | 192.168.1.102 | Replica | streaming | 236 |         0 |
| pg-node-3 | 192.168.1.103 | Replica | streaming | 236 |         0 |
+-----------+---------------+---------+-----------+-----+-----------+
```
- Explanation: Checking the cluster status again with patronictl list after the rolling restart confirms that all cluster members have successfully come back online and are in a healthy state (running for the primary, streaming for replicas) with minimal lag.
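As a final check after the rolling restart, nothing should remain flagged as pending. For example:

```bash
# Any row returned here is a parameter still waiting for a restart; expect none.
sudo -u postgres psql -c "SELECT name, setting FROM pg_settings WHERE pending_restart;"

# patronictl list should likewise no longer flag 'Pending restart' on any member.
patronictl -c /etc/patroni/postgresql.yml list
```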
### In case the primary/leader has moved to another node
patronictl -c /etc/patroni/postgresql.yml switchover --leader pg-node-3 --candidate pg-node-1 my-pg-cluster |
---|
- Explanation: This command demonstrates initiating a manual switchover to test the failover mechanism post-upgrade or to return the primary role to a preferred node. It requests Patroni to transfer leadership from the current primary (pg-node-3 in this example, assuming a previous switchover or failover made it primary) to pg-node-1.
patronictl -c /etc/patroni/postgresql.yml list |
---|
- Explanation: Running patronictl list after the switchover verifies that the primary role has successfully transitioned to the target node.
PostgreSQL performance tuning is a critical aspect of database administration, ensuring the system efficiently handles its workload. Many parameters, especially those that allocate memory or manage background processes, require a PostgreSQL instance restart to take effect. The "Pending restart reason" in patronictl list often highlights changes to such parameters.
Here's an explanation of common performance-related parameters and their tuning considerations (a sketch of applying such changes cluster-wide with patronictl follows the list):
- autovacuum_max_workers:
- Purpose: Controls the maximum number of autovacuum launcher processes that can run simultaneously. Autovacuum is essential for reclaiming space from updated or deleted rows and updating statistics for the query planner.
- Tuning: Increasing this value (e.g., from 3 to 6) allows more autovacuum processes to run concurrently, which can improve the speed at which dead tuples are cleaned up and statistics are updated, especially on busy systems with high write activity. This helps prevent table bloat and ensures the query planner has up-to-date information, leading to better query performance.
- Implications: More workers consume more CPU and I/O resources. Careful monitoring is needed to avoid resource contention.
- huge_pages:
- Purpose: Determines whether PostgreSQL attempts to use Linux "huge pages" for shared memory segments. Huge pages are larger memory pages (typically 2MB or 1GB) than standard 4KB pages.
- Tuning: Setting this to on or try (as seen in try->off suggesting it was turned off) can improve performance by reducing Translation Lookaside Buffer (TLB) misses, which are CPU cache misses related to memory address translation. This can lead to lower CPU overhead for memory management, especially with large shared_buffers settings.
- Implications: Requires proper operating system configuration (e.g., vm.nr_hugepages kernel parameter). If not configured correctly, PostgreSQL might fail to start or fall back to standard pages. Turning it off (as per the example) might be a troubleshooting step if huge pages were causing issues, or a decision to simplify memory management if the performance benefit wasn't realised or outweighed by complexity.
- max_connections:
- Purpose: Sets the maximum number of concurrent client connections that the PostgreSQL server will accept.
- Tuning: Increasing this value (e.g., from 100 to 250) allows more applications or users to connect simultaneously. This is often necessary as application load grows.
- Implications: Each connection consumes a certain amount of memory and CPU resources. Setting max_connections too high without sufficient RAM can lead to out-of-memory errors or severe performance degradation due to excessive swapping. It's crucial to balance this with available system resources and application connection pooling strategies.
- max_replication_slots:
- Purpose: Defines the maximum number of replication slots that can be defined on the primary server. Replication slots prevent the primary from removing WAL (Write-Ahead Log) segments until they have been consumed by all configured replicas or logical decoding clients.
- Tuning: Increasing this value (e.g., from 10 to 15) allows for more replicas or logical decoding consumers to be configured. This is essential for scaling out read replicas or integrating with external systems that rely on logical decoding.
- Implications: Each active replication slot retains WAL segments. If a replica or consumer associated with a slot falls behind or disconnects for an extended period, WAL files will accumulate on the primary, potentially filling up disk space. Careful monitoring of replication slot usage and lag is critical.
- max_worker_processes:
- Purpose: Sets the maximum number of background worker processes that PostgreSQL can launch. These are generic processes that can be used by various features, including parallel query execution, logical replication, and custom background tasks.
- Tuning: Increasing this value (e.g., from 8 to 24) allows PostgreSQL to utilise more parallel processing capabilities for queries that can benefit from it, or to run more background tasks concurrently.
- Implications: Each worker process consumes resources. Setting this too high can lead to resource contention. The actual number of parallel workers used by a query is also limited by max_parallel_workers_per_gather and other parameters.
- shared_buffers:
- Purpose: Determines the amount of memory PostgreSQL dedicates to shared memory buffers. This is where data pages are cached when read from disk.
- Tuning: Increasing this value (e.g., from 128MB to 6GB) is often the most impactful performance tuning parameter. A larger shared_buffers allows more data to be held in memory, reducing the need to read from slower disk storage, especially for frequently accessed data.
- Implications: This memory is shared by all PostgreSQL processes. Setting it too high can lead to memory exhaustion or excessive swapping if the system doesn't have enough physical RAM. A common recommendation is to set it to 25% of total system RAM, but this can vary based on workload and other applications running on the server.
- superuser_reserved_connections:
- Purpose: Specifies the number of connection slots reserved for PostgreSQL superusers. These slots are reserved even when max_connections has been reached, ensuring that administrators can always connect to troubleshoot or perform maintenance.
- Tuning: Increasing this value (e.g., from 3 to 10) provides more reserved slots for superusers. This is primarily a safety and manageability setting rather than a direct performance tuning parameter.
- Implications: These connections are part of max_connections. Increasing this too much reduces the number of available slots for regular application users.
- wal_buffers:
- Purpose: The amount of shared memory used for the Write-Ahead Log (WAL) buffers. WAL records are written to these buffers before being flushed to disk.
- Tuning: Increasing this value (e.g., from 4MB to 16MB) can reduce the number of WAL buffer flushes to disk, potentially improving write performance, especially for workloads with many small transactions.
- Implications: A larger wal_buffers consumes more shared memory. While generally beneficial for write-heavy workloads, excessively large values may not provide further benefits and can waste memory.
- shared_preload_libraries:
- Purpose: Specifies one or more shared libraries to be preloaded into the PostgreSQL server at startup. This is typically used for extensions that need to integrate deeply with PostgreSQL's internal workings, such as pg_tde, pg_stat_statements, or performance monitoring tools.
- Tuning: Preloading libraries avoids the overhead of loading them on demand and allows them to hook into PostgreSQL's startup process. For performance, extensions like pg_stat_statements (for query analysis) or custom optimisers are often preloaded.
- Implications: If a preloaded library has a bug or is incompatible, it can prevent PostgreSQL from starting. Each preloaded library consumes some memory.
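In a Patroni-managed cluster these parameters are normally changed once in the distributed (DCS) configuration rather than edited on each node, so that every member receives the same values. The sketch below shows the general approach with patronictl edit-config; the YAML fragment uses illustrative values only and follows Patroni's documented postgresql.parameters layout, which you should verify against your own configuration.

```bash
# Open the cluster-wide dynamic configuration in an editor.
patronictl -c /etc/patroni/postgresql.yml edit-config my-pg-cluster

# Inside the editor, parameters live under postgresql.parameters, e.g.
# (illustrative values only):
#
#   postgresql:
#     parameters:
#       shared_buffers: 6GB
#       max_connections: 250
#       autovacuum_max_workers: 6
#
# Patroni propagates the change to all members and flags a pending restart
# where one is required; apply it with a controlled rolling restart:
patronictl -c /etc/patroni/postgresql.yml restart my-pg-cluster
```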
General Tuning Principles:
- Workload-Specific: Optimal tuning values are highly dependent on your specific workload (read-heavy, write-heavy, OLTP, OLAP), hardware resources (CPU, RAM, storage I/O), and application requirements.
- Iterative Process: Performance tuning is an iterative process. Change one parameter at a time, monitor the impact, and then adjust further.
- Monitoring is Key: Always monitor key metrics (CPU utilisation, memory usage, disk I/O, query response times, replication lag, autovacuum activity) before and after making changes to assess their effectiveness. Tools like pg_stat_statements, pg_activity, and external monitoring solutions are invaluable (see the example query after this list).
- Documentation: Document all changes made, including the rationale and observed impact.
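As an example of the monitoring principle above, pg_stat_statements can show which statements consume the most execution time. This assumes pg_stat_statements is in shared_preload_libraries, that the extension has been created in the database, and PostgreSQL 13 or later column names:

```bash
sudo -u postgres psql -c \
  "SELECT calls,
          round(total_exec_time::numeric, 1) AS total_ms,
          round(mean_exec_time::numeric, 2)  AS mean_ms,
          left(query, 60)                    AS query
     FROM pg_stat_statements
    ORDER BY total_exec_time DESC
    LIMIT 5;"
```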
After completing the package upgrades and cluster restart on all nodes, perform the following comprehensive verification steps (a consolidated example follows the list):
- Cluster Status: Re-run patronictl list to confirm all nodes are healthy (running/streaming) and replication lag is minimal.
- PostgreSQL Version: Connect to the database using psql on each node and verify the version using SELECT version(); to ensure it shows Percona Distribution for PostgreSQL 17.4.
- Extension Loading: Connect to the database and verify that pg_tde and other expected extensions are loaded using SHOW shared_preload_libraries; and querying pg_available_extensions.
- Log Files: Review PostgreSQL and Patroni logs on all nodes for any errors or warnings that occurred during or after the upgrade and restarts.
- Application Connectivity: Test connectivity from your applications to the database cluster and verify basic read/write operations are successful.
- Replication Health: Perform more detailed checks on replication status using SELECT * FROM pg_stat_replication; on the primary node.
- Monitoring: Verify that your monitoring system is correctly collecting metrics from the upgraded cluster and that all nodes are reporting as healthy.
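A consolidated example of these checks, run from a node where psql and patronictl are available (paths and the expected 17.4 version are assumptions based on this guide):

```bash
# Cluster topology and replication lag
patronictl -c /etc/patroni/postgresql.yml list

# Server version on this node (expect a 17.4 build)
sudo -u postgres psql -Atc "SELECT version();"

# Preloaded libraries (pg_tde should be listed)
sudo -u postgres psql -Atc "SHOW shared_preload_libraries;"

# Replication status as seen from the primary
sudo -u postgres psql -c \
  "SELECT client_addr, state, sync_state, replay_lag FROM pg_stat_replication;"
```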
- Package Upgrade Failure: If dnf upgrade fails on a node, review the error messages carefully. Attempt to resolve dependency issues or conflicts. If necessary, roll back the package changes on that node using dnf history rollback <transaction_id>. Do not start Patroni on a node with a failed package upgrade.
- Patroni Restart Failure: If Patroni fails to start or the PostgreSQL instance doesn't become healthy after a restart (Step 4.7 or during the cluster restart in Step 4.13), check the Patroni logs (journalctl -u patroni) and PostgreSQL logs for error details (example commands follow this list). Common issues include configuration errors, port conflicts, or data directory problems.
- Replication Issues: If replication breaks after a node restart or the cluster restart, check firewall rules, PostgreSQL configuration (e.g., wal_level, max_wal_senders), and logs. Patroni may attempt to rebuild replicas automatically, but manual intervention might be required in some cases (e.g., using patronictl remove --force and allowing Patroni to recreate the replica from the primary).
- Rollback Plan: In case of severe issues that cannot be resolved, the primary rollback strategy is to restore the cluster from the validated backup taken before the upgrade.
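Example commands for the log and package checks referred to above; the PostgreSQL log path varies with your configuration and is only illustrative here:

```bash
# Recent Patroni log entries on the affected node
journalctl -u patroni -n 100 --no-pager

# Recent PostgreSQL server log (log location varies; this path is an example)
tail -n 100 /var/lib/pgsql/17/data/log/postgresql-*.log

# Package transactions, to find the ID for "dnf history rollback <transaction_id>"
dnf history list
```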
Here's an explanation of the patronictl list command output, which provides a summary of your Patroni-managed PostgreSQL cluster:
```
[admin_user@pg-node-a ~]$ sudo patronictl -c /etc/patroni/postgresql.yml list
[sudo] password for admin_user:
+ Cluster: my-pg-cluster (7445859937569462303) ------+----+-----------+-----------------+-----------------------------------------------+
| Member    | Host         | Role    | State     | TL | Lag in MB | Pending restart | Pending restart reason                        |
+-----------+--------------+---------+-----------+----+-----------+-----------------+-----------------------------------------------+
| pg-node-a | 192.168.1.10 | Leader  | running   | 6  |           | *               | autovacuum_max_workers: 3->6                  |
|           |              |         |           |    |           |                 | huge_pages: try->off                          |
|           |              |         |           |    |           |                 | max_connections: 100->250                     |
|           |              |         |           |    |           |                 | max_replication_slots: 10->15                 |
|           |              |         |           |    |           |                 | max_worker_processes: 8->24                   |
|           |              |         |           |    |           |                 | shared_buffers: 128MB->6GB                    |
|           |              |         |           |    |           |                 | shared_preload_libraries: [hidden - too long] |
|           |              |         |           |    |           |                 | superuser_reserved_connections: 3->10         |
|           |              |         |           |    |           |                 | wal_buffers: 4MB->16MB                        |
+-----------+--------------+---------+-----------+----+-----------+-----------------+-----------------------------------------------+
| pg-node-b | 192.168.1.11 | Replica | streaming | 6  | 0         | *               | autovacuum_max_workers: 3->6                  |
|           |              |         |           |    |           |                 | huge_pages: try->off                          |
|           |              |         |           |    |           |                 | max_connections: 100->250                     |
|           |              |         |           |    |           |                 | max_replication_slots: 10->15                 |
|           |              |         |           |    |           |                 | max_worker_processes: 8->24                   |
|           |              |         |           |    |           |                 | shared_buffers: 128MB->6GB                    |
|           |              |         |           |    |           |                 | shared_preload_libraries: [hidden - too long] |
|           |              |         |           |    |           |                 | superuser_reserved_connections: 3->10         |
|           |              |         |           |    |           |                 | wal_buffers: 4MB->16MB                        |
+-----------+--------------+---------+-----------+----+-----------+-----------------+-----------------------------------------------+
| pg-node-c | 192.168.1.12 | Replica | streaming | 6  | 0         | *               | autovacuum_max_workers: 3->6                  |
|           |              |         |           |    |           |                 | huge_pages: try->off                          |
|           |              |         |           |    |           |                 | max_connections: 100->250                     |
|           |              |         |           |    |           |                 | max_worker_processes: 8->24                   |
|           |              |         |           |    |           |                 | shared_buffers: 128MB->6GB                    |
|           |              |         |           |    |           |                 | shared_preload_libraries: [hidden - too long] |
|           |              |         |           |    |           |                 | superuser_reserved_connections: 3->10         |
|           |              |         |           |    |           |                 | wal_buffers: 4MB->16MB                        |
+-----------+--------------+---------+-----------+----+-----------+-----------------+-----------------------------------------------+
[admin_user@pg-node-a ~]$
```
Explanation of the Output:
- + Cluster: my-pg-cluster (7445859937569462303): This line identifies the name of your Patroni cluster (my-pg-cluster) and its unique internal ID.
- | Member | Host | Role | State | TL | Lag in MB |: These are the standard columns providing essential information about each node in the cluster:
- Member: The name Patroni uses for the specific PostgreSQL instance (e.g., pg-node-a).
- Host: The IP address of the node.
- Role: The current role of the PostgreSQL instance within the cluster.
- Leader: This node is currently the primary PostgreSQL instance, handling all write operations and serving as the source for replication.
- Replica: These nodes are standby PostgreSQL instances, receiving changes from the primary via streaming replication.
- State: The current operational status of the PostgreSQL instance.
- running: The PostgreSQL instance is active and operational.
- streaming: The replica is actively receiving and applying changes from the primary.
- TL (Timeline): Represents the PostgreSQL timeline ID. This increments on certain events, such as a failover or switchover, and helps ensure consistency in replication history. A consistent timeline across all nodes indicates healthy replication.
- Lag in MB: The amount of replication lag, measured in megabytes, between the replica and the primary. A value of 0 indicates the replica is fully caught up.
- | Pending restart | Pending restart reason |: This is the crucial part that indicates necessary actions.
- Pending restart: The asterisk (*) in this column for all three nodes signifies that there are configuration changes that have been applied to the PostgreSQL configuration (either directly in postgresql.conf or via ALTER SYSTEM SET commands) but have not yet taken effect. For these changes to become active, the PostgreSQL instance on that specific node needs to be restarted. Patroni actively monitors for such pending changes and flags them here.
- Pending restart reason: This column provides specific details about which PostgreSQL parameters have been modified and are awaiting a restart. In your output, it lists several parameters and their old and new values (e.g., autovacuum_max_workers: 3->6, shared_buffers: 128MB->6GB, wal_buffers: 4MB->16MB, etc.). The entry shared_preload_libraries: [hidden - too long] indicates that this parameter was also changed, but its full value isn't displayed due to length constraints in this output format.
In summary, this output indicates that your PostgreSQL cluster has several pending configuration changes across all its nodes that require a restart of each PostgreSQL instance for them to be fully applied and become active. Patroni is aware of these pending restarts and is ready to manage them.
To apply these changes and ensure your cluster is running with the updated configuration, you would typically perform a controlled rolling restart of the cluster using Patroni's command-line interface, for example:
patronictl -c /etc/patroni/postgresql.yml restart my-pg-cluster
This command would instruct Patroni to orchestrate the restarts of your PostgreSQL instances in a safe manner, typically restarting replicas first and then the primary, to minimise service disruption.
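Each member also reports this state over Patroni's REST API (default port 8008). In recent Patroni versions the /patroni endpoint includes a pending_restart field when a restart is outstanding; verify the field name and port on your installation.

```bash
# A pending_restart field in the /patroni endpoint indicates outstanding changes.
curl -s http://192.168.1.101:8008/patroni | python3 -m json.tool | grep -i pending_restart
```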
This comprehensive guide serves as a reference for performing minor version upgrades and understanding key diagnostic outputs for a Percona Distribution for PostgreSQL cluster managed by Patroni and etcd. By following these detailed steps and best practices, administrators can ensure a smooth upgrade process and maintain the high availability and performance of their PostgreSQL environment. Regular verification and adherence to security best practices are essential for a robust and secure database infrastructure.
Prepared by: Harisfazillah Jamel and Google Gemini, LinuxMalaysia. 26 Jul 2025.