If the database server has less than 12GB of RAM, reduce the settings below. Note that you can set shared_buffers and effective_cache_size higher on the primary database because it generally receives less traffic.
# This value should match the "SqlSettings.MaxConnections" value within your config.json for Mattermost
# This is a suggestion and can be set lower / higher based on the size of your server.
max_connections = 1020
tcp_keepalives_idle = 5
tcp_keepalives_interval = 1
tcp_keepalives_count = 5
# Set both of the below settings to 65% of total memory. For a 32 GB instance, it should be 21 GB.
# If on a smaller server, set this to 20% or less of total RAM.
# ex: 512MB would work for a 4GB RAM server
shared_buffers = 512MB
effective_cache_size = 512MB
# Set this to 16MB on readers and 32MB on writers. For a single instance, 16MB should be sufficient. If the instance is smaller than an r5.xlarge, set it to a lower number.
work_mem = 16MB
# Set this to 1GB; reduce it to 512MB (as shown here) if your server has less than 32GB of RAM.
autovacuum_work_mem = 512MB
autovacuum_max_workers = 4
autovacuum_vacuum_cost_limit = 500
# Set it to 1.1 unless the DB is using spinning disks.
random_page_cost = 1.1
restart_after_crash = on
In addition to the above settings, add the following to the read replica's PostgreSQL configuration.
# connection string to sending server
# This was created when you added a replication role to the primary database.
# username - replace "test" with the role you made
# password - replace "testpassword" with the role password
# host - replace "x.x.x.x" with your IP or URL string.
primary_conninfo = 'user=test password=testpassword host=x.x.x.x port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'
# replication slot on sending server
# This needs to be configured using the docs below for "Keeping the replica and primary in sync."
primary_slot_name = 'replica1'
hot_standby = on
# Sends feedback about running queries from the replica to the primary. Suggested to prevent query cancellations and timeouts.
hot_standby_feedback = on
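Once the slot exists (see the replication slot steps further down) and PostgreSQL has been restarted on the replica, a quick sanity check (my suggestion, not part of the original steps) is to query pg_stat_wal_receiver on the replica:
-- Run on the read replica. A status of 'streaming' means it is connected to the primary and receiving WAL.
SELECT status, sender_host, sender_port, slot_name FROM pg_stat_wal_receiver;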
- Add to the Mattermost config.
Within the Mattermost config.json, add the below to SqlSettings.ReplicaLagSettings. The query pulls the diff between the primary's current WAL position and the replica's replay position. The DataSource string should point to your primary database.
- username - the PostgreSQL username with permissions to the mattermost database. Usually this is mmuser.
- password - the password assigned to the username above.
- connectionString - the IP address or URL of the PostgreSQL instance. It doesn't require :5432.
{ "SqlSettings": { "ReplicaLagSettings": [ { "DataSource": "postgres://username:password@connectionString/mattermost?sslmode=disable\u0026connect_timeout=10", "QueryAbsoluteLag": "select usename, pg_wal_lsn_diff(pg_current_wal_lsn(),replay_lsn) as metric from pg_stat_replication;", "QueryTimeLag": null } ] } }
- Give permissions
Before restarting, give the mmuser access to the PostgreSQL monitoring role. Updating permissions should be done on the PostgreSQL primary database, not the replica.
sudo -u postgres psql
postgres=# GRANT pg_monitor TO mmuser;
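To confirm the grant took effect (an optional check, not in the original steps), you can ask PostgreSQL whether mmuser is now a member of pg_monitor:
-- Run on the primary; returns 't' once mmuser has the monitoring role.
SELECT pg_has_role('mmuser', 'pg_monitor', 'member');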
- Update chart and restart Mattermost.
In the Mattermost Performance Monitoring v2 dashboard, update the Replica Lag chart to use the value mattermost_db_replica_lag_abs instead of the current time value. The query should be the below:
mattermost_db_replica_lag_abs{instance=~"$server"}
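If you would also like a time-based lag chart, the QueryTimeLag entry in the ReplicaLagSettings config above can be populated too. A sketch, assuming PostgreSQL 10+ (for the replay_lag column) and assuming the query only needs to return usename plus a numeric metric column, mirroring QueryAbsoluteLag:
-- Seconds of replay lag per replica; replay_lag is NULL while the replica is fully caught up.
select usename, extract(epoch from replay_lag) as metric from pg_stat_replication;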
- Restart Mattermost
You should see this go into effect. Check the logs for any query errors.
sudo systemctl restart mattermost
If a read replica fails, it may be unable to recover because the primary database will have already removed the WAL segments the replica requires to sync back up. If this happens, you'll usually see something like the below in the logs.
2022-03-25 04:38:27.369 UTC [77723] mmuser@mattermost FATAL: the database system is in recovery mode
...
2022-03-25 04:44:12.456 UTC [77947] LOG: entering standby mode
2022-03-25 04:44:12.465 UTC [77947] LOG: redo starts at 6/5FA18838
2022-03-25 04:44:12.969 UTC [77950] mmuser@mattermost FATAL: the database system is starting up
2022-03-25 04:44:12.971 UTC [77951] mmuser@mattermost FATAL: the database system is starting up
2022-03-25 04:44:13.574 UTC [77947] LOG: consistent recovery state reached at 6/6DA47060
2022-03-25 04:44:13.575 UTC [77946] LOG: database system is ready to accept read only connections
2022-03-25 04:44:23.290 UTC [77947] LOG: invalid resource manager ID 215 at 6/7236A070
2022-03-25 04:44:23.306 UTC [77975] LOG: started streaming WAL from primary at 6/72000000 on timeline 1
2022-03-25 04:44:23.306 UTC [77975] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000600000072 has already been removed
You can use WAL archiving with recovery on the primary / read replica. I did not explore this option in testing because it requires additional setup; however, docs can be found on PostgreSQL's page for continuous archiving.
Replication slots allow the primary database to retain the WAL the read replica needs until the replica has confirmed it has applied the changes, after which the WAL is flushed. Effectively, this allows for an easy recovery on the read replica if it crashes.
The only genuine concern with this option is that if your read replica goes offline for an extended period, the primary will keep retaining WAL for it until the replica comes back online. A replica that stays offline for days or weeks can therefore cause the primary to consume a large amount of storage.
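If that storage risk is a concern, PostgreSQL 13 and newer can cap how much WAL a slot is allowed to retain (note the paths in this guide reference version 12, so verify your version before relying on this). A sketch for postgresql.conf on the primary:
# Cap the WAL retained for replication slots. If a disconnected replica falls further behind than this,
# the slot is invalidated and the replica must be re-synced, instead of the primary filling its disk.
max_slot_wal_keep_size = 10GB
On 13+ you can also watch the wal_status and safe_wal_size columns in pg_replication_slots to see how close a slot is to that limit.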
- To view existing replication slots -
select * from pg_replication_slots;
- Check replication slot lag -
SELECT redo_lsn, slot_name, restart_lsn, round((redo_lsn - restart_lsn) / 1024 / 1024 / 1024, 2) AS gb_behind FROM pg_control_checkpoint(), pg_replication_slots;
- Remove a replication slot -
select pg_drop_replication_slot('ocean');
- On the primary, you need to add a replication slot with the below. You can replace replica1 with any unique value you want.
select pg_create_physical_replication_slot('replica1');
- In the PostgreSQL config on the read replica, set the primary_slot_name value to the slot name above.
- Edit the read replica postgresql.conf file to use the slot name. File location - /etc/postgresql/12/main/postgresql.conf
primary_slot_name = 'replica1'
- Restart the replica PostgreSQL service.
sudo systemctl restart postgresql
- You can confirm replication is active by running the below command on the primary PostgreSQL instance.
SELECT client_addr, state, pid, slot_name, active FROM pg_stat_replication INNER JOIN pg_replication_slots on pid = pg_replication_slots.active_pid;
You should see:
 client_addr |   state   |  pid   | slot_name | active
-------------+-----------+--------+-----------+--------
 x.x.x.x     | streaming | 116774 | replica1  | t
This allows the primary and replica to communicate about in-progress queries so that queries on the read replica aren't cancelled. This seemed to have the most overall benefit with the fewest errors on the replica. The PostgreSQL documentation notes it adds some load on the primary, but in testing it appeared minimal at most.
hot_standby_feedback = on
This delay allows the standby server to pause applying incoming changes for up to the configured time instead of cancelling a conflicting query. This reduces the errors caused by the database being updated during a query, but it can cause slower responses within Mattermost. Additionally, there is still some replica lag, and some queries will still time out when they reach the Mattermost max query time.
max_standby_archive_delay = 300s
max_standby_streaming_delay = 300s
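To see whether queries are still being cancelled on the replica despite these delays (a monitoring suggestion, not from the original guide), PostgreSQL tracks recovery conflicts per database:
-- Run on the read replica; non-zero counters mean standby queries were cancelled due to conflicts.
SELECT datname, confl_snapshot, confl_lock, confl_bufferpin, confl_deadlock FROM pg_stat_database_conflicts WHERE datname = 'mattermost';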
restart_after_crash = on
This should be a last resort. If PostgreSQL is getting OOM-killed, increase the server specs first. https://www.percona.com/blog/2019/08/02/out-of-memory-killer-or-savior/
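Before raising specs, it can help to confirm the OOM killer is actually what is terminating PostgreSQL (a quick check, not part of the original text):
# Look for OOM-killer events mentioning postgres in the kernel log.
dmesg -T | grep -iE "killed process|out of memory" | grep -i postgres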