Skip to content

Instantly share code, notes, and snippets.

@johndalton
Last active June 14, 2016 11:55
Show Gist options
  • Save johndalton/03dd695f24548b78727edb62fd9d889e to your computer and use it in GitHub Desktop.
Save johndalton/03dd695f24548b78727edb62fd9d889e to your computer and use it in GitHub Desktop.

#Remastering MySQL Replicas

The scenario here is that we have two replicas, A and B, which both share a master M. We want to make B a replica of A, so that the replication chain goes M > A > B. A prerequesite is that A needs to have binary logging enabled with log_slave_updates turned on, and A should not be receiving any local writes other than from replication.

In order to achieve the switch we need to have a single point in time at which we know both A's replicaton position relative to the master M, and A's own binary log position. Then we need to make B reach the exact same position relative to the master, at which point we can change master to point at the corresponding coordinates on A.

This procedure involves stopping and restarting replication on both replicas; very briefly on A, but a couple of times on B. It shouldn't be performed on hosts that are especially sensitive to replication delay. However even when carrying out this process by hand the delay shouldn't exceed more than a few minutes, and can usually be kept under one minute with careful planning.

##Step 1 On B, run:

STOP SLAVE;

We stop B first, as it has to be behind A's known position in order to be able to catch up to it.

##Step 2 On A, run:

STOP SLAVE;
SHOW SLAVE STATUS \G
SHOW MASTER STATUS \G
START SLAVE;

From the SHOW SLAVE STATUS output we must record the values of Relay_Master_Log_File and Exec_Master_Log_Pos. Let's call these Position M.

From the SHOW MASTER STATUS output we need the File and Position - we'll call these Position A.

Replication was only paused very briefly on A - just long enough to grab these two positions at the same point in time from the perspective of the replication stream (i.e. when no data was changing on A).

##Step 3 On B, where replication is still stopped, we now tell it to catch up:

START SLAVE UNTIL MASTER_LOG_FILE = *<Relay_Master_Log_File from Positon M>*, MASTER_LOG_POS = *<Exec_Master_Log_Pos from Position M>*;

This will probably only take a moment, depending on the volume of replication traffic. Once B is caught up, the output of SHOW SLAVE STATUS \G on B will show the same value for Exec_Master_Log_Pos as we recorded in Position M. Check that carefully, as if it's not caught up then the next step will break replication on B.

##Step 4 On B, once you've verified that it's reached Position M, we now point it at A instead:

STOP SLAVE;
CHANGE MASTER TO MASTER_HOST […and so on - we need all the values when changing hosts!], MASTER_LOG_FILE = *<File from Positon A>*, MASTER_LOG_POS = *<Position from Position A>*;
START SLAVE;

You need to provide the full details of the CHANGE MASTER command as appropriate to your environment.

##Step 5

You should now be able to verify that this is working by comparing the output of SHOW SLAVE STATUS \G on B with the position of SHOW MASTER STATUS on A. It's not necessary to stop replication, just to see that B continues to advance. The most complete test is to modify data on the master M (e.g. create a new test table) and verify that the change reaches B.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment