PlannedReparentShard

Starts here: https://github.com/dasl-/vitess/blob/managed-tablet/go/vt/vtctl/reparent.go#L98
DemoteMaster: https://github.com/dasl-/vitess/blob/managed-tablet/go/vt/wrangler/reparent.go#L423
1. Set serving = false in vt topo: https://github.com/dasl-/vitess/blob/managed-tablet/go/vt/vttablet/tabletmanager/rpc_replication.go#L305
  1. SetServingType https://github.com/dasl-/vitess/blob/master/go/vt/vttablet/tabletserver/tabletserver.go#L470
2. Set RO
3. FLUSH TABLES WITH READ LOCK (why? this waits for long-running selects to finish) https://github.com/dasl-/vitess/blob/managed-tablet/go/vt/mysqlctl/reparent.go#L97-L104
4. UNLOCK TABLES
5. SELECT @@GLOBAL.gtid_executed
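
The SQL part of DemoteMaster (steps 2-5) can be sketched roughly as below. This is a minimal illustration, not the Vitess code: `queryRunner`, `demote`, and `recordingDB` are hypothetical names, and the exact read-only statement is an assumption since the notes above only say "Set RO". Step 1 (marking the tablet as not serving) is not SQL and is left out.

```go
package main

import "fmt"

// queryRunner is a hypothetical stand-in for a MySQL connection.
// It is not a Vitess type.
type queryRunner interface {
	Exec(query string) error
	QueryValue(query string) (string, error)
}

// demoteStatements lists steps 2-4 in order.
// Assumption: "Set RO" is done with SET GLOBAL read_only = ON.
var demoteStatements = []string{
	"SET GLOBAL read_only = ON",
	"FLUSH TABLES WITH READ LOCK", // blocks until running statements finish
	"UNLOCK TABLES",
}

// demote runs steps 2-4 and then returns the executed GTID set (step 5).
// It stops at the first failing statement.
func demote(db queryRunner) (string, error) {
	for _, q := range demoteStatements {
		if err := db.Exec(q); err != nil {
			return "", fmt.Errorf("demote: %q failed: %w", q, err)
		}
	}
	return db.QueryValue("SELECT @@GLOBAL.gtid_executed")
}

// recordingDB is a fake queryRunner that records every query it receives,
// so the ordering can be inspected without a real MySQL server.
type recordingDB struct{ log []string }

func (r *recordingDB) Exec(q string) error {
	r.log = append(r.log, q)
	return nil
}

func (r *recordingDB) QueryValue(q string) (string, error) {
	r.log = append(r.log, q)
	// Made-up GTID set, for illustration only.
	return "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5", nil
}
```

Calling `demote(&recordingDB{})` returns the fake GTID set and leaves the four statements in `log` in the order listed above. The returned position is what the promotion step then waits for on the new master.
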
PromoteSlaveWhenCaughtUp: https://github.com/dasl-/vitess/blob/managed-tablet/go/vt/wrangler/reparent.go#L435 Note that the timeout for this is set here: https://github.com/dasl-/vitess/blob/managed-tablet/go/vt/wrangler/reparent.go#L428 . By default it is 30s: https://github.com/dasl-/vitess/blob/managed-tablet/go/vt/topo/locks.go#L53
1. Wait for replication to catch up: SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('%s', %v). The timeout is set here: https://github.com/dasl-/vitess/blob/managed-tablet/go/mysql/flavor_mysql.go#L134
2. SET @@global.read_only = false
3. Change topo type to master
4. agent.refreshTablet https://github.com/dasl-/vitess/blob/master/go/vt/vttablet/tabletmanager/rpc_replication.go#L459
  1. agent.updateState https://github.com/dasl-/vitess/blob/master/go/vt/vttablet/tabletmanager/state_change.go#L157
  2. agent.broadcastHealth (let vtgate know we're serving) https://github.com/dasl-/vitess/blob/master/go/vt/vttablet/tabletmanager/state_change.go#L341
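
To make the catch-up wait in step 1 concrete, here is a small sketch of building the `WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS` query from the time left before a deadline. `waitForGTIDQuery` is a hypothetical helper, not the linked Vitess function. It assumes the timeout argument is in whole seconds and that MySQL treats 0 as "wait forever", which is why a sub-second remainder is rounded up to 1 instead of truncated to 0.

```go
package main

import (
	"fmt"
	"time"
)

// waitForGTIDQuery builds the catch-up query for the given GTID set.
// remaining is the time left before the caller's deadline.
func waitForGTIDQuery(gtidSet string, remaining time.Duration) (string, error) {
	if remaining <= 0 {
		return "", fmt.Errorf("deadline already passed waiting for position %s", gtidSet)
	}
	// Only whole seconds are passed to MySQL.
	secs := int(remaining.Seconds())
	if secs == 0 {
		// Assumption: a timeout of 0 means "no timeout", so never emit 0.
		secs = 1
	}
	return fmt.Sprintf("SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('%s', %v)", gtidSet, secs), nil
}
```

With the 30s default mentioned above, `waitForGTIDQuery(pos, 30*time.Second)` produces a query with a timeout argument of 30. If the deadline has already passed, the helper returns an error instead of issuing a query.
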
Insert row into the reparent journal table on the new master. https://github.com/dasl-/vitess/blob/f23777db959c7bb6be128d091e81f47e04bbc2a8/go/vt/wrangler/reparent.go#L473
Reparent all replicas to the new master. https://github.com/dasl-/vitess/blob/managed-tablet/go/vt/wrangler/reparent.go#L460
1. Run CHANGE MASTER TO ... query. https://github.com/dasl-/vitess/blob/f23777db959c7bb6be128d091e81f47e04bbc2a8/go/vt/vttablet/tabletmanager/rpc_replication.go#L540
2. Wait for row inserted into master's reparent journal to replicate. https://github.com/dasl-/vitess/blob/f23777db959c7bb6be128d091e81f47e04bbc2a8/go/vt/vttablet/tabletmanager/rpc_replication.go#L563
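
Step 2 amounts to polling the replica until the journal row shows up or time runs out. The sketch below shows that pattern only; `waitForJournalRow` and the `rowPresent` callback are hypothetical, and the real check (a query against the reparent journal table on the replica) is left to the caller.

```go
package main

import (
	"fmt"
	"time"
)

// waitForJournalRow polls rowPresent until it reports true, returns an
// error, or the timeout would be exceeded by another sleep.
func waitForJournalRow(rowPresent func() (bool, error), timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		found, err := rowPresent()
		if err != nil {
			return err
		}
		if found {
			return nil
		}
		// Give up if sleeping one more interval would pass the deadline.
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("reparent journal row not seen within %v", timeout)
		}
		time.Sleep(interval)
	}
}
```

A replica that never receives the row (for example because replication from the new master is broken) makes this return an error after `timeout`, which is the failure this step is meant to surface.
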
UpdateShardFields in topology https://github.com/dasl-/vitess/blob/master/go/vt/wrangler/reparent.go#L502

Docs say: -wait_slave_timeout duration time to wait for slaves to catch up in reparenting (default 30s)

waitSlaveTimeout is used for:

1. Searching for the best new master candidate (the one that has executed the most transactions).
2. Reparenting all replicas to the new master after the new master has been enabled.

I'd expect it to also be used for timing out PromoteSlaveWhenCaughtUp. A PR to do so has been merged: https://github.com/vitessio/vitess/commit/baa15c2571257ed2bd00cc9f38f0d748eeeaa6d2

We had an incident recently using PlannedReparentShard: https://github.etsycorp.com/gist/dleibovic/b666ad2c0d3a1f9ec612f413c4adfa18
