This is a proposal for dealing with database migrations when using git (or similiar). The problem with migrations is ensuring that the current state of the database is in sync with the code-base. In practice, this means being able to guarantee that the state of the database correctly reflects the version of the code checked out.
The correct way to build the database schema is to run all db migrations from zero. But as the project grows, this takes longer and longer to execute and quickly becomes inpractical.
To avoid having to re-run all the migrations, it makes sense to have a base sql that avoids running earlier migrations that are relevant to the currently deployed code.
With that in mind, the following is proposed:
-
When there is a deploy, take a dump of the newly updated live schema.
-
Update the current base.sql file with this updated dump.
-
Archive the previous migrations or even reset them to zero. (Purists may scoff at this, but the reality is that once you have successfully deployed, if you need to revert back to the previous deploy, you already have a copy of the correct sql schema.
-
Rinse and repeat.
Points 1, 2 and 3 are automatable. By doing this we can guarantee the correct database schema for unit testing, checked-out branches in local and QA server environments and in production.
A pragmatic approach.
- each developer updates a db_changes.sql file
- this file is used as part of the deploy process
- after deploy, the new schema is used for the base.sql file
- the db_changes.sql file is cleared.
- GOTO 1
I'd like to add that since everything is in git, nothing is really lost and if for some reason an individual migration needed to be cherry-picked, it could be, but the likelihood of that happening is ultra-low.