Recently in the project I work in we've encountered a major database branching issue for the first time. We are using branch for release branching strategy, meaning that we do our current development in the trunk and branch every time the product is released. Our branches are just for fixing critical bugs that can't wait until next release. One of the bug fixes we needed to apply involved schema change and the problem was that we went ahead with development in our trunk so that the bugfix update script differs for production branch and trunk.
We're doing our database versioning with RoundhousE using forward-only, run-once, irreversible update scripts, in case of rollbacks we are restoring our databases from backups. Our tooling ensures that no script is modified after being run, which makes sense as we have no way to apply those changes to the database that is already some revisions ahead. We also don't want to have scripts that are branch-specific, as we'll need to skip this script on merging and we'll need to remember about that until the last day of existence of our product. What's more, if we have our development environment built using different set of scripts than the production one, we are asking for trouble.
Before we decided what to do, we've thoroughly discussed an article by K. Scott Allen from 2008. There were two solutions proposed - either to include the patching script before all the new scripts from the trunk (meaning that the databases already in trunk version need to be fixed somehow) or to have two different scripts in two branches written in such way that the script itself ensures that it is not run twice, so that it can be merged through branches.
I don't like the second option, which was recommended by Scott. It suits to our tooling and will work, but going that way means that production database was built a bit differently than development ones (as our patch script was branched - there are some statements that must have been skipped to make the script run correctly both in prod and dev). That is smelly. Even if we can see that the result seems to be the same, we'd prefer to have all our databases built using exactly the same set of scripts in the same order.
Scott discourages the first option - with inserting the patch script before all the trunk scripts - as it means applying the changes to the database that is already forward. But again - we want our databases to be built using exactly the same set of scripts in the same order. This means that if our production database will have the patch applied before the scripts that are already in trunk (and will go to production in some future release) - we should have the same order in the development databases.
Here is our final solution - it's a bit different than these two:
- Integrate the patch before all the trunk scripts. Let's say that the branch was done after script 100 so that the production database is at version 100 and we have new scripts 101 and 102 in the trunk so that our development databases are currently in version 102. This means that our patch needs to go between 100 and 101 - let's say 100a.
- Modify 101 and 102 to be runnable on the new schema (changes should not be needed in most cases as 100a is just a bugfix and should not consist major changes as such).
- Roll back all the development databases to version 100 from the backup, so that the next upgrade will run 100a first and then 101 and 102. In case someone will not roll back the database, the next local deployment will fail on running 100a script on the database already at ver. 102 and this is good as it requires every developer to have a production-like environment.
The only issue with this approach is that because of restoring the database from the backup we're losing some of the newest data. But this is probably not a big deal in the development environment. And knowing that all our databases (development, production and whatever) were upgraded by the same sequence of statements lets us sleep better.