Performing database migrations – also known as schema migrations – is an essential task in almost every software project. There are various reasons why updates to the database may be required, for example:
- New features might require new attributes in existing tables or entirely new tables.
- Bug fixes might lead to changes in names or data types in the database.
- Performance issues might require additional indexes in the database.
While keeping track of software releases and updates is usually well implemented and supported by corresponding tools, database migrations are often treated like a stepchild. This means that either a lot of manual work is required to update the database in test and customer environments, or some self-made tooling is used.
But there are good open source solutions available. Probably the best known are Liquibase and Flyway. This article will cover Liquibase. Why? Well, the differences seem to be small and I have been using Liquibase in a recent project :-).
Let’s start with some good news: Getting started with Liquibase is really easy, and a first project should be up and running quickly.
We will start with the basic concepts of Liquibase to get an understanding of what Changesets are and how they are applied to the database. As a developer I would like to execute Liquibase using Maven, while installations of test and customer environments will typically be done from the command line.
When using Liquibase with Maven, one needs to follow the Maven directory structure as usual. This leads to some problems if we would like to use the same files from the command line (and of course we want that). It took me quite some time and googling to find a solid solution for this. A good statement I found on one page while searching for a solution was: “Maven is a bit unforgiving when it comes to its directory structure”. Funnily enough, the solution for executing Liquibase from the command line will be built using Maven, as we will see later on.
At the end of the article some best practices are discussed, which are probably relevant beyond using Liquibase. We then close with a short summary.
The very basic concept here is the Changeset. A Changeset can be seen as everything needed to migrate from one version of the database to the next: the corresponding SQL statements are grouped and then executed together.
Figure 1: Liquibase Changesets
One of the biggest advantages of Liquibase is that it keeps track of the database version (schema version) installed on the system. This is done by using an internal table that is automatically created on first use of Liquibase for that – and in that – specific schema. In the above example let’s assume we have two installations. The first one is completely empty and the second one is already on version 1. When starting Liquibase it will first run Changeset 1 and then Changeset 2 on the empty system. On the second system it will only run Changeset 2.
For the time being let’s assume we are having a Maven project already and all Liquibase-files are stored in “src/main/resources/liquibase”.
The starting point for Liquibase is its Properties-file. The file can have any name, but liquibase.properties seems to be a quite natural choice. Let’s take a look at its content.
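A minimal liquibase.properties for the PostgreSQL setup used in this article might look like this (the connection values, driver JAR name, and file locations are of course environment-specific and only illustrative):

```properties
driver: org.postgresql.Driver
classpath: ./postgresql-9.2-1002-jdbc4.jar
changeLogFile: src/main/resources/liquibase/db.changelog.xml
url: jdbc:postgresql://localhost:5432/sampledb
username: sample
password: topsecret
```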
The content of this file should be pretty clear. In this example a PostgreSQL database is used. The name of the driver must be given, as well as a relative path to the JDBC driver JAR. It must be noted that the classpath is only used when starting Liquibase from the command line, not when it is started using Maven! But we have prepared the classpath value already so that it works later on from the command line. This file also contains the pointer to the initial Changelog-file. That file contains the Changesets to be executed and is thus the next file to have a closer look at.
<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
        xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
            http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd">
    <include file="src/main/resources/liquibase/create_tables.sql" />
    <include file="src/main/resources/liquibase/add_indexes.sql" />
    <include file="src/main/resources/liquibase/alter_table_sample.sql" />
</databaseChangeLog>
One note upfront: There are different possibilities to define Changesets. In this blog post we are using Liquibase Formatted SQL (we will see this later), and thus we are simply including the SQL files in our Changelog-file.
The Changesets are applied in the order in which they appear in the Changelog-file. There is not much more to say here, as it is pretty straightforward so far I would say :-). This leads us to the Liquibase-formatted SQL files. Let’s start with an example:
--liquibase formatted sql
--changeset codecentric-docs:create_sample_table
CREATE TABLE SampleTable
(
    id uuid NOT NULL,
    CONSTRAINT sampletable_pkey PRIMARY KEY (id)
);
ALTER TABLE SampleTable
    OWNER TO sample;
--rollback DROP TABLE SampleTable;
One can see that the lines starting with “--” are what makes the file a Liquibase-formatted SQL file. Only very few additional entries are needed here.
--liquibase formatted sql
The above statement simply introduces this file as a Liquibase-formatted SQL-file.
As Liquibase works on Changesets, the name of the changeset introduced by this file must be defined with a “--changeset” comment. The part before the “:” is the author and the part after the “:” is the name of the changeset itself. It makes a lot of sense to use meaningful names and naming patterns here, e.g. including the release information in the names of the changesets.
Liquibase keeps track of changes using the tables “databasechangelog” and “databasechangeloglock”. Those tables are created when Liquibase is executed on a schema for the first time. As the “author” and “id” from the Changeset form the unique key, it is not possible to later change, for example, the contents of a SQL file that defines a changeset. This would simply result in an error during execution. It is a matter of taste whether to use real user names for the author or a more technical one like “codecentric-docs” in the above example. I would prefer the more technical one: as all files should be under version control anyway, I would rather look up who made which changes in the version control history of the corresponding files.
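For example, the changesets recorded by Liquibase can be inspected directly in the database (the column names shown are those created by Liquibase itself):

```sql
-- show which changesets have already been applied to this schema
SELECT id, author, filename, dateexecuted
FROM databasechangelog
ORDER BY orderexecuted;
```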
One really nice feature is the possibility to define rollback statements. If used properly it is possible to return to an earlier state of the database this way. It should be noted that a Liquibase-formatted SQL-file can contain several SQL-statements and thus several rollback statements.
That is basically all there is to know to get started. Of course the SQL in a real project will be much more complex than the example above, and especially altering tables later on might sometimes be difficult, but that is independent of using Liquibase. Now let’s take a look at how to execute Liquibase.
First of all, to use Liquibase we should add it as a dependency to our POM-file. The most recent releases can be found here.
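As a sketch, the dependency entry might look like this (the version shown matches the liquibase-core JAR used later in this article; check for the most recent release):

```xml
<dependency>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-core</artifactId>
    <version>3.3.0</version>
</dependency>
```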
Then to make use of Maven to execute Liquibase the following plugin configuration must be added.
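A minimal configuration of the liquibase-maven-plugin might look like the following sketch; the propertyFile path assumes the layout described above, and the execution defines the update goal without binding it to a specific lifecycle phase:

```xml
<plugin>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-maven-plugin</artifactId>
    <version>3.3.0</version>
    <configuration>
        <propertyFile>src/main/resources/liquibase/liquibase.properties</propertyFile>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>update</goal>
            </goals>
        </execution>
    </executions>
</plugin>
```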
There is nothing too special here. Basically it is just a “pointer” to the Liquibase Property-file. As we have seen that file points to the Changelog-file which again contains the changesets. The execution-part of the plugin-configuration is common Maven functionality. It would be possible to execute it in some specific phase, but I found it better to have a specific goal for it.
Now it is possible to execute Liquibase from within an IDE or locally from the command line by starting Maven as follows:
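With the liquibase-maven-plugin configured, the update can be triggered via its Maven goal:

```shell
mvn liquibase:update
```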
That is basically it.
Command Line Execution
To perform installations on test systems and production environments, it of course makes sense to perform them in the same way, and this will typically be done from the command line. Liquibase supports command-line execution very easily, for example like this:
java -jar ./liquibase-core-3.3.0.jar --driver=org.postgresql.Driver --classpath=./postgresql-9.2-1002-jdbc4.jar --changeLogFile=src/main/resources/liquibase/db.changelog.xml --url="jdbc:postgresql://localhost:5432/sampledb" --username=sample --password=topsecret update
In my case this can be found in a file called “liquibase.sh” stored in a “scripts” subdirectory.
It can be seen that we need to give the path to the Changelog-file, and in that file the SQL files again have relative paths like “src/main/resources/liquibase/alter_table_sample.sql”. Executing it from an unpacked JAR is therefore not possible, and I would anyway prefer having an individual artifact for the database installation.
The Liquibase properties file is used to retrieve basic values when it can be found in the directory from which Liquibase is started.
Now to create a ZIP-file containing all relevant Liquibase files we can use the Maven assembly-plugin. That is nothing specific to Liquibase, but for the sake of completeness the needed Maven configuration is shown in the following as well. (You might need to adapt this to some extent to match your specific workspace configuration.)
First of all we need to configure the assembly-plugin in our project POM-file.
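A sketch of such a plugin configuration might look like this (the plugin version is illustrative; the descriptor path matches the assembly file discussed in this article):

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-assembly-plugin</artifactId>
    <version>2.5.3</version>
    <configuration>
        <descriptors>
            <descriptor>src/assembly/liquibase_assembly.xml</descriptor>
        </descriptors>
    </configuration>
</plugin>
```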
And then we have the following “src/assembly/liquibase_assembly.xml” that defines which files are collected and added to the generated ZIP-file.
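A possible descriptor might look like the following sketch; the fileSets (the “scripts” directory for the shell script and the location of the driver JAR) are assumptions that need to match your project layout. Note that the Liquibase files keep their original relative path inside the ZIP, so that the references inside the Changelog-file still resolve when running from the unpacked archive:

```xml
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.3"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.3
              http://maven.apache.org/xsd/assembly-1.1.3.xsd">
    <id>liquibase</id>
    <formats>
        <format>zip</format>
    </formats>
    <fileSets>
        <!-- keep the relative path so the Changelog references still work -->
        <fileSet>
            <directory>src/main/resources/liquibase</directory>
            <outputDirectory>src/main/resources/liquibase</outputDirectory>
        </fileSet>
        <!-- start script and driver JAR go to the top level of the ZIP -->
        <fileSet>
            <directory>scripts</directory>
            <outputDirectory>.</outputDirectory>
        </fileSet>
    </fileSets>
</assembly>
```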
Now we can generate the ZIP-file by executing:
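For example like this (assuming the assembly-plugin execution is not bound to a lifecycle phase):

```shell
mvn clean package assembly:single
```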
Obviously it would make sense to group the files belonging to one database release. Therefore the corresponding SQL files should be stored in subdirectories named after the releases. Luckily Liquibase supports this as well by allowing Changelog-files to be included in a kind of parent Changelog-file.
Let’s look at an example for this (the release-specific file names are of course project-specific and only illustrative):
<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
        xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
            http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd">
    <include file="src/main/resources/liquibase/release_1/db.changelog-release_1.xml" />
    <include file="src/main/resources/liquibase/release_2/db.changelog-release_2.xml" />
</databaseChangeLog>
The individual (referenced) Changelog-files then contain the entries linking to the corresponding SQL files, which are in turn stored in the corresponding – release-specific – subdirectories.
The following figure depicts the overall directory and file structure:
Figure 2: Best Practices
Creating a new release would now be done by writing a new Changelog-file for the new release and creating a new subdirectory containing the needed SQL files. Then the new Changelog-file is added to the main Changelog-file and that’s it. (Maybe some testing would still be a good idea :-).)
This article can of course only cover parts of the overall Liquibase functionality. But it can be seen that Liquibase supports the handling of schema migrations very well. It can easily be executed using Maven in the development environment, and at the same time an individual artifact can be generated – from the same sources – that can be executed from the command line.
This also allows easy integration into a CI environment like Jenkins. It is possible to re-execute Liquibase at any time; only changes are applied then. If there are no changes, it simply performs no actions.
During development one might end up in a situation where the database needs to be recreated from scratch because there have been problems (errors) in the Liquibase scripts. Then one can just drop the database (in development environments at least ;-)) and start over again from scratch.
Liquibase is easy to learn and first results are achieved quickly, which also makes an evaluation in a project possible without the need to invest too much time.