
A Developer’s View on the PostgreSQL Database


Basically every project uses a database as one of its central components. At least I cannot remember a single project from the last 15 years that did not use one. And even though NoSQL databases are on the rise, relational databases – like PostgreSQL – are still very much alive and widely used in projects.

For a developer, interaction with the database mainly happens through SQL. This is done from an SQL shell or – most of the time – using some graphical tool. In application code this is further hidden behind a language-specific API. In Java, database access is often “happily” encapsulated behind a JPA layer (to be honest, my feelings towards JPA are a bit biased, but that is a different story).

More than SQL

Developers can probably survive most projects by looking only at the SQL part. But gaining a deeper understanding of the database in use can improve performance – of the application as well as of the development work – and help avoid problems during operation of the application. In addition, it will help in support cases handled together with the ops team. Hey, we do not want to stand there clueless in such a case ;-).

The idea of this blog post is to gain some more insight into working with the PostgreSQL database. It will not make you a database administrator, but hopefully some of the things mentioned here will come in handy at times.

Even though it is convenient – and obviously useful – to access the database with some graphical tool, one should also know how to do this from the command line using psql. Not only does this enable us to run SQL queries directly on servers we might only have SSH access to, it is also a good source of further information about our PostgreSQL installation.
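For completeness, connecting from the command line typically looks like this (host, port, user, and database name are placeholders to be replaced with your own values):

```
$ psql -h localhost -p 5432 -U myuser -d mydb
```

On a local development machine with default settings, the host and port options can often be omitted, as in the examples below.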

One command that often comes in handy shows the location of the PostgreSQL configuration file:

psql postgres
postgres=# SHOW config_file;
                config_file
 -----------------------------------------
  /usr/local/var/postgres/postgresql.conf
 (1 row)

Every now and then one might want to check some configuration – also in the local development environment – and this makes it easy to find the configuration file quickly on different operating systems (e.g. when supporting colleagues).
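Other file locations can be retrieved the same way, for example the data directory and the client authentication file (the output obviously depends on the installation):

```
postgres=# SHOW data_directory;
postgres=# SHOW hba_file;
```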

Using the SHOW command, some more useful information about the PostgreSQL installation can be retrieved.

postgres=# SHOW server_version;
  server_version 
 ----------------
  9.4.5
 (1 row)

Or simply retrieve all available settings using the SHOW ALL command (have fun :-)).
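If scrolling through the output of SHOW ALL is too much fun, the same information is available through the pg_settings system view, which – being a view – can be filtered and sorted like any other table:

```sql
-- list all memory-related settings with their current values
SELECT name, setting, unit, short_desc
FROM pg_settings
WHERE name LIKE '%mem%'
ORDER BY name;
```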

Useful PostgreSQL Meta-Commands

There is a set of commands starting with a backslash (\) that are called meta-commands in psql. The \l command, for example, shows information on all databases in the PostgreSQL installation.

postgres=# \l
                                            List of databases
      Name      |     Owner     | Encoding |   Collate   |    Ctype    |        Access privileges        
---------------+---------------+----------+-------------+-------------+---------------------------------
  flywaydemo    | flywaydemo    | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 | 
  liquibasedemo | liquibasedemo | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 | 
  postgres      | postgres      | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 |
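Behind the scenes, \l is little more than a query against the pg_database system catalog, so roughly the same list can be produced with plain SQL – handy when the information is needed from application code:

```sql
-- list databases with owner and encoding, similar to \l
SELECT d.datname AS name,
       pg_catalog.pg_get_userbyid(d.datdba) AS owner,
       pg_catalog.pg_encoding_to_char(d.encoding) AS encoding
FROM pg_catalog.pg_database d
ORDER BY d.datname;
```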

To dig into the database currently opened in psql, one can use the \d command, which shows all relations of the database:

liquibasedemo=# \d
                   List of relations
 Schema |         Name          | Type  |     Owner     
--------+-----------------------+-------+---------------
 public | databasechangelog     | table | liquibasedemo
 public | databasechangeloglock | table | liquibasedemo
 public | liquibase_test        | table | liquibasedemo
 public | liquibase_test_view   | view  | liquibasedemo
(4 rows)

The same command can be used to show information on a specific database object, as can be seen in the following:

liquibasedemo=# \d liquibase_test_view
      View "public.liquibase_test_view"
 Column |          Type          | Modifiers 
--------+------------------------+-----------
 key    | character varying(64)  | 
 value  | character varying(255) | 

If there is a need to repeat the last SQL command, the \g command can be used. Assuming you have created a complex SQL query (no, the following example query is not complex :-)) and want to save that query for later use, you can combine this with the \w command, which writes the current query buffer to the given file.

liquibasedemo=# select * from databasechangelog;
                   id                    |     author     |                               filename                               |        dateexecuted        | orderexecuted | exectype |               md5sum               | description | comments | tag | liquibase | contexts | labels 
-----------------------------------------+----------------+----------------------------------------------------------------------+----------------------------+---------------+----------+------------------------------------+-------------+----------+-----+-----------+----------+--------
 release_1_0_0.001_create_test_table.sql | liquibase-demo | src/main/resources/liquibase/release_1_0_0/001_create_test_table.sql | 2016-10-14 08:52:07.066734 |             1 | EXECUTED | 7:bf42d0f71928b1ee7e0118a4c2f3ef3c | sql         |          |     | 3.4.2     |          | 
 release_2_0_0.002_create_test_view.sql  | liquibase-demo | src/main/resources/liquibase/release_1_0_0/002_create_test_view.sql  | 2016-10-14 08:52:07.075526 |             2 | EXECUTED | 7:6b9e4c5c279b8998a71b0566a8c65d46 | sql         |          |     | 3.4.2     |          | 
 release_2_0_0.01_create_test_view.sql   | liquibase-demo | src/main/resources/liquibase/release_2_0_0/001_create_test_index.sql | 2016-10-14 08:52:07.078106 |             3 | EXECUTED | 7:6b9e4c5c279b8998a71b0566a8c65d46 | sql         |          |     | 3.4.2     |          | 
(3 rows)

liquibasedemo=# \g \w command.sql
                   id                    |     author     |                               filename                               |        dateexecuted        | orderexecuted | exectype |               md5sum               | description | comments | tag | liquibase | contexts | labels 
-----------------------------------------+----------------+----------------------------------------------------------------------+----------------------------+---------------+----------+------------------------------------+-------------+----------+-----+-----------+----------+--------
 release_1_0_0.001_create_test_table.sql | liquibase-demo | src/main/resources/liquibase/release_1_0_0/001_create_test_table.sql | 2016-10-14 08:52:07.066734 |             1 | EXECUTED | 7:bf42d0f71928b1ee7e0118a4c2f3ef3c | sql         |          |     | 3.4.2     |          | 
 release_2_0_0.002_create_test_view.sql  | liquibase-demo | src/main/resources/liquibase/release_1_0_0/002_create_test_view.sql  | 2016-10-14 08:52:07.075526 |             2 | EXECUTED | 7:6b9e4c5c279b8998a71b0566a8c65d46 | sql         |          |     | 3.4.2     |          | 
 release_2_0_0.01_create_test_view.sql   | liquibase-demo | src/main/resources/liquibase/release_2_0_0/001_create_test_index.sql | 2016-10-14 08:52:07.078106 |             3 | EXECUTED | 7:6b9e4c5c279b8998a71b0566a8c65d46 | sql         |          |     | 3.4.2     |          | 
(3 rows)

liquibasedemo=# \q
Thomass-MacBook-cc:dev-scripts thomasjaspers$ more command.sql 
select * from databasechangelog;
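The saved file can later be replayed – for example in a fresh psql session – using the \i meta-command, which reads and executes commands from a file:

```
liquibasedemo=# \i command.sql
```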

Writing the SQL command to a file obviously leads to the question of how to write the result of an SQL query to a file. This can be done using the \copy command. It has a slightly more complex syntax, which also varies a bit depending on the PostgreSQL release in use. The following works for release 9.4.5:

liquibasedemo=# \copy (select * from databasechangelog) TO '/tmp/result.csv' with (format csv, header, delimiter ',');
COPY 3
liquibasedemo=# \q
Thomass-MacBook-cc:dev-scripts thomasjaspers$ more /tmp/result.csv 
id,author,filename,dateexecuted,orderexecuted,exectype,md5sum,description,comments,tag,liquibase,contexts,labels
release_1_0_0.001_create_test_table.sql,liquibase-demo,src/main/resources/liquibase/release_1_0_0/001_create_test_table.sql,2016-10-14 08:52:07.066734,1,EXECUTED,7:bf42d0f71928b1ee7e0118a4c2f3ef3c,sql,"",,3.4.2,,
release_2_0_0.002_create_test_view.sql,liquibase-demo,src/main/resources/liquibase/release_1_0_0/002_create_test_view.sql,2016-10-14 08:52:07.075526,2,EXECUTED,7:6b9e4c5c279b8998a71b0566a8c65d46,sql,"",,3.4.2,,
release_2_0_0.01_create_test_view.sql,liquibase-demo,src/main/resources/liquibase/release_2_0_0/001_create_test_index.sql,2016-10-14 08:52:07.078106,3,EXECUTED,7:6b9e4c5c279b8998a71b0566a8c65d46,sql,"",,3.4.2,,
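The \copy command also works in the opposite direction, loading a CSV file into a table. As a sketch – the table import_test is made up here and would need to exist with matching columns:

```
liquibasedemo=# \copy import_test FROM '/tmp/result.csv' with (format csv, header, delimiter ',');
```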

Vacuum and Autovacuum

This is a feature that is probably much closer to the administration of the database than it is to development. Nevertheless, it is good to have at least some clue about it as a developer.

The background here is that PostgreSQL – like basically every database – keeps different versions of the rows stored in its tables, to allow reading from and writing to tables without locking them all the time. This leads to several versions of the data being stored in parallel, of which only one is the most recent. The outdated row versions need to be removed at some point, and this is where (auto)vacuuming comes into play.

Let’s start with an example to get an idea of what we are talking about:

liquibasedemo=# VACUUM (VERBOSE, ANALYZE) databasechangelog;
INFO:  vacuuming "public.databasechangelog"
INFO:  "databasechangelog": found 0 removable, 3 nonremovable row versions in 1 out of 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_1348126"
INFO:  index "pg_toast_1348126_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_1348126": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.databasechangelog"
INFO:  "databasechangelog": scanned 1 of 1 pages, containing 3 live rows and 0 dead rows; 3 rows in sample, 3 estimated total rows
VACUUM

It seems that vacuum does not have much to do on that table. This is not a surprise, because there are hardly any rows in the table and no frequent reads/writes are happening. Vacuuming is especially important for tables that are frequently updated. This is a task that should be discussed together with the database administrator(s), as it might make sense to schedule additional vacuum commands for such tables on top of autovacuum.
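Whether a table is accumulating dead row versions can be checked via the pg_stat_user_tables statistics view. A small sketch, using the demo table from above (the updated values are made up for illustration):

```sql
-- every UPDATE creates a new row version and leaves the old one behind
UPDATE liquibase_test SET value = 'updated' WHERE key = 'some-key';

-- n_dead_tup shows how many dead row versions are waiting to be vacuumed
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'liquibase_test';
```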

“The answer to almost all vacuuming problems is to vacuum more often, not less, so that each individual vacuum operation has less to clean up.” – from PostgreSQL Wiki

The VACUUM command is documented here. As the name indicates, autovacuum runs periodically to perform vacuum operations on the databases. Here is a link to the documentation. There are quite a few configuration options, and complex projects will probably require some tweaking of the default configuration.
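To give an impression of the available knobs, here is a small excerpt of autovacuum-related settings as they appear in postgresql.conf – the values shown are the defaults in the 9.4 series as far as I recall, so check the documentation for your release:

```
autovacuum = on                        # enable the autovacuum daemon
autovacuum_naptime = 1min              # time between autovacuum runs
autovacuum_vacuum_threshold = 50       # min number of row updates before vacuum
autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum
```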

As already mentioned, this feature goes well beyond normal development tasks. But discussing the vacuuming strategy with the ops team might be a good idea. The same is true for simply knowing that such a feature exists at all :-).

Some final thoughts

This article could probably go on forever :-). One very interesting topic – even though it goes in a slightly different direction – is partitioning, for example. It allows storing and reading data to and from so-called child tables in a quite transparent way, to increase performance for tables storing huge amounts of records. It could be worth trying this out in another blog post, even though quite a lot of articles on this feature are already available :-).
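Just to give a rough impression: in the 9.x releases partitioning is typically built on table inheritance combined with CHECK constraints (declarative partitioning only arrived later, in PostgreSQL 10). A minimal sketch with made-up table names:

```sql
-- parent table; queries go against this one
CREATE TABLE measurements (
    logdate date NOT NULL,
    value   integer
);

-- one child table per year, constrained via CHECK so the planner
-- can skip irrelevant partitions (constraint exclusion)
CREATE TABLE measurements_2016 (
    CHECK (logdate >= DATE '2016-01-01' AND logdate < DATE '2017-01-01')
) INHERITS (measurements);
```

Routing inserts to the right child table is then usually done with a trigger on the parent table, which is exactly the part that declarative partitioning later made unnecessary.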

Hopefully some of the things shown here will help a bit in the daily work with the PostgreSQL database. Of course, feedback/comments on more useful features are very welcome.
