25.6. Incrementally Updated Backups

In a standby configuration, it is possible to offload the expense of taking periodic base backups from the primary server; instead base backups can be made by backing up a standby server's files. This concept is generally known as incrementally updated backups, log change accumulation, or more simply, change accumulation.

If we take a file system backup of the standby server's data directory while it is processing logs shipped from the primary, we will be able to reload that backup and restart the standby's recovery process from the last restartpoint. We no longer need to keep WAL files from before the standby's restartpoint. If recovery is needed, it will be faster to recover from the incrementally updated backup than from the original base backup.

The procedure for taking a file system backup of the standby server's data directory while it's processing logs shipped from the primary is:

  1. Perform the backup, without using pg_start_backup and pg_stop_backup. Note that the pg_control file must be backed up first, as in:

    cp /var/lib/pgsql/data/global/pg_control /tmp
    cp -r /var/lib/pgsql/data /path/to/backup
    mv /tmp/pg_control /path/to/backup/data/global

    pg_control contains the location where WAL replay will begin after restoring from the backup; backing it up first ensures that it points to the last restartpoint when the backup started, not some later restartpoint that happened while files were copied to the backup.

  2. Make note of the backup ending WAL location by calling the pg_last_xlog_replay_location function at the end of the backup, and keep it with the backup.

    psql -c "select pg_last_xlog_replay_location();" > /path/to/backup/end_location

    When recovering from the incrementally updated backup, the server can begin accepting connections and complete the recovery successfully before the database has become consistent. To avoid that, you must ensure the database is consistent before users try to connect to the server and before the recovery ends. You can do that by comparing the progress of the recovery with the stored backup ending WAL location: the server is not consistent until recovery has reached the backup end location. The progress of the recovery can also be observed with the pg_last_xlog_replay_location function, but that requires connecting to the server while it might not be consistent yet, so care should be taken with that method.
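The comparison described in step 2 can be sketched in shell. This is only an illustration, not part of the documented procedure: the helper names extract_wal_location and wal_location_reached are hypothetical, and the sketch assumes WAL locations are printed as two hexadecimal numbers separated by a slash (for example 0/3000158), which is how pg_last_xlog_replay_location reports them.

```shell
#!/bin/sh
# Hypothetical helpers (not part of PostgreSQL) for comparing WAL locations
# written as "XLOGID/XRECOFF", where both parts are hexadecimal.

# extract_wal_location: read psql output on stdin and print only the line
# that looks like a WAL location, dropping the column header and row count.
extract_wal_location() {
    sed -n 's|^ *\([0-9A-Fa-f]\{1,\}/[0-9A-Fa-f]\{1,\}\) *$|\1|p'
}

# wal_location_reached CURRENT TARGET
# Exit status 0 if CURRENT is at or past TARGET, non-zero otherwise.
# The part before the slash is compared first, then the part after it.
wal_location_reached() {
    cur_hi=$((0x${1%%/*})); cur_lo=$((0x${1##*/}))
    end_hi=$((0x${2%%/*})); end_lo=$((0x${2##*/}))
    if [ "$cur_hi" -ne "$end_hi" ]; then
        [ "$cur_hi" -gt "$end_hi" ]
    else
        [ "$cur_lo" -ge "$end_lo" ]
    fi
}
```

With these definitions, the stored value could be read back with extract_wal_location < /path/to/backup/end_location, and wal_location_reached "$current" "$end" would succeed only once the replay location in $current has reached the stored end location. How $current is obtained is left open here, since querying the server before it is consistent carries the caveat noted above.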

Since the standby server is not "live", it is not possible to use pg_start_backup() and pg_stop_backup() to manage the backup process; it will be up to you to determine how far back you need to keep WAL segment files to have a recoverable backup. That is determined by the last restartpoint at the time the backup was taken; any WAL older than that can be deleted from the archive once the backup is complete. You can determine the last restartpoint by running pg_controldata on the standby server before taking the backup, or by using the log_checkpoints option to print values to the standby's server log.
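
As a rough sketch of the pg_controldata approach, the helpers below pull the restartpoint's REDO location and timeline out of pg_controldata output and derive the name of the WAL segment file that contains that location. The function names are hypothetical. The sketch assumes untranslated (English) pg_controldata labels, the default 16MB WAL segment size, and a single timeline in the archive; verify each of these against your installation before deleting anything.

```shell
#!/bin/sh
# Hypothetical helpers (not part of PostgreSQL). Each reads the text printed
# by pg_controldata on stdin. Assumes English field labels, so run
# pg_controldata with LC_ALL=C if your locale translates them.

# Print the REDO location of the latest checkpoint (restartpoint on a standby).
redo_location() {
    sed -n "s/^Latest checkpoint's REDO location: *//p"
}

# Print the timeline ID of the latest checkpoint.
checkpoint_timeline() {
    sed -n "s/^Latest checkpoint's TimeLineID: *//p"
}

# wal_segment_name TIMELINE LOCATION
# Print the 24-character WAL segment file name that contains LOCATION,
# assuming the default segment size of 16MB (0x1000000 bytes).
wal_segment_name() {
    printf '%08X%08X%08X\n' "$1" "$((0x${2%%/*}))" "$(( 0x${2##*/} / 0x1000000 ))"
}
```

For example, wal_segment_name 1 0/3000020 prints 000000010000000000000003. Under the assumptions above, that segment and all later ones must be kept for the backup to be recoverable, and only segments that sort before it are candidates for removal from the archive.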