Easy and fast backup (and recovery!) for huge PostgreSQL instances on AWS

You usually have several options when you want to back up PostgreSQL instances. The first stop would be the official documentation. But at some point dumping the data, or stopping the instance, won’t be enough. Of course, you could also go for incremental backups using the Write-Ahead Log (WAL).
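
For smaller instances, a logical dump is usually enough; a minimal example, assuming a local cluster and a postgres superuser:

    # Dump every database in the cluster to a single SQL file.
    pg_dumpall -U postgres > backup.sql

But dumping (and, even worse, restoring) a multi-terabyte instance this way takes far too long.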

But what if you want fast backups, with fast recovery, and your instance is really huge?

Our problem

We already have one instance with 85 TB on AWS. But even before we reached this size, we needed to perform daily backups (it was acceptable to lose the last 24 hours of data), as fast as possible, without interruptions, and with fast recovery.

So here comes one of the good things about AWS: you can back up EBS volumes by launching snapshots. Once a snapshot is started (it takes a couple of seconds at most) you can keep working as usual.
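
Launching one is a single call; for example with the AWS CLI (the volume ID is a placeholder):

    # Returns as soon as the snapshot is started; copying happens in the background.
    aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
        --description "pg backup $(date +%F)"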

Consistency!

The only problem is that you need to ensure consistency, at both the filesystem level and the PostgreSQL level. Especially if you’re using several logical volumes (with LVM), with several RAID0 disks behind each logical volume.

If you just launch the EBS snapshots, it will be the same as if you disconnect the power on a physical server and then restart it (restoring the snapshots being the equivalent of the restart). You would need to trust that the filesystem journal and the PostgreSQL WAL would be able to return everything to a consistent state.

Sure, most of the time that should work. But even when it does, it takes ages for PostgreSQL to check the databases and perform the rollbacks. And if it doesn’t work… you’re in serious trouble.

XFS to the rescue

It was clear we needed some way to flush all pending operations to the filesystem, and keep it frozen while the snapshots were being launched.

XFS is one of the oldest journaling filesystems. And although support arrived late in some GNU/Linux distributions, it’s now supported by CentOS/RHEL 6 and is the default choice for CentOS/RHEL 7 (which means it can also be used with Amazon Linux).

Among other features, there’s one that is quite interesting here: you can “freeze” the filesystem. This flushes all pending write operations and prevents further changes to the volume until you “unfreeze” it.

So finally we could freeze (xfs_freeze -f <mountpoint> for each filesystem), launch the snapshots, and then unfreeze (xfs_freeze -u <mountpoint> for each one; note the -u).

If you remember, you don’t need to wait for the snapshots to be complete before using the EBS volumes again. So if you are fast launching the snapshots for all the EBS volumes you may have (60 in our case), your system will be frozen for less than 2 seconds (we use ebs-tools, which I wrote some time ago, to launch all 60 snapshots in parallel).
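
The whole cycle looks roughly like this; a minimal sketch in plain bash with the AWS CLI, where the mount points and volume IDs are placeholders (in practice we use ebs-tools for the parallel snapshot part):

    #!/bin/bash
    MOUNTS="/pgdata1 /pgdata2"             # hypothetical XFS mount points
    VOLUMES="vol-11111111 vol-22222222"    # hypothetical EBS volume IDs

    # Flush and freeze every filesystem holding PostgreSQL data.
    for m in $MOUNTS; do xfs_freeze -f "$m"; done

    # Launch all the snapshots in parallel; each call returns as soon as
    # the snapshot is started, not when it is complete.
    for v in $VOLUMES; do
        aws ec2 create-snapshot --volume-id "$v" \
            --description "pg backup $(date +%F)" &
    done
    wait

    # Unfreeze immediately; the volumes are safe to use again.
    for m in $MOUNTS; do xfs_freeze -u "$m"; done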

Caveats

I must confess that I was not able to find information on what exactly happens to the transactions that may arrive while the logical volumes are frozen, but most probably they are queued in memory. So if you have a lot of transactions per second and you freeze the volumes for too long, it’s clear that you’ll lose transactions.

I must say that has never happened in our case. As I said, we freeze for less than 2 seconds, and we usually don’t have a lot of traffic at night. If that’s not your case, you’ll need several PostgreSQL instances with replication, and take the snapshots on a slave.

The other “problem” is that you should take care the first time you take the snapshots. You can launch them as usual, but since the first snapshot makes a complete copy of all the content on the volumes, it will take some time. Just be sure you don’t launch more snapshots until the first set is complete.
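
If you want to be sure, the AWS CLI can block until the snapshots finish (the snapshot IDs below are placeholders):

    # Polls until every listed snapshot reaches the "completed" state.
    aws ec2 wait snapshot-completed --snapshot-ids snap-11111111 snap-22222222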

Is it working?

You can bet it is!

We have automated tests in place, so we can test a backup from time to time and check that we would be able to recover if we needed to. And no problems so far! 🙂

It’s as simple as launching a new EC2 instance (same virtualization type as the original, HVM or PV), detaching the volumes it has, creating EBS volumes from the snapshots, attaching the restored EBS volumes at the same attachment points, and finally starting the EC2 instance.
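
Per volume, that restore is just a couple of calls plus a final start; a sketch with hypothetical IDs, availability zone and device name:

    # Create a volume from the snapshot, in the instance's availability zone.
    aws ec2 create-volume --snapshot-id snap-11111111 \
        --availability-zone eu-west-1a --volume-type gp2

    # Attach it to the (stopped) recovery instance at the original device name.
    aws ec2 attach-volume --volume-id vol-33333333 \
        --instance-id i-0abcdef123456789 --device /dev/sdf

    # Once every volume is attached, boot the instance.
    aws ec2 start-instances --instance-ids i-0abcdef123456789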

If you are able to automate this (as we did), you can restore a backup and start working again in less than 60 minutes without human intervention. Not bad for an 80 TB backup.

This is even useful if you want to recover a single table, schema or database inside the instance. Restore a backup, and then use pg_dump as needed, without affecting the live instance your customers are working on.
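
For example, pulling a single table out of the restored copy into the live instance could be as simple as this (hostnames, database and table names are hypothetical):

    # Dump one table from the restored instance and load it into the live one.
    pg_dump -h restored-host -t myschema.mytable mydb | psql -h live-host mydb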

Cost

You may think we’re crazy. 80 TB of snapshots each day!

But just remember that EBS snapshots are incremental and compressed at S3. So your cost will depend on how much data changes between backups, and on how well Amazon is able to compress it. There are a lot of questions about this in the AWS forums (see for example https://forums.aws.amazon.com/thread.jspa?messageID=622686).
