Solved

Postgres FS based - Best practice for log backups


Userlevel 1
Badge +5

Good morning,

 

We are new to postgres databases and are implementing our first production cluster soon.  What are you all using for your log backups in terms of scheduling?  We are running in continuous archive mode and currently have backups every 15 minutes.  Even 2G  backups are taking under a minute. 

What are the considerations for setting this? 

If our RPO is 5 minutes, would it make sense to backup every 5 minutes, or is that trying to fit Commvault into a high availability box that it is not designed for?

If using FSBasedBackups, are you also using regularly scheduled dump based backups?  In what scenario?

 

Thank you very much, this forum has been fantastic!

-Chris

icon

Best answer by Alex Deaconu 12 June 2024, 16:49

View original

2 replies

Userlevel 2
Badge +4

My personal recommendation is to run Log Backups every 15 minutes. You can lower the RPO to 5 minutes, but try and do it sparingly to only the most demanding PostgreSQL Databases, and make sure that the backups actually finish in under 5 minutes. That is because there is a little overhead when launching a backup job, and if you have a lot of databases, that will mean your CommServe will need to be really up to the task to keep up with many requests and modifying the MS SQL DB constantly. The more you can tune your CommServe DB, the more operations it can handle.

 

Depending on your environment, that might take a bit longer to launch and complete a job, even a TLog Backup, because of factors like CPU and Memory of both your client and your CommServe, networking latency between client and Media Agent, disk IOPS on both the client, the CommServe’s DB volume, and the Media Agent’s backup library, and more.

 

For PostgreSQL Clusters, you will need to modify the archive_command, and if you are already using that parameter to ship the logs to standby nodes, then you will need to modify it to also accommodate Commvault’s Log Backup requirements.

https://documentation.commvault.com/v11/expert/configuring_postgresql_archive_log_directory.html

 

It can be as easy as this.

archive_command = ‘cp %p /data/pg15/instance_5432_wal/%f && other-utility %p do-whatever-to-send-logs standby-note:/path/to/wal-read-location/%f’

 

FS Based Backups are different backup strategy than Dump Based Backups, and they work completely independently. You can run both. For large databases, it is better to run Dumps sparingly, as these take a big toll on the database.

 

With not that large databases, you can have a Daily Full FS Based Backup and Dump Based Backup, but try to not run them at the same time/schedule, and Incremental backups, which in PostgreSQL only means WAL Backups, as infrequent as 12 hours, or as low as 5 minutes, but my recommendation is generally every 15 minutes for large databases, and either one or four hours for the less utilized DB.

 

FS Based Backups just backup the PG Data Directory, using pg_start_backup() and pg_stop_backup(), then back up the WAL Files. When restoring, this allows you to Restore to a Point in Time, and can only be performed offline, when the DB is down.

 

Dump Based Backups use pg_dump on each database. Please make sure to enable Indexing V2, if it is not already enabled, and enable Global Objects Backups. These give you the ability to restore out of place when you have no clue what users and roles were present on the source DB, that means, in disaster recovery scenarios.

 

Dump Based Backups allow you to restore individual databases and tablespaces online, but they can only be restored to the time of the dump, meaning you will have a pretty low RPO on that, because you can only take dumps so often, if even once a day.

 

Some customers are completely unable to run Dump Based Backups, and some opt to not do that, because Dumps do not benefit much, if at all, from Deduplication in the Backup Library. Their solution to this ‘problem’ is to quickly stage a PostgreSQL Instance anywhere else, do a Point in Time Restore of the FS Based Backup Set, and quickly manually dump whatever Database or Tablespace that needs to be restored to that point in time out of the temporary staged DB, and then import it into the Production Database.

 

This can also be done with Commvault, assuming you go the extra mile of also installing Commvault on the Temporary Staged PostgreSQL Server. You can restore the PG Data and WAL files to disk and copy them to another host, but since Commvault makes it easy to do the restore, it is easier to just spin up a Commvault PostgreSQL Agent on the Temporary Server, run the restore through Commvault, and then make a new Dump Based Backup Set Subclient that only backs up the DB that you need, point it to a Storage Policy or Plan with very low retention, like One or Three Days Retention, take the dump, and then restore Out of Place from the Temporary Server to the Production Server.

 

Let me know if you have any questions.

 

 

Kind regards,

Alex

Userlevel 1
Badge +5

Thank you Alex, you put together a great ELI5 for me and I appreciate it.  We’ll be sticking with 15 minutes and FS backups with dumps primarily being on demand as opposed to regularly scheduled.

 

Thank you!

Reply