Question

PostgreSQL block level restore fails

  • 11 June 2021
  • 8 replies
  • 55 views

Badge +3

Hello,

Commvault version: v11.20.46

Source PostgreSQL OS: RHEL 7.7

Destination PostgreSQL OS: RHEL 7.7

Source PostgreSQL version: 12.4

Destination PostgreSQL version: 12.6

We are trying to out-of-place PostgreSQL individual db restore with our customer but we takes error;

‌Error Code: [94:19]
Description: PostgreSQL Database: [~ Command failed pg_ctl: could not start server Examine the log output. ~] Restore Failed with PostgreSQL Error: [~Starting of the aux server failed:: ~].
Source: <destination_server>, Process: POSTGRESBLKRESTORE

POSTGRESBLKRESTORE output;

28213 6e35 06/10 13:50:47 325986 CvProcess::system() - chown -R postgres  /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170; su - postgres -c "chmod -R 700  /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170"; su - postgres -c "/usr/pgsql-12/bin/pg_ctl start -w -t 43200  -D /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170 -l /opt/commvault/iDataAgent/jobResults/CV_JobResults/2/0/325986/pg_logfile_325986 -o --port=50792"
28213 6e35 06/10 13:50:56 325986 CvProcess::system() - Command completed with rc=1
28213 6e35 06/10 13:50:56 325986 PostgresBlockRestore::CreateNewPostGresServer() - Command failed [chown -R postgres  /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170; su - postgres -c "chmod -R 700  /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170"; su - postgres -c "/usr/pgsql-12/bin/pg_ctl start -w -t 43200  -D /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170 -l /opt/commvault/iDataAgent/jobResults/CV_JobResults/2/0/325986/pg_logfile_325986 -o --port=50792"]
28213 6e35 06/10 13:50:56 325986 PostgresBlockRestore::CreateNewPostGresServer() - system out = waiting for server to start........... stopped waiting
 err = pg_ctl: could not start server
Examine the log output.
 ret = 1
28213 6e35 06/10 13:50:56 325986 PostgresBlockRestore::CreateNewPostGresServer() - Command failed [chown -R postgres  /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170; su - postgres -c "chmod -R 700  /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170"; su - postgres -c "/usr/pgsql-12/bin/pg_ctl start -w -t 43200  -D /opt/commvault/MediaAgent/SnapVolumeMounts/SnapMnt_1_2_632602_18_18170 -l /opt/commvault/iDataAgent/jobResults/CV_JobResults/2/0/325986/pg_logfile_325986 -o --port=50792"]
28213 6e35 06/10 13:50:56 325986 PostgresBlockRestore::CreateNewPostGresServer() - system out = waiting for server to start........... stopped waiting
 err = pg_ctl: could not start server
Examine the log output.
 ret = 1

/opt/commvault/iDataAgent/jobResults/CV_JobResults/2/0/325986/pg_logfile_325986 output;

user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:49.552 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  starting PostgreSQL 12.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:49.553 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  listening on IPv4 address "0.0.0.0", port 50792
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:49.553 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  listening on IPv6 address "::", port 50792
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:49.553 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.50792"
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:49.588 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  listening on Unix socket "/tmp/.s.PGSQL.50792"
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:52.103 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  redirecting log output to logging collector process
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:52.103 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >HINT:  Future log output will appear in directory "/Log/log".

‌/Log/log/postgresql-10.log output;

user=,db=,app=,client=,user=,process=31695,time=2021-06-10 13:50:52.108 +03, commandtag=, statement=00000, session=60c1ee8c.7bcf, sessionstart=2021-06-10 13:50:52 +03, transactionid=0  >LOG:  database system was interrupted; last known up at 2021-06-10 11:37:14 +03
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000037.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000038.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000039.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003A.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003B.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003C.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003D.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003E.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003F.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000040.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000041.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000042.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000043.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000044.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000045.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000046.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000047.history’: No such file or directory
user=,db=,app=,client=,user=,process=31695,time=2021-06-10 13:50:56.368 +03, commandtag=, statement=00000, session=60c1ee8c.7bcf, sessionstart=2021-06-10 13:50:52 +03, transactionid=0  >LOG:  starting archive recovery
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000046.history’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000460000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000450000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000440000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000430000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000420000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000410000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000400000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003F0000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003E0000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003D0000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003C0000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003B0000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/0000003A0000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000390000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000380000018D0000006A’: No such file or directory
cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/000000370000018D0000006A’: No such file or directory
user=,db=,app=,client=,user=,process=31695,time=2021-06-10 13:50:56.578 +03, commandtag=, statement=00000, session=60c1ee8c.7bcf, sessionstart=2021-06-10 13:50:52 +03, transactionid=0  >LOG:  restored log file "000000360000018D0000006A" from archive
user=,db=,app=,client=,user=,process=31695,time=2021-06-10 13:50:56.616 +03, commandtag=, statement=XX000, session=60c1ee8c.7bcf, sessionstart=2021-06-10 13:50:52 +03, transactionid=0  >FATAL:  requested timeline 70 is not a child of this server's history
user=,db=,app=,client=,user=,process=31695,time=2021-06-10 13:50:56.616 +03, commandtag=, statement=XX000, session=60c1ee8c.7bcf, sessionstart=2021-06-10 13:50:52 +03, transactionid=0  >DETAIL:  Latest checkpoint is at 18D/6A000110 on timeline 54, but in the history of the requested timeline, the server forked off from that timeline at 183/DF000000.
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:56.619 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  startup process (PID 31695) exited with exit code 1
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:56.619 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  aborting startup due to startup process failure
user=,db=,app=,client=,user=,process=31657,time=2021-06-10 13:50:56.647 +03, commandtag=, statement=00000, session=60c1ee89.7ba9, sessionstart=2021-06-10 13:50:49 +03, transactionid=0  >LOG:  database system is shut down

They was using script for delete old logs. Script is;

#!/bin/bash                                                                                                                                                                                                                                 
find /dbbackups/PGBackups/$HOSTNAME/xlogarchive/  -mtime +5 -exec mv {} /dbbackups/PGBackups/$HOSTNAME/silinecekler/xlogarchive/ \;                                                                                                         
rm -rf /dbbackups/PGBackups/$HOSTNAME/silinecekler/xlogarchive/*

 

I think this script deleted the history files so we are taking error now. I reinstalled Commvault agent from source and destination PostgreSQL servers and I took full backups and log backups. Why am I still seeing this error? How do I solve this, do we must put history file there?

I would be appreciated for your helps.

BR,

İbrahim


8 replies

Userlevel 2
Badge +3

Hi Ibrahim,

 

Looks like it is failing as it doesn't find any of the required logs during log recovery. 

cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000037.history’: No such file or directory

 

Can you please rerun the restore and check if /Restores/commvault/staging//<JOB ID>//WAL path has transaction logs restored? It will be cleaned up after the restore fails, so it needs to be checked while restore job is in progress

 

Please share below details also 
1. Data directory path and mount point. What is the data directory location, is it the mount point itself? 

2. Is the data directory or archive log directory on a symlink path? 

 

Regards,

Meera

Badge +3

Hi Ibrahim,

 

Looks like it is failing as it doesn't find any of the required logs during log recovery. 

cp: cannot stat ‘/Restores/commvault/staging//325986//WAL/00000037.history’: No such file or directory

 

Can you please rerun the restore and check if /Restores/commvault/staging//<JOB ID>//WAL path has transaction logs restored? It will be cleaned up after the restore fails, so it needs to be checked while restore job is in progress

 

Please share below details also 
1. Data directory path and mount point. What is the data directory location, is it the mount point itself? 

2. Is the data directory or archive log directory on a symlink path? 

 

Regards,

Meera

Hi Meera,

Thank you for reply. I hope you are well.

I checked it before. I am seeing log files in staging directory while restore job is in progress but I don’t see *.history files in there.

  1. Data directory location on another drive and fstype is xfs.
  2. No symlink.

By the way i did this restore my first try. I restored 7 or 8 databases successfully. But i took full backup from source server. After that full backup i couldn’t restore any databases again. Every time same issue. As I said before I think script (which they using for delete old logs) deleted history files and I took full backup after that. They put this script in post process box in subclient properties. It was too late when i saw it :)

Also we are using nDoNotBackupAllLogs additional settings on source server.

Regards,

İbrahim

Userlevel 2
Badge +3

Hi Ibrahim,

I could not recreate the same issue in my setup just by deleting the history files.

  1. What is this path - /dbbackups/PGBackups/$HOSTNAME/xlogarchive/ , isn’t it the archive log directory?
  2. When you ran new full backup, did you have the script? If yes, can you please remove the script and try a backup, restore?

Please recommend them to use archive delete option that commvault provides.

https://documentation.commvault.com/commvault/v11/article?p=21520.htm

We delete the logs that are backed up and removing the nDoNotBackupAllLogs key will avoid having any old file in the archive log location. 

 

Regards,

Meera

Badge +3

Hi Ibrahim,

I could not recreate the same issue in my setup just by deleting the history files.

  1. What is this path - /dbbackups/PGBackups/$HOSTNAME/xlogarchive/ , isn’t it the archive log directory?
  2. When you ran new full backup, did you have the script? If yes, can you please remove the script and try a backup, restore?

Please recommend them to use archive delete option that commvault provides.

https://documentation.commvault.com/commvault/v11/article?p=21520.htm

We delete the logs that are backed up and removing the nDoNotBackupAllLogs key will avoid having any old file in the archive log location. 

 

Regards,

Meera

Hi Meera,

This script deletes everything in /dbbackups/PGBackups/$HOSTNAME/xlogarchive/ directory older than 5 days. Not only history files.

  1. Yes it is archive log directory.
  2. No, i removed script from subclient properties.

I will talk them to enable archive delete option and take full backup. If they accept it, I will try to restore with new full backup.

I will inform you. Thank you.

Regards,

İbrahim

Userlevel 2
Badge +3

Hi @ibrahimemrekaya ,

 

If you had run new full after removing the script then restore should have completed. Do you see that log backup that runs after block level snap backup completed fine? If yes, please create a case with support.

 

Regards,

Meera

Userlevel 7
Badge +17

@ibrahimemrekaya , whenyou create an incident, please share the case number with me!

Badge +3

Hi,

I opened a case. Case number is: 210614-162.

Regards,

İbrahim

Userlevel 7
Badge +17

Thanks, @ibrahimemrekaya !  I’ll keep an eye on it :nerd:

Reply