
Hello Community,

 


Issue: The Oracle backup completed successfully, but the RMAN log reported the following error: "ORA-19502: write error on file '10684186_QUADMS_7024upnk_224_1_1', block number 11883521 (block size=16384)."

 

Tried:

I checked for real I/O errors with the query "SELECT * FROM V$DATABASE_BLOCK_CORRUPTION;", which returned no rows.

I also verified that the file system and storage have ample free space.

Question:

If the CV backup history shows "completed successfully", does that mean everything is in order, even though there was an RMAN write error in the middle of the backup?

I'd also like to understand how Commvault assesses integrity to ensure the Oracle backup is restorable, even when the job completes successfully despite several write errors partway through the backup.
 

 

thanks

Hi @DanC 

Could you attach the entire RMAN log so I can review it and confirm whether the backup is good?

 

In general, if an RMAN backup runs into errors, RMAN retries and protects all the datafiles, and the job is marked as completed. If it hits a hard error, the job goes into a pending state and is retried; if the next attempt runs without errors, the job is marked as completed and the backup is good for restores.

 

Additionally, double-click the job and review the Attempts tab, which shows the number of attempts (if any) before the job completed.

 

You may also use the RMAN validate restore option to simulate the restore without actually restoring the entire data.

https://documentation.commvault.com/2023e/expert/21104_validating_oracle_rac_restores.html

 

Let me know in case of any other questions.

 

Regards,

Gowri Shankar

 

 


@Gowri Shankar thanks, I will collect the RMAN log and attach it here.


@Gowri Shankar 

 

The attempts tab indicates that the execution of the RMAN script initially failed due to a write error. However, the backup was resumed after a 16-minute interval and completed successfully. It's possible that a brief network issue might have been the cause of the error.
 

Upon reviewing the RMAN log, I noted only one failure, in the initial attempt, and the backup then completed successfully after a retry. Am I correct?

Questions (attached rman log):

In the log, two separate scripts are run after the failed attempt. The first script appears to perform a full backup (incremental level = 0), encompassing the database, controlfile, and SPFILE. Upon its completion, the second RMAN script is auto-executed for the same backup job to back up the archived redo logs.


Is this sequence the expected behavior for an Oracle full backup, where the full backup captures all database-related components and the archived redo log backup then captures the changes?

The RMAN script doesn't explicitly indicate whether this is an online or offline full backup. Am I missing something, or how can I find this out?

thanks


Hi @DanC 

Your observations and findings are spot on.

 

The first attempt failed and the job went into pending.

 

RMAN log excerpt:

 

channel ch1: starting piece 1 at Aug 28 2023 12:26:28
released channel: ch1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch1 channel at 08/28/2023 12:57:24
ORA-19502: write error on file "10684186_QUADMS_7024upnk_224_1_1", block number 11883585 (block size=16384)
ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
SBT error 0 in function sbtwrite2 - sbterror did not return error message
ORA-19502: write error on file "10684186_QUADMS_7024upnk_224_1_1", block number 11883521 (block size=16384)
RMAN>

 

This could happen due to a network issue or a problem writing data to the backup media. The job went into pending, then the automatic retry resumed it and it completed without any issues. The backup is valid and good for restores.

 

channel ch1: backup set complete, elapsed time: 00:05:45
channel ch1: starting incremental level 0 datafile backup set
channel ch1: specifying datafile(s) in backup set
input datafile file number=00094 name=/quadms/data/tem_ts1_02.dbf
channel ch1: starting piece 1 at Aug 28 2023 15:31:47
channel ch1: finished piece 1 at Aug 28 2023 15:33:02
piece handle=10684186_QUADMS_7b24v4j3_235_1_1 tag=TAG20230828T131349 comment=API Version 2.0,MMS Version 11.0.0.80
channel ch1: backup set complete, elapsed time: 00:01:15
Finished backup at Aug 28 2023 15:33:02
Starting Control File and SPFILE Autobackup at Aug 28 2023 15:33:02
piece handle=c-3305395391-20230828-01 comment=API Version 2.0,MMS Version 11.0.0.80
Finished Control File and SPFILE Autobackup at Aug 28 2023 15:33:10
released channel: ch1
RMAN>
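If you want to repeat this check on other jobs, the review above can be roughly scripted. The following is only a minimal sketch in Python, not anything Commvault ships: `summarize_rman_log` is a hypothetical helper, and it assumes the log looks like the excerpts in this thread (message lines starting with `ORA-` or `RMAN-`, and a `Finished backup at` line once the data backup completes). It only reads the log text, so it is not a substitute for a validate restore.

```python
import re

# Matches message lines such as "ORA-19502: write error ..." but skips the
# "RMAN-00571: =====" banner lines that frame the error stack.
CODE_LINE = re.compile(r"^(ORA|RMAN)-(\d{5}): (?!=)")


def summarize_rman_log(text):
    """Return (error_codes, finished_after_last_error) for an RMAN log string.

    Assumption: a "Finished backup at" line that appears after the last
    error line means a later attempt ran to completion.
    """
    codes = []
    last_error = -1
    last_finish = -1
    for i, line in enumerate(text.splitlines()):
        stripped = line.strip()
        match = CODE_LINE.match(stripped)
        if match:
            codes.append(f"{match.group(1)}-{match.group(2)}")
            last_error = i
        elif stripped.startswith("Finished backup at "):
            last_finish = i
    return codes, last_finish > last_error
```

Reading the saved RMAN log into a string and passing it to `summarize_rman_log` gives the error codes in order, plus a flag that is true only when a completion line follows the last error.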

 

 

Yes, it is expected behavior to run data and logs in two different RMAN run blocks. 

 

The job manager log indicates the job type, and the following RMAN log excerpt also identifies it:

 

allocate channel ch1 type 'sbt_tape'
PARMS="SBT_LIBRARY=/opt/commvault/Base/libobk.so, BLKSIZE=1048576 ENV=(CV_mmsApiVsn=2,ThreadCommandLine= -cn torbmenvldbo31 -vm Instance001)"
TRACE 0;
send "BACKUP -jm 32813 -a 2:6780 -cl 10098 -ins 1476 -at 22 -j 10684186 -jt 10684186:4:1:0:0:21444 -bal 0 -t 1 -ms 1 -data  -useCvNwSrv";
setlimit channel ch1 maxopenfiles 8;
backup
incremental level = 0
filesperset = 8
format='10684186_%d_%U'
database
include current controlfile  ;
}

 

Incremental level 0 indicates that it is a full database backup.
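To pull the level out of a saved script or log without reading it by eye, something like the sketch below may help. `incremental_levels` is a hypothetical helper name, and the pattern is an assumption based on the two spellings seen in this thread (`incremental level = 0` in the script and `incremental level 0` in the channel output).

```python
import re

# Accepts both "incremental level = 0" (script) and
# "incremental level 0" (channel output); the "=" is optional.
LEVEL = re.compile(r"incremental\s+level\s*=?\s*(\d+)", re.IGNORECASE)


def incremental_levels(text):
    """Return every incremental level mentioned in the text, in order."""
    return [int(level) for level in LEVEL.findall(text)]


def is_level0_backup(text):
    """True when at least one level is mentioned and all of them are 0."""
    levels = incremental_levels(text)
    return bool(levels) and all(level == 0 for level in levels)
```

If the text contains no incremental clause at all, `incremental_levels` returns an empty list and `is_level0_backup` returns False, so a missing clause is not mistaken for a level 0 backup.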

 

 

Rman Log:>
Recovery Manager: Release 19.0.0.0.0 - Production on Mon Aug 28 09:54:10 2023
Version 19.19.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
RMAN>
connected to target database: QUADMS (DBID=3305395391)
using target database control file instead of recovery catalog
RMAN> 2> 3> 4> 5> 6> 7> 8> 9> 10> 11> 12> 13>
allocated channel: ch1

 

The "connected to target database" line above is where to look. If it is an offline backup, that line also shows the database mode, for example Mounted.

In this case the line shows a normal database connection with no such mode, which indicates it is an online backup.
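The same check can be sketched in Python if you need to classify many logs. Treat this as an assumption-laden example: `connection_state` is a hypothetical helper, and it assumes that when the database is not open RMAN appends the mode inside the parentheses after the DBID (for example `, not open`), while an open database shows only the DBID as in the log above. Verify the exact wording against one of your own offline backup logs before relying on it.

```python
import re

# Captures the database name, the DBID, and an optional mode that follows
# the DBID inside the parentheses (assumed format, e.g. ", not open").
CONNECT = re.compile(
    r"connected to target database: (\S+) \(DBID=(\d+)(?:, ([^)]+))?\)"
)


def connection_state(log_text):
    """Return name, dbid and state from the first connection line, or None."""
    match = CONNECT.search(log_text)
    if match is None:
        return None
    name, dbid, state = match.groups()
    # No mode after the DBID is treated as an open database (online backup).
    return {"name": name, "dbid": int(dbid), "state": state or "open"}
```

A result with state "open" corresponds to the online case described above; any other state suggests the database was not open when RMAN connected.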

 

Let me know in case of any other questions.

 

Regards,

Gowri Shankar

 


thank you very much @Gowri Shankar 

 


Hi Gowri, 

I am facing the same issue while running backups for my Data Guard instance. Multiple backups have run and all of them show the same issue. I checked the network and there are no dropped packets. Check Readiness is working as expected. The server was rebooted, but the issue persists.

