Getting the following when CommVault attempts to reconstruct the DDB
|
Is there any way around this error so I can get backups going again? It has tried a couple of times for multiple days.
Getting the following when CommVault attempts to reconstruct the DDB
|
Is there any way around this error so I can get backups going again? It has tried a couple of times for multiple days.
Hi
You COULD seal the store, though that will result in a new baseline being created (and a lot more space used up).
What we should do first is look at the DDBRecovery.log on the Media Agent to see why the job is failing and see if we can address that instead. Likely a much easier fix overall.
Thanks Mike, I will hit you up in a PM to get my name adjusted. Appreciate that. I took a look at the logs and this is what I see for the failure last night. It is reprocessing again but I expect it to fail again with the same error. I checked and verified I can reach all the partitions:
4944 86c 11/30 03:32:26 353464 Reader [3] <Copy/Stream> Source <15/3>: Reporting SUCCESS to AuxcpyMgr, Err [0/0]. Chnk [6989717], bytes copied [4293963918]
4944 86c 11/30 03:32:26 353464 +++ AuxCopy Thread Params +++
4944 86c 11/30 03:32:26 353464 Storage Pol. [BTR_Global_Dedupe] ID [13]
4944 86c 11/30 03:32:26 353464 Src Copy [1. Primary] ID [15] Is Dedup Copy [1]
4944 86c 11/30 03:32:26 353464 Src Stream [3] MediaGrpId [856]
4944 86c 11/30 03:32:26 353464 Src DrvPoolId [1] Type [10001]
4944 86c 11/30 03:32:26 353464 +++ Src Chnk Info +++
4944 86c 11/30 03:32:26 353464 Src ChnkId [6989719], VolId [287104]
4944 86c 11/30 03:32:26 353464 CCId [2], File Mrkr No. [1929]
4944 86c 11/30 03:32:26 353464 Chnk Size Phy. [2548587826], Logical [2769008028]
4944 86c 11/30 03:32:26 353464 No. of ArchFiles [1]
4944 86c 11/30 03:32:26 353464 isSDM [0] isNASIndex [0]
4944 86c 11/30 03:32:26 353464 +++ Arch File Info +++
4944 86c 11/30 03:32:26 353464 1. AFileId [663022]
4944 86c 11/30 03:32:26 353464 Bkp JobId [349658], ClntId [0], iDATypeId [0], AppId [438]
4944 86c 11/30 03:32:26 353464 Log. Size [2769008028], Off. [4923352907951]
4944 86c 11/30 03:32:26 353464 Phy Size [2548587826], Off. [4139429171089]
4944 86c 11/30 03:32:26 353464 Chnk No. [965], Last Chnk No. [965]
4944 86c 11/30 03:32:26 353464 Enc. [0] SI [1] CV SI [1] CV SI WEB [0] Flg [23] XFlg [0]
4944 86c 11/30 03:32:26 353464 Enc. Type [0], Key Len [0]
4944 86c 11/30 03:32:26 353464 ReadStartoff - Phy. [4139429171089], Log. [4923352907951]
4944 86c 11/30 03:32:26 353464 %% Use Regular AUX COPY %%
4944 86c 11/30 03:32:26 353464 1193605-1161067 [DM_BASE ] CloseChunkSpecific: [Total Physical bytes read - 0]
4944 86c 11/30 03:32:26 353464 1193605-1161067 [MEDIAFS ] CloseSpecific: Delayed Flush Queue Size [0], Busy Queue [0], Avg Flush Wait [0.000000] Sec(s), Samples [0]
4944 86c 11/30 03:32:26 353464 [DM_CHUNK ] Got new Chunk Info. ChunkId [6989719], CommcellId [2], CopyId[15], VolumeId [287104], FileNumber [1929], NumberOfArchFiles [1]
4944 86c 11/30 03:32:26 353464 1193605-1161067 [DM_BASE ] Opening the Chunk =6989719, ArchFileId = 663022, FileMarker=1929, ArchFilePhysSizeInChunk=2548587826 VolId=287104
4944 86c 11/30 03:32:26 353464 1193605-1161067 [MEDIAFS ] ChunkMetaDataIndex File [C:\CV_Mounts\BTR_CV_DISKLIB_VG01_VOL01\CV_Backups\CV_MAGNETIC\V_287104\CHUNK_6989719\CHUNK_META_DATA_6989719.idx] Version [3]
4944 86c 11/30 03:32:26 353464 1193605-1161067 [MEDIAFS ] SingleInstancer_FS Opened [C:\CV_Mounts\BTR_CV_DISKLIB_VG01_VOL01\CV_Backups\CV_MAGNETIC\V_287104\CHUNK_6989719\CHUNK_META_DATA_6989719] file
4944 86c 11/30 03:32:26 353464 Successfully opened the archive files on the media...going to read data.
4944 86c 11/30 03:32:26 353464 Entering [CvAuxCopy::SyncDedupDB]
4944 86c 11/30 03:32:26 353464 1193605-1161067 [DM_BASE ] Initialized DDB Link Reader [C:\CV_Mounts\BTR_CV_DISKLIB_VG01_VOL01\CV_Backups\CV_MAGNETIC\V_287104\CHUNK_6989719\CHUNK_META_DATA_6989719].
4944 86c 11/30 03:32:27 353464 1193605-1161067 [DM_BASE ] DataMoverBase::GetDedupDBReconstructorLinks(): 0 bytes read from meta data file. EOF.
4944 86c 11/30 03:32:27 353464 Reader [3] <Copy/Stream> Source <15/3>: Reporting SUCCESS to AuxcpyMgr, Err [0/0]. Chnk [6989719], bytes copied [2548587826]
4944 86c 11/30 03:32:27 353464 Reader [3] <Copy/Stream> Source <15/3>: Going to wait on the queue semaphore.
4944 1e88 11/30 03:32:27 353464 Received FREE STREAM Request for readerId [3]
4944 1e88 11/30 03:32:27 353464 Reader [3] <Copy/Stream> Source <15/3>: Going to stop the auxcopy thread...
4944 86c 11/30 03:32:27 353464 Reader [3] <Copy/Stream> Source <15/3>: AUXCOPY_REQUEST_TYPE_END_STREAM found in the command queue.
4944 86c 11/30 03:32:27 353464 Going to close the archive files on the media
4944 86c 11/30 03:32:27 353464 1193605-1161067 [DM_BASE ] Going to close chunk for Aux Copy
4944 86c 11/30 03:32:27 353464 1193605-1161067 [DM_BASE ] CloseChunkSpecific: [Total Physical bytes read - 0]
4944 86c 11/30 03:32:27 353464 1193605-1161067 [MEDIAFS ] CloseSpecific: Delayed Flush Queue Size [0], Busy Queue [0], Avg Flush Wait [0.000000] Sec(s), Samples [0]
4944 86c 11/30 03:32:27 353464 Going to close the archive file copies
4944 86c 11/30 03:32:27 353464 Going to bring down the auxcopy pipeline
4944 86c 11/30 03:32:27 353464 stat- ID [SI Link Insert Times], Job Id [353464], Samples [46133093], Time [101796.424675] Sec(s), Average [0.002207] Sec/Sample
4944 86c 11/30 03:32:27 353464 stat- ID [SI Link Read Times], Job Id [353464], Samples [46133093], Time [192.472686] Sec(s), Average [0.000004] Sec/Sample
4944 86c 11/30 03:32:27 353464 stat- ID [Media Read Speed], Job Id [353464], Bytes [14024460272], Time [192.452641] Sec(s), Average Speed [69.496410] MB/Sec
4944 86c 11/30 03:32:27 353464 stat- ID [Next chunk recv times], Job Id [353464], Samples [189], Time [10.307392] Sec(s), Average [0.054536] Sec/Sample
4944 86c 11/30 03:32:27 353464 stat- ID [Media Open Times], Job Id [353464], Samples [1861], Time [384.107960] Sec(s), Average [0.206399] Sec/Sample
4944 1e88 11/30 03:32:30 353464 Reader [3] <Copy/Stream> Source <15/3>: Reporting FREE_STREAM to AuxcpyMgr, Err [0/0]. Chnk [6989719], bytes copied [0]
4944 1e88 11/30 03:32:30 353464 Going to close the media handle of the source media
4944 1e88 11/30 03:32:30 353464 1193605-1161067 [DM_BASE ] Going to close chunk for Aux Copy
4944 1e88 11/30 03:32:30 353464 1193605-1161067 [DM_BASE ] CloseChunkSpecific: [Total Physical bytes read - 0]
4944 1e88 11/30 03:32:30 353464 1193605-1161067 [MEDIAFS ] CloseSpecific: Delayed Flush Queue Size [0], Busy Queue [0], Avg Flush Wait [0.000000] Sec(s), Samples [0]
4944 1e88 11/30 03:32:30 353464 1193605-1161067 [MEDIAFS ] Free Flush Queue Size [0]
4944 1e88 11/30 03:32:30 353464 1193605-1161067 [MEDIAFS ] Data file cache queue size [1]
4944 1e88 11/30 03:32:30 353464 1193605-1161067 [DM_BASE ] Releasing the volume and drive reservation on the media manager
4944 1e0c 11/30 03:32:39 353464 1193605-# [DM_BASE ] ReadAhead: Received exit command.
4944 1e0c 11/30 03:32:39 353464 1193605-# [DM_BASE ] ReadAhead: LookAhead thread exiting.
4944 1e88 11/30 03:32:39 353464 1193605-# [DM_BASE ] Destroyed DataMoverBase for MediaGroupId=[856]
4944 1e88 11/30 03:32:39 353464 Received FREE STREAM Request for readerId [3]
4944 1e88 11/30 03:32:39 353464 Received DDB SYNC STATUS request
As a side question to this, will it hurt to temporarily pause deduplication so we get backups? We only have 10 servers still actively backed up in here. Most are now in Azure. We currently have 6 partitions each with around 28TB free on them so I am not too worried about space.
Agreed, nothing stands out here. Next log to check then is SIDBEngline.log. You should also be able to check the status of each Partition via the GUI (refresh first). I also wonder if you have a transactional DDB (which can’t be reconstructed and needs to be converted), If it is, SIDBEgnline.log will show MemDB as true.
Regarding your question about running jobs sans dedupe, you certainly can. They’ll be large, but you need backups, so definitely do what you need to do.
All 6 partitions show as ready (I attached a pic). The SIDBEngine.Log only appears to give me the last hour. I went through all the ones I had (4:21PM ET is the first I see). None of them showed the word true but perhaps I missed something. I inherited this from someone that is no longer with us so I am trying to learn as I go.
This community is a great place to be if you’re new to the product. Lots of great resources and peers willing to share!
Appreciate the log file. It’s extremely long and only covers a short time period, so it’s possible that either the error is in a previous SUDBEngline.log (_1, _2, etc.) or it’s already rolled over.
At this point, I would suggest opening an incident as I do suspect you have a Transactional DDB (I do see some MemDB references, though I’m not 100% on the application to your situation without further inspection) and we can determine more through checking the database entries and more.
Could you open a support case and share the number with me so I can track and follow up accordingly?
Hi
Thanks!
Enter your E-mail address. We'll send you an e-mail with instructions to reset your password.