Solved

OK to kill long running DDBBackup job?

September 1, 2022

Ken_H
Challenger

I’m having problems with DDBBackup jobs at my DR site. I changed the configuration from every 6 hours to once per day but the backup from yesterday is still running and I’m at the 22 hour point. I’d like to kill this job and let a fresh one start but I understand that the DDBBackup uses snapshots and I’m afraid that if I kill it there won’t be a proper snapshot cleanup.

Is it OK to kill a DDBBackup job that’s been running this long?

Ken

Best answer by Ken_H

When I checked the Dedupe DB rebuild job at 7:30 AM this morning, it was showing 68% complete after running for 20.5 hours. When I checked at 9:00 AM, the job had completed and both backups and Aux copy jobs had resumed. I’m guessing it will be 10 to 12 hours for the queued jobs to fully recover.

In the end, the long running DDB Backup job was really a symptom of a failure with the media agent hardware. This was tricky as not all the drive letters were impacted equally by the problem controller so it appeared other jobs were running OK… or at least well enough to not raise an error.

Thanks to everyone for their feedback on this topic.

The problem is resolved.

Ken

Onno van den Berg
Community All Star
September 1, 2022

I personally have not seen any issues doing this in the past, so I expect Commvault to handle it properly. If you do run into an issue, then it's quite easy to get rid of the snapshot yourself. Of course, it would be interesting to learn why it is "hanging" all of a sudden after only a change to the backup frequency.

Laurent
Challenger
September 2, 2022

I used to have such issues, mostly because on my DR site multiple DASH aux copies were running at DDB backup time, resulting in high DDB usage while, in parallel, the volume hosting the DDB was being backed up/snapped.

You can look at the DDBbackup job to try to troubleshoot and check what’s happening.

The tools depend on the OS of your MA(s): on Windows you can use vssadmin commands and Resource Monitor to see what’s going on in terms of I/O on the volume hosting your DDB.

What you should avoid, especially if you have an MA hosting multiple DDBs of different sizes, is rebooting or resetting your MA during the DDB backup job.
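For anyone who wants to script the vssadmin check mentioned above, here is a minimal sketch in Python. The helper names (`vss_inspect_commands`, `run_all`) and the drive letter `E:` are made up for illustration; only the `vssadmin list shadows` and `vssadmin list shadowstorage` subcommands are real, and both are read-only. It assumes a Windows MA and an elevated prompt.

```python
import subprocess

# Hypothetical helper names; only the vssadmin subcommands themselves are real.
def vss_inspect_commands(volume=None):
    """Build read-only vssadmin commands that list shadow copies and their storage."""
    shadows = ["vssadmin", "list", "shadows"]
    storage = ["vssadmin", "list", "shadowstorage"]
    if volume:
        # Restrict both listings to one volume, e.g. the one hosting the DDB.
        shadows.append(f"/for={volume}")
        storage.append(f"/for={volume}")
    return [shadows, storage]

def run_all(commands):
    """Run each command and return its text output (Windows only)."""
    return [
        subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
        for cmd in commands
    ]

# Example on the MA itself ("E:" is an assumed drive letter):
#   for output in run_all(vss_inspect_commands("E:")):
#       print(output)
```

If a shadow copy for the DDB volume is still listed long after the DDB backup job was killed, that is a hint the snapshot was not cleaned up.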

Ken_H
Author
Challenger
September 6, 2022

I ended up killing the DDB Backup job, and I needed to use a Force Kill to do it. I stopped all the aux copies and restarted the CommVault services, then resumed the copies. I never checked the status of the snapshot so I assume it had completed. A subsequent DDB Backup completed successfully, although it took 115 hours (4.8 days). I have a ticket open with CV support to look into why these backups are so problematic.

Mike Struening
Vaulter
September 6, 2022

Glad that part worked out. It's worth sharing the information you get regarding the backups; that way we have a nice, holistic thread.

https://www.linkedin.com/in/michael-struening

Ken_H
Author
Challenger
September 13, 2022

It appears as if one of the controllers for the HPE MSA disks has failed. Today I was getting a disk response time of 680,692ms and a disk queue length of over 600. My sysadmin has forced communication through the other controller and my response time is now less than 40ms and the queue length is less than 1.

Unfortunately the Dedup DB Reconstruction job is failing so my backups still aren’t running. I have ticket 220912-556 open to get some help with this.
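The thread does not say which tool produced the response time and queue length figures above. One common way to get comparable numbers on a Windows MA is the built-in `typeperf` utility with the PhysicalDisk performance counters; the sketch below only builds that command line. The function names are hypothetical, and using `typeperf` here is an assumption rather than what was actually done.

```python
# Hypothetical helpers; the counter paths and the typeperf -si/-sc switches are real.
def typeperf_command(samples=5, interval_s=1):
    """Build a typeperf command that samples disk latency and queue length."""
    counters = [
        r"\PhysicalDisk(*)\Avg. Disk sec/Transfer",      # response time, in seconds
        r"\PhysicalDisk(*)\Current Disk Queue Length",   # outstanding I/O requests
    ]
    return ["typeperf", *counters, "-si", str(interval_s), "-sc", str(samples)]

def seconds_to_ms(seconds):
    """Convert the counter's seconds-per-transfer value to milliseconds."""
    return seconds * 1000.0

# Example: print the command to paste into a prompt on the MA.
print(" ".join(f'"{part}"' if " " in part else part for part in typeperf_command()))
```

`Avg. Disk sec/Transfer` is reported in seconds, so a reading of 0.04 corresponds to the 40 ms mentioned above.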

Mike Struening
Vaulter
September 13, 2022

Thanks for the update, @Ken_H. I’ll keep an eye on it!

Ken_H
Author
Challenger
September 13, 2022

I’m doing the DDB reconstruction using the “Reconstruct entire DDB without using a previous recovery backup” option. After 3.1 hours it’s processed 56.4TB out of 841.1TB total. Based on these numbers, I’m estimating 42.5 hours remaining.
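The estimate above can be reproduced with a simple linear extrapolation. This is a minimal sketch that assumes the processing rate stays constant for the rest of the job, which a reconstruction job may not actually do; the function name is made up for illustration.

```python
def eta_hours(elapsed_h, done_tb, total_tb):
    """Estimate remaining hours, assuming the observed rate holds constant."""
    rate_tb_per_h = done_tb / elapsed_h          # observed throughput so far
    return (total_tb - done_tb) / rate_tb_per_h  # remaining work / throughput

# Figures from the post: 56.4 TB of 841.1 TB processed after 3.1 hours.
remaining = eta_hours(3.1, 56.4, 841.1)
print(f"{remaining:.1f} hours remaining")
```

With the figures from the post this lands at roughly 43 hours, close to the 42.5 hours quoted; the small gap is probably down to rounding or to when the numbers were read off the job.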

Mike Struening
Vaulter
September 13, 2022

I had to check what day of the week today is, first 🤣

Hopefully on Thursday you have good news!

Ken_H
Author
Challenger
Answer
September 14, 2022

Thanks to everyone for their feedback on this topic.

The problem is resolved.

Ken

Mike Struening
Vaulter
September 14, 2022

Glad to hear it, and happy you shaved a day off the ETA 🤣

RyanOJD
Novice
September 14, 2022

Unfortunately the Dedup DB Reconstruction job is failing so my backups still aren’t running. I have ticket 220912-556 open to get some help with this.

@Ken_H How did you find the disk response time on your HPE MSA disk?
