Solved

Disaster Recovery Backups have started failing

  • 24 September 2022
  • 15 replies
  • 197 views

Userlevel 4
Badge +13

My administrative Job Summary Report has started to show:

ERROR CODE [34:53]: CommServeDR: Destination Directory [\\<server_216>\D$\DR_Dump_Prod] does not exist or is inaccessible
Source: <server_57>, Process: commserveDR

When I check <server_216>, I see there’s lots of disk space and that the D:\DR_Dump_Prod folder exists and has several folders named SET_99999 with the most recent folder being 2 days old.  

Does anyone know how to check what’s gone wrong?

Ken

 


Best answer by Ken_H 13 October 2022, 19:39


Userlevel 6
Badge +16

Hi @Ken_H 

I would first manually open a connection to the UNC path from <server_57> and authenticate with the user configured in the DR backup location config.

If that works, re-enter credentials in the DR backup location config and test the DR backup.

If the manual test doesn't work, check whether the firewall on <server_216> is active and allows incoming SMB traffic. Then check whether the user configured in the DR backup location config has rights on the share.
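
The manual check can also be scripted. Below is a minimal sketch in Python, assuming it is run on the CommServe host under the same account the DR backup uses. The function name `check_dr_destination` and the probe file name are hypothetical, and the script only tests plain filesystem access, not how Commvault itself authenticates.

```python
import os
import uuid


def check_dr_destination(path):
    """Return (ok, message) after trying to create and delete a probe file in path."""
    if not os.path.isdir(path):
        return False, f"{path} does not exist or is inaccessible"
    # Hypothetical probe file name; a random suffix avoids clobbering real files.
    probe = os.path.join(path, f"dr_probe_{uuid.uuid4().hex}.tmp")
    try:
        with open(probe, "w") as handle:
            handle.write("probe")
        os.remove(probe)
    except OSError as exc:
        return False, f"{path} is not writable: {exc}"
    return True, f"{path} is reachable and writable"
```

Call it with the destination from the error message as a raw string, for example `check_dr_destination(r"\\<server_216>\D$\DR_Dump_Prod")` with the real server name filled in.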

Userlevel 4
Badge +13

These DR backups have been running longer than I’ve been working with CommVault and I’m not sure where they are configured.  The CommVault online documentation talks about using the Command Center to make these changes (as shown here) but when I try to follow those directions I get “Something went wrong. Server may be under maintenance. Trace ID: cvoDRcf1NY”.

Can you tell me where to find this configuration in the Java GUI?

Ken

 

Userlevel 6
Badge +16

In the ribbon at the top of the console you can open the Control Panel; there you will find a DR Backup button.

Userlevel 4
Badge +13

Thanks for the help @Jos Meijer.

I have a CommServe Hardening PDF from CommVault about hardening the security to keep backups safe in the event of a cyber attack.  One of the recommendations is:

  • Disable NETBIOS on the CommServe host. To do this, open the Network and Sharing Center, select Change Adapter Settings, right-click the network connection, and select Properties. Select Internet Protocol Version 4 (TCP/IPv4) and click the Advanced button in the displayed dialog. Select the WINS tab and select Disable NetBIOS over TCP/IP.

Oops.  Turns out disabling NetBIOS breaks the DR Backup.  I’ve enabled NetBIOS, rebooted, and the DR backups are back to normal.

Ken

Userlevel 6
Badge +16

Good to hear you have found the cause👍
Not so good to hear that this limits hardening 😐

Userlevel 4
Badge +11

Hi @Ken_H 

Just to let you know, I also do CommServe hardening in multiple deployments. I disable NetBIOS on the CommServe host as well and don’t have any issues with DR backups to a network share. Maybe it’s something environmental in your case, but I definitely wouldn’t blame disabling NetBIOS alone.

Userlevel 4
Badge +13

Disabling NetBIOS broke DR backups and enabling it allowed backups to resume.  Any suggestions on how to get around this problem would be gratefully received.

Ken

Userlevel 7
Badge +23

@Ken_H , I unchecked the Best Answer.  Unless @Jos Meijer has some more ideas, it might be best to open a support case.

Userlevel 6
Badge +16

Since disabling NetBIOS breaks the UNC access, you are most likely using a short name / single-label name ( \\servername\D$ ) for the target server and not an FQDN address ( \\servername.dns.suffix\D$ ), is that correct?

If so, do you have a DNS server configured?

If so please try an FQDN address.

Let us know the outcome 🙂
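
To make the short name versus FQDN distinction concrete, here is a small Python sketch. The helper names `unc_host` and `uses_short_name` are made up for illustration; the check is purely textual and treats any host containing a dot (including an IP address) as not being a single-label name.

```python
def unc_host(unc_path):
    """Return the host component of a UNC path, or raise ValueError."""
    if not unc_path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {unc_path!r}")
    host = unc_path[2:].split("\\")[0]
    if not host:
        raise ValueError(f"missing host in: {unc_path!r}")
    return host


def uses_short_name(unc_path):
    """True when the host is a single-label name (no dots), so not an FQDN."""
    return "." not in unc_host(unc_path)
```

For example, `uses_short_name(r"\\servername\D$")` is true, while `uses_short_name(r"\\servername.dns.suffix\D$")` is false.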

Userlevel 4
Badge +11

Also, as @Jos Meijer already mentioned, check communication. Try to telnet to <server_216> from <server_57> on ports 139 and 445 and see whether these ports can be opened.
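
If telnet is not installed, the same port test can be sketched in Python. This assumes it is run on the CommServe host; the function names are illustrative only. Port 139 is the NetBIOS session service and port 445 is SMB directly over TCP, so with NetBIOS disabled only 445 is expected to matter.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_smb_ports(host, ports=(139, 445)):
    """Map each port number to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in ports}
```

Calling `check_smb_ports` with the DR server's name returns a dictionary keyed by port number, which shows at a glance whether 445 alone is reachable.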

Userlevel 4
Badge +13

Sigh.  It turns out that CommVault will not just let you change the server name to the fully qualified domain name and gives the message:

Failed to set [Backup Metadata Folder] for [DR Backup] with error [Failed to save Disaster Recovery Destination in CommServe database because folder [\\<server_216.domain.suffix>\D$\DR_Dump_Prod] is already being used for Disaster Recovery.]

So I changed the configuration from:

   \\<server_216>\D$\DR_Dump_Prod

to 

   \\<server_216.domain.suffix>\D$\DR_Dump_PRD

Just because I’m paranoid, I’m going to let this run for the weekend before attempting to disable NetBIOS again.

Userlevel 6
Badge +16

My apologies, I should have foreseen this.
As the folder is already in use and the location has been configured before, the database will prevent such a change.

Assigning a different location, as you did, is the only way.
Keep in mind that even after changing the DR location, aging of the old sets (on the UNC path with the short name) will still be performed according to the retention settings. Disabling NetBIOS will therefore result in periodic errors saying that pruning on the old data path cannot be performed, because the path will become inaccessible as experienced before.
This will eventually stop once removal of every set on the old location has been attempted and has failed. After that, aging/pruning continues only on the new path.
You could lower the number of retained DR sets before disabling NetBIOS to prune the sets on the old location and “force” an earlier transition this way. If you choose this path, just monitor the old path; once it is fully emptied, disable NetBIOS and set the number of DR sets back to the original setting.
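
Monitoring the old path can be as simple as listing the set folders that are still there. A minimal Python sketch, assuming the DR sets are subfolders whose names start with `SET_` as described in the first post; the function name `remaining_dr_sets` is hypothetical.

```python
import os


def remaining_dr_sets(path, prefix="SET_"):
    """Return the sorted names of subfolders in path that start with prefix."""
    with os.scandir(path) as entries:
        return sorted(
            entry.name
            for entry in entries
            if entry.is_dir() and entry.name.startswith(prefix)
        )
```

An empty list for the old location would indicate that pruning has removed all of the old sets.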

Userlevel 4
Badge +13

Update:  I updated the DR backup to use a fully qualified domain name last Friday and left it to confirm that DR backups run normally.  After days of uneventful operation, yesterday (Wednesday) I disabled NetBIOS on the CommServe host and DR backups immediately began to fail.  I’ll enable NetBIOS now while investigating why this is so problematic.

Userlevel 4
Badge +13

Update:  Had a call with the local CommVault technical resource who found documentation that says:

  • Do not use administrative shares as an export location. For example, \\MyServer\E$\.
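
A configured path can be checked for this textually. The sketch below is an illustration in Python, not anything from Commvault: the helper names are made up, and it only recognises the standard Windows administrative share names (a drive letter followed by `$`, plus `ADMIN$` and `IPC$`).

```python
import re

# Drive-letter shares such as E$, plus the built-in ADMIN$ and IPC$ shares.
_ADMIN_SHARE = re.compile(r"^(?:[A-Za-z]|ADMIN|IPC)\$$", re.IGNORECASE)


def unc_share(unc_path):
    """Return the share component of a UNC path, or raise ValueError."""
    if not unc_path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {unc_path!r}")
    parts = unc_path[2:].split("\\")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"missing host or share in: {unc_path!r}")
    return parts[1]


def is_admin_share(unc_path):
    """True when the share part of the path is a Windows administrative share."""
    return bool(_ADMIN_SHARE.match(unc_share(unc_path)))
```

With this, `is_admin_share(r"\\MyServer\E$\DR")` is true, while a path through an explicitly created share such as `\\MyServer\DR_Dump_Prod` is not flagged.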

I signed on to the <server_216> DR server and went to D:\DR_Dump_Prod > Properties > Sharing (tab) > Share, made sure the CommVault service account has full Read/Write privileges, and clicked Share.

On the CommServe I did Start > Run > \\<server_216>.domain.suffix\DR_Dump_Prod (no “D$”) and confirmed access works as expected.  Then I opened the Commvault Console > Control Panel (from the ribbon) > DR Backup (in the Maintenance section) and updated the destination from:

   \\<server_216>.domain.suffix\D$\DR_Dump_PRD

to

   \\<server_216>.domain.suffix\DR_Dump_Prod

Because previous changes wouldn’t allow updates to just the server name, I purposely selected the old destination so there would be a path change.

CommVault refused to accept the new path as there were already DR backups in that folder.  On <server_216> I created D:\DR_Dump_OldProd and moved the backups already in D:\DR_Dump_Prod to this new folder.  Once D:\DR_Dump_Prod was empty, the console accepted the new location.  Next I manually ran a DR backup by going to the CommVault console > CommCell Browser > Right-click on the root of the tree view > All Tasks > Disaster Recovery Backup > Full > OK.  The backup completed and I confirmed the files created on <server_216>.  

FINALLY, I disabled NetBIOS including stopping and disabling the TCP/IP NetBIOS Helper service.  I ran another DR backup and confirmed the files were written to <server_216> as expected.

In summary:  Configure a “real” share and do not rely on the Windows administrative shares.

Thanks everyone for your help.

Ken

Userlevel 6
Badge +16

Thanks for letting us know the solution @Ken_H 

Glad to hear it's fixed now 🙂
