Question

Windows VSA proxy bluescreens when using HotAdd

  • 13 October 2023
  • 7 replies
  • 68 views

Badge +3

Hello

Recently our Windows VSA proxies started to bluescreen during backups when HotAdd is used as the transport mode. Switching to NBD works fine, but it is a lot slower.

This happens on two different ESX clusters, one running vSphere 7.3 and the other vSphere 8.

The server bluescreens with the following message:

HYPERVISOR_ERROR (20001)
The hypervisor has encountered a fatal error.
Arguments:
Arg1: 0000000000000013
Arg2: 0000000000000000
Arg3: 0000000029b92701
Arg4: ffffe800005492c0
 

Anyone else experienced recent issues with HotAdd backups?
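
When comparing crashes across several proxies, it can help to turn the bugcheck summary above into numbers. The following is a minimal sketch, assuming the dump text is pasted in exactly the layout shown in this post; the function name `parse_bugcheck` is made up for illustration, and the values are read as hexadecimal, which is how Windows debuggers print bugcheck codes and arguments.

```python
import re

# Bugcheck summary copied from the post above.
BUGCHECK_TEXT = """\
HYPERVISOR_ERROR (20001)
The hypervisor has encountered a fatal error.
Arguments:
Arg1: 0000000000000013
Arg2: 0000000000000000
Arg3: 0000000029b92701
Arg4: ffffe800005492c0
"""


def parse_bugcheck(text):
    """Return (name, code, args) from a bugcheck summary in the layout above."""
    header = re.search(r"^(\w+) \(([0-9a-fA-F]+)\)", text, re.MULTILINE)
    if header is None:
        raise ValueError("no bugcheck header found")
    # Map the argument number (1..4) to its value, parsed as hexadecimal.
    args = {
        int(number): int(value, 16)
        for number, value in re.findall(
            r"^Arg(\d): ([0-9a-fA-F]+)", text, re.MULTILINE
        )
    }
    return header.group(1), int(header.group(2), 16), args


name, code, args = parse_bugcheck(BUGCHECK_TEXT)
print(name, hex(code), {n: hex(v) for n, v in args.items()})
```

If two proxies crash with identical `Arg1` values, that is at least a hint that they are hitting the same condition, although the meaning of the individual arguments is not stated in this thread.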

 

 


7 replies

Userlevel 7
Badge +19

Any idea if automount was disabled on the VSA?

See also → https://documentation.commvault.com/2023e/expert/32048_hotadd_transport_for_vmware.html#best-practices-for-hotadd

diskpart
automount
automount disable
automount scrub

Userlevel 4
Badge +11

Hello @Gnaget1891 

Can you please confirm if automount is disabled? 

diskpart> automount
diskpart> automount disable
diskpart> automount scrub
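
If the same check has to be repeated on several proxies, the three commands above can be put into a script file and passed to `diskpart /s`. The following is a rough sketch, assuming Python is available on the proxy and the prompt is elevated; the helper names and the file name `automount.txt` are made up for illustration, and `run_diskpart` is defined but deliberately not called here.

```python
import subprocess
import tempfile
from pathlib import Path

# The same commands as typed at the diskpart> prompt above.
AUTOMOUNT_COMMANDS = ["automount", "automount disable", "automount scrub"]


def build_diskpart_script(commands):
    """Join diskpart commands into script text, one command per line."""
    return "\n".join(commands) + "\n"


def run_diskpart(commands):
    """Run the commands through 'diskpart /s' (Windows only, elevated prompt)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "automount.txt"  # hypothetical file name
        script.write_text(build_diskpart_script(commands))
        result = subprocess.run(
            ["diskpart", "/s", str(script)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


script_text = build_diskpart_script(AUTOMOUNT_COMMANDS)
print(script_text)
```

Calling `run_diskpart(AUTOMOUNT_COMMANDS)` on the proxy would return whatever diskpart prints, including the current automount state reported by the first command.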

Also, please confirm if the SCSI ports are in order: 

https://documentation.commvault.com/2023e/expert/32048_hotadd_transport_for_vmware.html

Finally, please check that all drivers are up to date on the proxy server.

Best,

Rajiv Singal

Userlevel 6
Badge +14

Hi @Gnaget1891 ,

Finally, try: https://documentation.commvault.com/2023e/expert/32048_hotadd_transport_for_vmware.html#testing-hotadd

Testing HotAdd

You can use the following process to test HotAdd attachments for a VM:

  1. Create a snapshot on the VM that you are planning to back up.

  2. Attach the base disk from that VM to the VSA proxy as an independent or non-persistent disk.

  3. If the attach is successful, the VSA proxy will be able to perform a HotAdd operation.

  4. Detach the disk from the VSA proxy.

  5. Remove the snapshot for the VM.
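
The order of the steps above, and in particular the cleanup, can be sketched as follows. This is only an outline under stated assumptions: the four callables are hypothetical placeholders for whatever tooling performs each vSphere action (they are not Commvault or VMware API calls), and the stubs at the bottom only record the call order.

```python
from typing import Callable


def check_hotadd(
    create_snapshot: Callable[[], None],
    attach_base_disk: Callable[[], None],
    detach_disk: Callable[[], None],
    remove_snapshot: Callable[[], None],
) -> bool:
    """Walk through the manual HotAdd test; return True if the attach worked."""
    create_snapshot()                  # step 1
    try:
        try:
            attach_base_disk()         # step 2
        except Exception as exc:
            print(f"attach failed: {exc}")
            return False               # step 3 not met: HotAdd would fail too
        detach_disk()                  # step 4
        return True
    finally:
        remove_snapshot()              # step 5, runs even if the attach failed


# Stub callables that only record the order in which they were called.
calls = []
ok = check_hotadd(
    lambda: calls.append("create_snapshot"),
    lambda: calls.append("attach_base_disk"),
    lambda: calls.append("detach_disk"),
    lambda: calls.append("remove_snapshot"),
)
print(ok, calls)
```

Since the failure in this thread is a bluescreen of the proxy itself, a script like this would have to run from a different machine; otherwise the snapshot cleanup in step 5 would never be reached.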

If you can reproduce the issue outside Commvault, you can log a case with VMware.

Best Regards,

Sebastien

Badge +3

Automount is disabled. Well, it wasn't at the beginning, but disabling it did not help.

HotAdd used to work fine; the crashes only started in October. I guess a Windows update is causing this, or maybe an ESX update, but I get the same error on two different clusters running different versions of ESX.

 

I will open a support ticket for this, and also try to HotAdd a disk manually.

 

Thanks

 

Userlevel 7
Badge +19

@Gnaget1891 can you reproduce it easily? Would it be possible to revert the Windows updates or reprovision a brand new access node without the latest updates?

Have any other updates been applied, for example to the AV software?

Badge +3

Hello

It is not really easy to deploy a Windows server without the required updates. One thing I have noticed is that specific VMs seemed to cause the crash. However, I was then suddenly able to HotAdd one of them, so there is still no clear picture of what is causing this.

I think I will look at using a Linux-based access node instead; initial tests are promising.

Userlevel 7
Badge +19

Ok. Not sure if you have looked into using FREL, but we have been using it for a long time already, and the nice thing is that you can deploy it directly from Command Center.

 
