Solved

Hyper-V node in pause mode stops or fails backup with IntelliSnap



If one node in a Hyper-V cluster is in pause mode and you start a backup of a cluster shared volume with IntelliSnap enabled, the job fails because of the paused node.

Is there an option or solution to work around this?


Best answer by HenkR 15 July 2021, 10:52


14 replies


Hi @HenkR ,

What error/failures are you seeing specifically here?

In the contents of the subclient, have you entered only the CSV?
Is this CSV owned by another Hyper-V node in the cluster?

Do all the HV Nodes have the VirtualServer Agent installed?
Are all of the HV Nodes (with VSA Installed) listed in the Hypervisors/Proxies Tab of the Cluster’s Virtualisation Client? (check subclient properties too).

 

Regards,

Michael 
 


Hi,

 

Yes, Content selection is the actual cluster shared volume (disk)

CHyperVinfo2:: ConnectHyperVSDKForCluster() - Cluster Node [XXXXXHOST] is paused or down.

CHyperVInfo2:: - Could not find HyperV SDK for host [XXXXXHOST] key [XXXXXHOST]

 

Error Code : [72:106] Description : Could not establish connection host [XXXXXHOST.HOST] results may be incomplete. > Process vsdiscovery.

 

MA and VSA are installed on every node on the cluster.

Once the server is brought back into an operational state, the backup works fine.
 

Also, the MediaAgent (MagLib) and mount server must be in a domain; otherwise the backup does not work at all.
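
For anyone who wants to check quickly whether a failed job hit this condition, here is a minimal Python sketch that scans log lines for the "paused or down" message quoted above. The regular expression is modeled only on the line pasted in this thread, so the exact wording in other versions is an assumption, and `paused_nodes` is a hypothetical helper name, not part of any Commvault tooling.

```python
import re

# Pattern modeled on the log line quoted in this thread; the exact wording
# in other versions or log files is an assumption.
PAUSED_NODE = re.compile(r"Cluster Node \[([^\]]+)\] is paused or down")


def paused_nodes(log_lines):
    """Return the distinct node names reported as paused or down, in the order seen."""
    seen = []
    for line in log_lines:
        match = PAUSED_NODE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# The two log lines from the post above, with the anonymized host name.
sample = [
    "CHyperVinfo2:: ConnectHyperVSDKForCluster() - Cluster Node [XXXXXHOST] is paused or down.",
    "CHyperVInfo2:: - Could not find HyperV SDK for host [XXXXXHOST] key [XXXXXHOST]",
]
print(paused_nodes(sample))
```

To use it on a real log, pass an open file object instead of `sample`; which log file to read depends on your environment.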

 


Thanks @HenkR , 

 

What’s the Hyper-V/OS version you’re working with here?
So the job is failing because it cannot communicate with that node to discover the machines/configs on it.

If you remove the offline HV node from the Nodes tab in the Virtualisation Client’s properties and then run another job, does it complete?

 

Regards,

Michael


Hi, 

Thanks for getting back on this matter.

The Hyper-V cluster is running on Windows Server 2019 Datacenter Edition.

I tried removing the node there, but the pseudo client scans the cluster, discovers the node anyway, and tries to connect to it, so the job fails again.

In any case, a node in “maintenance” should not be an issue. In my opinion, Commvault should skip this server anyway.

 

Regards,

 

Henk
 

 


If the Node is in “Maintenance” and all roles have been evacuated, then skipping it would totally make sense.

@Mike Struening - I think this one might be worth a Support Case, unless the team can find anything?

 

Regards,

Michael


@HenkR I agree with @MichaelCapon. This looks like a candidate for a Support Incident. I appreciate his wise advice and efforts (as always... the guy is a huge help!!), though at this point things are not working as we should expect.

Can you send me the case number so I can follow up accordingly?

Thanks!


Hi,

The incident number is 210611-260

 

With Regards,

 

Henk


Thanks, @HenkR !  I’ll keep an eye on it.


Thanks, @HenkR !  I unmarked the ‘Best Answer’ since I want to be sure we capture the eventual/actual solution for posterity :nerd:


Hi,

After a while with support, we found out that if the HDS integration is not working correctly, the job fails when a node is in pause mode. The failure seems to be on the Hyper-V side (the job stops almost immediately), but it is actually the integration with Hitachi that causes the issue.

Now that the integration is working, the job finishes with a CWE (Completed with Errors) status and skips the paused host by default.

Henk.


Appreciate the detailed update @HenkR !!!


I have just seen something similar.

There’s no IntelliSnap involved.

If a single node is paused (within an HV cluster), the jobs fail.

Jobs are fine if that ‘paused’ node is removed.


@YYZ I would go right to a support case. We were pretty much out of ideas here, though the issue turned out to be snapshot related.

If you are not using IntelliSnap, then we are back to 0 ideas 🤓


Hi Mike

That's not a problem.

I encountered the issue yesterday and searched the Community.

I was stumped as to why it wouldn’t ‘work around’ the ‘Pause’.

 

For now - just being aware of the issue is enough.

 
