Question

Deploying new Hyperscale X RA cluster fails to upgrade due to sshd service

  • 15 November 2023
  • 1 reply
  • 52 views

This is my 6th or 7th RA deployment and the first time I have run into this issue. I went through the process of setting up the 25G bonded networks (one for DP and one for storage), got the system into the CommServe, and then went to update it to the latest version.

cvupgradeos.py ran fine for Hedvig, updating it to 4.7.13, but when I went to upgrade from 11.24, which was pre-loaded on the nodes, the first server succeeded and the second hung. It didn’t give me any feedback. I opened a case and have been waiting for a response.

 

I am now unable to ssh between nodes, although it occasionally works over the DP network. I verified that there is nothing wrong with the network stack: ping and netstat both look good, the services are up and listening, and firewalls are disabled.

I ran nc -vvv (node name) 22; it connects but then hangs.

The connection is neither refused nor terminated at the other end. Commvault on node one shows as updated to the latest version but will no longer communicate with the CommServe.
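
For anyone hitting the same symptom, the "connects but hangs" state can be separated from a refused or closed connection with a small probe. The following is only a sketch in Python: the function name and the returned labels are made up for illustration, and it does nothing Commvault-specific. It opens a TCP connection to the sshd port and waits briefly for the SSH identification banner that a server normally sends first.

```python
import socket


def probe_ssh_banner(host, port=22, timeout=5.0):
    """Classify what happens when connecting to an SSH port.

    Returns the banner text if one arrives, otherwise a short label
    (hypothetical names) describing where the attempt stalled.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"            # nothing is listening on that port
    except OSError:
        return "unreachable"        # connect timed out or routing failed

    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(256)   # the SSH banner is a short first line
        except socket.timeout:
            return "connected-no-banner"
        except OSError:
            return "connection-error"

    if not data:
        return "closed-by-peer"
    return data.decode("ascii", errors="replace").strip()
```

If a banner does come back, small packets are getting through in both directions and the stall is happening later in the session, which is worth mentioning in the support case.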

Support says not to reimage, and I am in the middle of the deployment with limited time before I leave, so I need answers.

Has anyone experienced anything like this, or have any suggestions on further troubleshooting? 


1 reply

Hi Z,

Thanks for reaching out on this question.

I was across your case and have seen it has been resolved.

In this instance, ssh was failing due to an MTU mismatch. Once we forced the interfaces to use an MTU of 1500, we were able to ssh between nodes and run through the upgrade.
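
An MTU mismatch fits the symptoms in the question: small packets such as ping and the TCP handshake get through, while larger packets can be dropped silently. One way to check a path is a ping with the "don't fragment" bit set and a payload sized to the MTU under test. Below is a rough Python sketch of that check. It assumes Linux iputils ping (the -M do, -s and -c options), IPv4 with no IP options, and a hypothetical host name; it is not the procedure used on the case.

```python
import subprocess

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_df_command(host, mtu, count=3):
    """Build an iputils ping command with DF set and an MTU-sized payload."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), host]


def path_carries_mtu(host, mtu):
    """Return True if at least one full-size, unfragmented ping is answered."""
    result = subprocess.run(ping_df_command(host, mtu),
                            capture_output=True, text=True)
    return result.returncode == 0
```

For example, path_carries_mtu("node2", 1500) succeeding while a larger value fails (say 9000, if jumbo frames were intended; the original MTU is not stated in this thread) would point to a device on the path that does not carry the larger frames. The payload for an MTU of 1500 works out to 1472 bytes.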

If you have any further questions please feel free to reach out.

Kind regards,

Scott Henderson
Media Management - Technical Team Lead (APAC)
