This is my 6th or 7th RA deployment and first time I have run into this issue. I ran through the process up setting up the 25G bonded networks (one for DP and one for storage) got the system into the commserve and then went to have it update to the latest version.
cvupgradeos.py ran fine for hedvig updating to 4.7.13, but when I went to upgrade from 11.24 which was pre-loaded on the nodes the first server succeeded, but the second hung. It didn’t give me any feed back. I opened a case and have been waiting to get any feedback.
I am unable to ssh between nodes now and occasionally can using the DP network. I verified that there is nothing wrong with the network stack. Ping, netstat all is good. The services are up and listening. Firewalls are disabled.
I ran a nc -vvv (node name) 22 and it connects but hangs.
Connection is not refused, nor terminated at the other end. Commvault on node one shows updated to the latest version but will not communicate with the commserve now.
Support says not to reimage and I am in the middle of the deployment in a limited time before I leave so I need to get answers.
Has anyone experienced anything like this, or have any suggestions on further troubleshooting?