Hello everybody,
I am posting here because the help from Commvault on this subject has so far been poor. We are ESM customers, so we pay extra for support, but our account team has so far been unable to give me the answers I need.
About four years ago an HSX cluster was installed into our environment. This was part of a global deployment, so I assume the install was carried out by a Commvault professional who would have performed all the necessary checks before installing the OS, such as confirming that the OS disks were mirrored. Nobody from Commvault has been able to confirm who did the installation. The install does not appear to have been done correctly: what I have here are three HPE DL380 Gen10 12LFF nodes with the OS installed on a Linux LVM volume that spans two 480GB SSDs, and this is the same on all three HSX nodes. It has never actually caused us any issues, but it was only picked up recently while attempting the automated RHEL-to-Rocky conversion, which got as far as the Rocky install, only for that to fail when it found the OS spanning two disks.
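For anyone wanting to check their own nodes for the same problem, this is roughly how the spanning shows up. A minimal sketch only; the device and volume group names below are illustrative assumptions, not taken from my cluster:

```shell
# On the node, the layout can be inspected with read-only commands:
#   pvs -o pv_name,vg_name          # physical volumes backing each VG
#   lvs -o lv_name,vg_name,devices  # which PVs each LV actually sits on
#
# Sample `pvs`-style output from a hypothetical node where the root VG
# spans two SSDs (names are illustrative):
pvs_output='/dev/sda2 vg_root
/dev/sdb1 vg_root'

# Count how many physical volumes back the root VG; more than one means
# the OS is spanned across disks with no mirroring underneath.
pv_count=$(printf '%s\n' "$pvs_output" | awk '$2 == "vg_root"' | wc -l)

if [ "$pv_count" -gt 1 ]; then
    echo "WARNING: root VG spans $pv_count disks with no mirroring"
fi
```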
Clearly this issue needs to be resolved, as we are running nodes on unsupported disks without the extra protection that RAID provides. Unfortunately, support are telling me that the OS disks must be installed on a hardware RAID controller, despite these servers having only one hardware RAID controller, which is used exclusively for the 12 data drives. The OS disks are attached to an embedded SATA controller that can be configured for RAID 1, but only as software RAID (S100i). I cannot find anything in the documentation that confirms the hardware RAID requirement; the only functional requirement I have found is that the OS disks need to be mirrored (Setting Up the Hardware for HyperScale X Reference Architecture).
As our server model is now an end-of-sale hardware platform, I cannot find documentation proving we are on a supported architecture. However, Commvault sold us these very HSX nodes via HPE, so I am fairly confident we were sold what was deemed a supported reference architecture at the time and that this is just being handled badly.
I know I am not going to be able to convert the LVM volume to RAID 1 and somehow save the OS install. My plan for moving forward with these upgrades is basically this:
Rebuild the HSX cluster using S100i RAID 1 for the OS disks (480GB):
- Evacuate node 1
- Decommission the node cleanly from the Hedvig cluster, following the documentation
- Create a RAID 1 logical drive on the S100i from the two SATA SSDs (they will be initialised)
- Install the OS from the Commvault RHEL 7.9 ISO onto the new RAID 1 volume
- Install the HSX components (matching the existing cluster version)
- Rejoin the Hedvig cluster
- Validate
- Complete the Rocky migration process that failed previously, now that the OS disk is installed correctly
If this is successful with node 1, then repeat for nodes 2 and 3.
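For the validation step, my rough plan is to confirm the root VG ends up backed by exactly one device after the rebuild. Again a sketch; the device and VG names are assumptions, not from my nodes:

```shell
# After rebuilding onto the S100i RAID 1 logical drive, the OS should
# sit on a single block device. On the node:
#   lsblk -o NAME,TYPE,SIZE,MOUNTPOINT   # one ~480GB device for the OS
#   pvs -o pv_name,vg_name               # root VG backed by one PV only
#
# Sample `pvs`-style output for the desired end state (illustrative):
pvs_output='/dev/sda2 vg_root'

# The root VG should now be backed by exactly one physical volume.
pv_count=$(printf '%s\n' "$pvs_output" | awk '$2 == "vg_root"' | wc -l)

if [ "$pv_count" -eq 1 ]; then
    echo "OK: root VG is backed by a single (mirrored) device"
fi
```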
Does this seem sensible? Effectively I have to rebuild the cluster, correctly… but Commvault will not approve this course of action because “the OS must be installed on a hardware RAID controller”. Can somebody please link me to where it says this in the docs?
Happy to provide more information if I haven’t explained anything in enough detail.
Thanks.

