Sorry to hear you’ve been having difficulty deploying your HyperScale X. However, it sounds like this particular configuration with multiple gateways may not be possible with the current installation tool. The installation tool ensures that the network configuration matches what is configured in the UI and will remove any additional configuration in place (this is intentional, to ensure a consistent configuration across nodes).
I would suggest deploying the system with only the default gateway configured on the management interface, which should allow the installation to complete. Once the installation is done, you can add the policy routing rules or static routes to the Management/Data Protection networks. Whenever you make network changes, we suggest stopping HSX services so the cluster doesn’t go down ungracefully in the event that network connectivity is lost between nodes. See the documentation for details on how to safely stop services.
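For example, once the install has completed with only the single default gateway, traffic for a remote management subnet can be steered via the management interface with a route-<interface> file - the interface name, subnet, and gateway below are placeholders only:
File /etc/sysconfig/network-scripts/route-bond0
192.168.200.0/24 via 192.168.100.1 dev bond0
followed by a network restart (systemctl restart network), done with HSX services stopped as mentioned above.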
For reference, HSX does not support defining multiple default gateways in the ifcfg files, as this will result in only one gateway being used. Per the RHEL documentation: “The ifcfg files are parsed in numerically ascending order, and the last GATEWAY directive to be read is used to compose a default route in the routing table” (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-configuring_the_default_gateway). Static routes or policy-based routing is recommended in these cases. I’ll get this information added to the documentation in the near future.
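To illustrate what that means in practice (interface names and addresses below are examples only), if two ifcfg files both define a GATEWAY:
File /etc/sysconfig/network-scripts/ifcfg-bond0.100
GATEWAY=192.168.100.1
File /etc/sysconfig/network-scripts/ifcfg-bond0.200
GATEWAY=192.168.200.1
then only 192.168.200.1, from the file parsed last, is installed as the default route, and the other GATEWAY directive has no effect.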
Let me know if that helps!
Ok @Justin Wolf, I’ve now deployed the block via the GUI with only DP and SP configured - no additional Mgmt IP in the DP network.
Next I added the same routes/rules that I used before.
VLAN 400 = Mgmt
VLAN 180 = DP / Commserve
- add an additional IP in the DP network that will later be used only for Mgmt (this is where sshd/httpd will be bound) - for now not routed
/opt/commvault/MediaAgent/cvnwlacpbond.py -c -m 802.3ad -t dp -i 10.24.242.161 -n 255.255.254.0 -v 400 -nr
- add new routing tables
File /etc/iproute2/rt_tables
...
1 vlan180
2 vlan400
- add new routes for both IPs
File /etc/sysconfig/network-scripts/route-bond1.180
10.24.56.0/23 dev bond1.180 src 10.24.56.221 table vlan180
default via 10.24.56.1 dev bond1.180 table vlan180
File /etc/sysconfig/network-scripts/route-bond1.400
10.24.242.0/23 dev bond1.400 src 10.24.242.161 table vlan400
default via 10.24.242.1 dev bond1.400 table vlan400
- add new routing rules
File /etc/sysconfig/network-scripts/rule-bond1.180
from 10.24.56.221/32 table vlan180
to 10.24.56.221/32 table vlan180
File /etc/sysconfig/network-scripts/rule-bond1.400
from 10.24.242.161/32 table vlan400
to 10.24.242.161/32 table vlan400
- restart networking
systemctl restart network
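To verify the tables and rules are active after the restart, the standard iproute2 commands show them:
ip rule show
ip route show table vlan180
ip route show table vlan400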
This works as before; I can reach both IPs. Next I will bind the sshd/http(s) daemons to the IP in Mgmt VLAN 400.
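Roughly what I have in mind for the binding (assuming the stock sshd_config and an Apache-style httpd config - the actual config locations on HSX may differ):
File /etc/ssh/sshd_config
ListenAddress 10.24.242.161
File /etc/httpd/conf/httpd.conf
Listen 10.24.242.161:80
followed by restarting both daemons (systemctl restart sshd httpd).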
My biggest concern is how this will behave during updates. Will the same thing happen as before, when I started the GUI installer and the node was suddenly left without networking?
And which services can I safely bind to the firewalled Mgmt IP? sshd and httpd? Is there any communication between the nodes or the Commserve on those ports? I see that on http the HSX install wizard is still running and on https Hedvig. Can I restrict access to both ports with the firewall? Or is there cluster communication going on (on the SP network...)?
Even after binding sshd etc. to the Mgmt IP there are a lot of ports open and unprotected - are they all needed? iscsi?
PORT STATE SERVICE
111/tcp open rpcbind
2181/tcp open eforward
3260/tcp open iscsi
4321/tcp open rwhois
5666/tcp open nrpe
7003/tcp open afs3-vlserver
7004/tcp open afs3-kaserver
8080/tcp open http-proxy
8400/tcp open cvd
8403/tcp open admind
8405/tcp open svbackup
8750/tcp open dey-keyneg
8777/tcp open unknown
8778/tcp open uec
8800/tcp open sunwebadmin
8801/tcp open unknown
8803/tcp open unknown
8804/tcp open truecm
8805/tcp open pfcp
8806/tcp open unknown
8808/tcp open ssports-bcast
9091/tcp open xmltec-xmlmail
33333/tcp open dgi-serv
42863/tcp open unknown
44859/tcp open unknown
50000/tcp open ibm-db2
50002/tcp open iiimsf
50012/tcp open unknown
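Regarding the firewall question above: if restricting those ports is supported at all, I would expect it to look something like this with firewalld (zone name and port are placeholders on my side, it assumes the port is currently opened as a plain port in the default zone, and it needs confirmation that no cluster traffic uses it):
firewall-cmd --permanent --new-zone=mgmt
firewall-cmd --permanent --zone=mgmt --add-source=10.24.242.0/23
firewall-cmd --permanent --zone=mgmt --add-port=443/tcp
firewall-cmd --permanent --zone=public --remove-port=443/tcp
firewall-cmd --reload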
Not working, @Justin Wolf. It seems the cluster communicates via SSH after the install, and limiting sshd to the Mgmt IP breaks this communication because sshd is no longer listening on the DP IP.
hv_deploy> show_all_clusters
Cluster name: HV11052022051641 - owner: unowned - version: v-4.4.0.0.2.3564.2484803ffb:a3cd82e8b6f6bbb821df1f0a32045461
hv_deploy> login_to_cluster HV11052022051641
Enter the SSH password for cluster 'HV11052022051641' - user 'root': ****************
1. sdes1603-dp.xx --> connection refused (report: /tmp/pr_rpt.1)
2. sdes1601-dp.xx --> connection refused (report: /tmp/pr_rpt.2)
3. sdes1602-dp.xx --> connection refused (report: /tmp/pr_rpt.3)
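A quick check on one of the nodes shows what sshd is actually bound to (standard ss command):
ss -tlnp | grep ':22'
After the change it only lists the Mgmt IP, which matches the connection refused errors above.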
/etc/hosts entries are also wrong.
[root@sdes1601-dp network-scripts]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.24.176.7 sdes1603-dp.xx sdes1603-dp
10.24.176.6 sdes1602-dp.xx sdes1602-dp
10.24.176.5 sdes1601-dp.xx sdes1601-dp
10.24.176.xxx are the IPs in the SP network, with DNS entries like sdes1601-sp, not -dp.
I got it working with some help from support. sshd must also be running on the IP in the storage pool network; the hosts file is “wrong” on purpose so that the SP IPs are used. I still find it strange, but at least it works.
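For anyone hitting the same thing, the sshd side ended up roughly like this on the first node (the second ListenAddress is its IP in the SP network, taken from the hosts file above):
File /etc/ssh/sshd_config
ListenAddress 10.24.242.161
ListenAddress 10.24.176.5
followed by systemctl restart sshd.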