Hello community,
Please help me solve the following problem.
I am using Commvault (11.24.25) to back up SAP HANA databases.
The architecture of the backup system is as follows:
CommServe - VMware virtual machine connected to network 192.168.22.0/24
simpana-ma01 - physical server Windows Storage Server 2016 Standard, connected to network 192.168.22.0/24
simpana-ma02 - physical server Windows Storage Server 2012 R2 Standard, connected to network 192.168.22.0/24
clients - VMware virtual machines running SLES for SAP 15 SP3 with SAP HANA 2.0.054 installed, connected to network 192.168.22.0/24
Environment details:
- a dedicated subnet, 192.168.22.0/24, is used for backup
- the virtual machines are connected to a vSwitch with 4 physical 10G interfaces. The balancing mode is Route based on IP hash.
- the media agents have two 10G network interfaces combined with NIC Teaming. Teaming mode - LACP, load balancing mode - Dynamic
- there are no firewalls between the clients and the media agents
Problem:
Periodically, backup jobs for SAP HANA transaction logs fail a little over 2 minutes after they start, with the error:
"Unable to communicate with the remote machine [simpana-ma02] to start the Data Pipe. Please check the network connectivity between the local machine and the remote machine and verify this product's Communications Service is running on the remote machine, Error [Connect to 192.168.22.26:8405 failed: Connection timed out]."
This error occurs with all media agents. The following error is in the client logs:
"9674 25e4 03/09 11:51:09 871631 ERROR: CvFwClient::connect(): Connect to 192.168.22.26:8405 failed: Connection timed out
9674 25e4 03/09 11:51:09 871631 CPipelayer::connectToDest Failed to connect to simpana-ma02(simpana-ma02):8405/8405: Connect to 192.168.22.26:8405 failed: Connection timed out
9674 25e4 03/09 11:51:09 871631 CPipelayer::InitiatePipeline Cannot connect to the CVD port on machine [simpana-ma02]:[8405]
9674 25e4 03/09 11:51:09 871631 CCVAPipelayer::StartPipeline() - Failed to initiate pipeline
9674 25e4 03/09 11:51:09 871631 CVArchive::StartPipeline() - Startup of DataPipe failed
9674 25e4 03/09 11:51:09 871631 ClDBControlAgent::OnMsgInitPipe() - Setup pipeline      failed
9674 25e4 03/09 11:51:09 871631 ClDBControlAgent::OnMsgInitPipe() - sending response FAIL to agent process
9674 25e4 03/09 11:51:09 871631 ClDBControlAgent::OnMsgInitPipe() - INITPIPE_RESP sent"

The job is terminated abnormally and a new one is started, which completes successfully. The issue occurs on all clients, at random, with both SystemDB and TenantDB,
and always exactly 2 minutes after the job starts. Monitoring does not reveal any errors. Please advise what can be checked and how to diagnose this problem.
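
A note on the timing: a failure a little over 2 minutes after the job starts is consistent with a Linux client whose TCP SYN packets are never answered. Assuming the usual kernel defaults (net.ipv4.tcp_syn_retries = 6 and a 1-second initial retransmission timeout, neither of which has been confirmed for these SLES clients), connect() gives up after roughly 127 seconds. The sketch below only works through that arithmetic; the function name is made up for illustration.

```python
def syn_connect_timeout(syn_retries: int = 6, initial_rto: float = 1.0) -> float:
    """Seconds until connect() gives up when no SYN-ACK ever arrives.

    Assumption: the client waits initial_rto after the first SYN and doubles
    the wait after each of the syn_retries retransmissions.
    """
    return sum(initial_rto * 2 ** attempt for attempt in range(syn_retries + 1))


# 1 + 2 + 4 + 8 + 16 + 32 + 64 seconds with the assumed defaults
print(syn_connect_timeout())
```

If the value of net.ipv4.tcp_syn_retries on the clients differs, the expected delay changes accordingly, so it is worth checking before reading too much into the match.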

Hey @Roman Kalyadin,

Thanks for the detailed information!

Can you try the following changes?

#1 - In the network properties of the media agent, exclude 8400 and 8403 from the additional ports and reduce the range to something like 8404-8424. In fact, with any network topology or network configuration, additional ports are generally not used, but adding the CVD port 8400 as an additional port allows data transfer traffic to bypass the firewall (and inherently any throttling or encryption), so let's remove that variable.

#2 - On the network properties of the media agent, in the incoming connections tab set the HANA group to “Blocked”.

This way, the HANA client(s) will always initiate and maintain an active (persistent) network connection to the media agent. If the connection drops, the client automatically attempts to reconnect, which can help insulate you from unusual network conditions that cause disconnects or failed initial connections.

 

 

 


Thanks for the answer. I made the suggested changes, but the error still appears. 

An excerpt from the log:

25588 63f9 03/10 16:11:34 872316 ERROR: CvFwClient::connect(): Connect to 192.168.22.26:8405 failed: Connection timed out
25588 63f9 03/10 16:11:34 872316 CPipelayer::connectToDest Failed to connect to simpana-ma02(simpana-ma02):8405/8405: Connect to 192.168.22.26:8405 failed: Connection timed out
25588 63f9 03/10 16:11:34 872316 CPipelayer::InitiatePipeline Cannot connect to the CVD port on machine [simpana-ma02]:[8405]
25588 63f9 03/10 16:11:34 872316 CCVAPipelayer::StartPipeline() - Failed to initiate pipeline
25588 63f9 03/10 16:11:34 872316 CVArchive::StartPipeline() - Startup of DataPipe failed
25588 63f9 03/10 16:11:34 872316 ClDBControlAgent::OnMsgInitPipe() - Setup pipeline     failed
25588 63f9 03/10 16:11:34 872316 ClDBControlAgent::OnMsgInitPipe() - sending response FAIL to agent process

Any other ideas? I set up a test bench and tried to reproduce the problem, but could not reproduce it there.

Can a large number of SAP HANA clients affect this? I have over 30 of them, each with at least SystemDB and one or more TenantDB.
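
To see whether the failures cluster by time, client, or media agent, the connect errors can be pulled out of the client logs and counted. The following is a minimal sketch based only on the line layout visible in the excerpts in this thread; the meaning assigned to each column (process ID, thread ID, date, time, job ID) is an assumption, and the function names are hypothetical.

```python
import re
from collections import Counter

# Matches the CvFwClient connect failure lines quoted in this thread.
# Assumed column layout: <pid> <thread> <MM/DD> <HH:MM:SS> <job id> ERROR: ...
FAILURE = re.compile(
    r'^\s*"?(?P<pid>\d+)\s+(?P<thread>[0-9a-f]+)\s+(?P<date>\d{2}/\d{2})\s+'
    r'(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<job>\d+)\s+ERROR: CvFwClient::connect\(\): '
    r'Connect to (?P<ip>[\d.]+):(?P<port>\d+) failed: (?P<reason>.+?)\s*$'
)


def connect_failures(lines):
    """Return one dict per connect failure line; all other lines are skipped."""
    return [m.groupdict() for m in map(FAILURE.match, lines) if m]


def failures_per_target(failures):
    """Count failures per (media agent IP, port) pair."""
    return Counter((f["ip"], int(f["port"])) for f in failures)
```

Running this over the logs of all 30+ clients and sorting the results by date and time should show whether the timeouts coincide with bursts of simultaneous log backups.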
 


@Roman Kalyadin, one thing you could try on your end first is to telnet from the client to the MA on port 8405 (the port where the connection timed out) and see whether you can connect and stay connected for an extended period. It could very well be a network issue, and this test would go a long way towards proving it.

Have you shared this issue with your network team?

If it connects fine and remains there, I would open a support case and share the incident number here so I can track it.
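
Because the failure is intermittent, a single telnet session may not catch it. A repeated probe run from one of the HANA clients is a possible complement. This is a minimal sketch using only the Python standard library; the function names, attempt count, interval, and timeout are illustrative assumptions, not Commvault tooling.

```python
import socket
import time


def probe(host: str, port: int, timeout: float = 10.0, hold: float = 0.0):
    """Attempt one TCP connect; return (ok, seconds_to_connect, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
            time.sleep(hold)  # optionally keep the connection open, like an idle telnet session
            return True, elapsed, ""
    except OSError as exc:  # covers timeouts, refusals, and name resolution errors
        return False, time.monotonic() - start, str(exc)


def watch(host: str, port: int, attempts: int = 60, interval: float = 5.0,
          timeout: float = 10.0):
    """Probe repeatedly, print one line per attempt, and return the failures."""
    failures = []
    for _ in range(attempts):
        ok, elapsed, err = probe(host, port, timeout)
        stamp = time.strftime("%m/%d %H:%M:%S")
        print(f"{stamp} {'ok' if ok else 'FAIL'} {elapsed:.3f}s {err}")
        if not ok:
            failures.append((stamp, err))
        time.sleep(interval)
    return failures
```

For example, watch("simpana-ma02", 8405) left running during the log backup window would give timestamps that can be matched against the job failures and against a packet capture.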


Very interesting that it's trying to connect on port 8405 - is the Media Agent multi-instanced? 8405 is used as a CVD port if 8400 is in use on the local machine.

Just in case, you may want to remove all additional ports as a test - can you confirm on the ‘control’ tab of the properties of the Media Agent that “Optimize for concurrent LAN backups” is enabled or disabled?


@Roman Kalyadin, can you run the following commands on the media agent (simpana-ma02) and share the output, please?

netstat -aof | findstr :8405

netstat -aof | findstr :8400


@Roman Kalyadin, one thing you could try on your end first is to telnet from the client to the MA on port 8405 (the port where the connection timed out) and see whether you can connect and stay connected for an extended period. It could very well be a network issue, and this test would go a long way towards proving it.

Have you shared this issue with your network team?

If it connects fine and remains there, I would open a support case and share the incident number here so I can track it.

I can connect to the media agents via telnet successfully, and the connection is not interrupted.

Very interesting that it's trying to connect on port 8405 - is the Media Agent multi-instanced? 8405 is used as a CVD port if 8400 is in use on the local machine.

Just in case, you may want to remove all additional ports as a test - can you confirm on the ‘control’ tab of the properties of the Media Agent that “Optimize for concurrent LAN backups” is enabled or disabled?

simpana-ma02 is a media agent with multi-instance enabled. CommServe was previously installed on it and has since been moved to a separate virtual machine. The problem was already occurring before that move. It also occurs with the other media agent, simpana-ma01, which does not use multi-instance.

The "Optimize for concurrent LAN backups" option is enabled on both media agents.

@Roman Kalyadin, can you run the following commands on the media agent (simpana-ma02) and share the output, please?

netstat -aof | findstr :8405

netstat -aof | findstr :8400

The output of the command "netstat -aof | findstr :8405"

 TCP    0.0.0.0:8405           0.0.0.0:0              LISTENING       2172
 TCP    192.168.22.26:8405     192.168.22.28:51512    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.28:51522    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.28:51524    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.28:56488    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.28:56492    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.28:58864    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.29:19660    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.29:19662    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.29:24656    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.29:24658    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.36:56012    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.36:56022    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.36:61088    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.36:61090    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.40:56112    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.40:56122    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.40:56142    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.40:56144    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.47:29688    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.47:29690    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.47:64710    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.47:64712    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.48:19440    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.48:51512    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.48:56694    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.48:56696    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.48:56712    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.48:56722    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.69:10986    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.69:10988    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.69:17858    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.69:17860    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.70:40404    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.70:40406    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.70:51512    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.70:51522    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.71:21688    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.71:21690    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.71:26926    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.71:26928    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.71:26930    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.71:26932    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.71:27740    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.71:27742    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.76:62090    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.76:62092    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.76:62840    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.76:62842    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.87:22896    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.87:22898    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.87:24686    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.87:24688    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.90:51512    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.90:51522    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.90:51524    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.90:51526    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.92:52328    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.92:52332    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.92:53942    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.92:53944    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.99:40404    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.99:40406    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.99:44264    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.99:44266    ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.100:27662   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.100:27664   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.100:27666   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.100:27668   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.100:42398   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.100:42404   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.100:52636   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.100:52638   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.104:51512   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.104:51522   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.104:61642   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.104:61644   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.111:40404   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.111:40412   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.111:61354   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.111:61356   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.113:13624   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.113:13626   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.113:51512   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.113:51522   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.114:53412   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.114:53422   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.114:61432   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.114:61434   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.115:40404   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.115:40406   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.115:41970   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.115:41972   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.135:51512   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.135:51522   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.135:51524   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.135:51526   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.135:51528   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.135:51532   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.135:51534   ESTABLISHED     2172
 TCP    192.168.22.26:8405     192.168.22.135:51538   ESTABLISHED     2172
 
 

The output of the "netstat -aof | findstr :8400" command is empty on simpana-ma02
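
A listing this long is easier to read as a per-client count. The sketch below tallies ESTABLISHED connections per remote address, assuming the five-column netstat layout shown above (protocol, local address, foreign address, state, PID); the function name is hypothetical.

```python
from collections import Counter


def connections_per_client(netstat_text: str, local_port: int = 8405) -> Counter:
    """Count ESTABLISHED connections to local_port, grouped by remote IP."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, local address, foreign address, state, PID
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "ESTABLISHED":
            continue
        if not fields[1].endswith(f":{local_port}"):
            continue
        remote_ip = fields[2].rsplit(":", 1)[0]
        counts[remote_ip] += 1
    return counts
```

Calling connections_per_client(output).most_common() on the saved command output lists the busiest clients first, and comparing snapshots taken at a failure time and at a quiet time may show whether the connection count is a factor.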


@Mahender Reddy I captured a network dump with Wireshark during the most recent failure. There is clearly some kind of network problem, but I'm not sure whether it is what causes the error.

 

 


@Roman Kalyadin , were you able to review the Wireshark output with your network team?

Thanks!