Question

NDMP - NetApp backup randomly failing



Hello all! I hope this finds you well.

In our environment, we are struggling with NDMP backups of NetApp devices.

*It was working fine before, then started behaving as described below.

 

We have a couple of subclients/paths configured, and they fail randomly. It does not matter whether the backup is Full or Incremental, or which backup in a row it is. Sometimes the same subclient completes a few backups in a row while another one fails. We cannot find any pattern: random subclients fail, with random amounts of data.

That is basically the error message:

"Error Code: [39:501] Description: Client [XXXXXX] was unable to connect to the tape server [XXXXXXX] on IP(s) [10.100.XXX.XX, 10.100….., 10.100….., 10.100….., 169.254…..] port [50666]. Please check the network connectivity. Source: [MediaAgentXXX], Process: NasBackup "

 

What has been checked:

-The machines are in the same VLAN: no routing, no firewall, and the ports are open.

-No errors were found on the switch ports after monitoring them for some time

-MediaAgents were restarted a few times

-OS patched

-Changed the default MediaAgents in Data Paths; it did not help
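
Because the failures are intermittent, a one-off port check can easily miss them. Below is a minimal Python sketch that repeats a TCP connection attempt over time and records only the failures with a timestamp. It assumes it runs on a host in the same subnet as the filer (it cannot run on the filer itself), and the host name in the usage comment is a hypothetical placeholder. Port 50666 is taken from the error message; it may only have a listener while a job is running, so a refusal from an idle MediaAgent is not necessarily a fault.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Try one TCP connection and return (ok, detail)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
            return True, f"connected in {elapsed:.3f}s"
    except OSError as exc:
        # Covers refused connections, timeouts and unreachable hosts.
        return False, f"{type(exc).__name__}: {exc}"


def monitor(host, port, attempts, interval=5.0):
    """Probe repeatedly and return a list of (timestamp, detail) failures."""
    failures = []
    for attempt in range(attempts):
        ok, detail = probe(host, port)
        if not ok:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            failures.append((stamp, detail))
        if attempt < attempts - 1:
            time.sleep(interval)
    return failures


# Example (hypothetical host name; run while a backup job is active):
# for stamp, detail in monitor("mediaagent01.example.local", 50666, attempts=720):
#     print(stamp, detail)
```

Comparing the failure timestamps with the failed job times might show whether the connection problem is visible outside of Commvault as well.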


In the NDMP logs, we found this:


But as I said, no errors were found on the switches.

Interface pairs are in place and have never been changed.

There is something in the Commvault logs related to interface pairs or network problems, but I am not sure whether it is even related.

A problem getting the interface pair? But a few lines later the connection is established anyway? (log from a failed backup)
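
One detail worth a look: the IP list in the error message above ends with a 169.254 address, which belongs to the IPv4 link-local range and is normally not reachable from another host unless both sit on that same link. Here is a minimal Python sketch for sorting such a candidate list. The sample addresses are hypothetical placeholders (the real ones are masked in the post), and `classify_addresses` is an illustrative helper, not a Commvault function.

```python
import ipaddress


def classify_addresses(addresses):
    """Group candidate IPs into link-local, loopback and everything else."""
    groups = {"link_local": [], "loopback": [], "other": []}
    for text in addresses:
        ip = ipaddress.ip_address(text)
        if ip.is_link_local:
            groups["link_local"].append(text)
        elif ip.is_loopback:
            groups["loopback"].append(text)
        else:
            groups["other"].append(text)
    return groups


# Hypothetical sample values standing in for the masked list in the error.
candidates = ["10.100.0.11", "10.100.0.12", "169.254.10.20"]
print(classify_addresses(candidates))
```

If the filer is handed a link-local or otherwise unreachable address first on some attempts, that could look like a random failure, though nothing in the post confirms this is what happens.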


A Commvault support case is already in progress, but we are not having the best experience…

Maybe someone has had a similar experience and knows a solution?

The CV version is 11.32.45, and we suspect the failures started after an upgrade.


2 replies


Don’t have experience with this exact scenario, but when you are certain you have tried everything that makes sense, start looking at stuff that might not make sense. 

A switch involved? Try another port and/or cable.
Other services running on the same Commvault client? Check for any events happening on that service at the same time as the backup attempt.
Did Commvault support ask you to adjust the logging/debug level? That could provide more detailed information about what exactly is going on.


Hello @Grzegorz 

We see this error in two different scenarios.

  1. The Filer has multiple interfaces and we are trying to connect to the wrong one
  2. Ports are blocked between the Filer and the MediaAgent

We establish the Management connection to the Filer over port 10000 but then data transfers occur on random high ports. If you want, you can configure a range of ports for us to use and then define them in the Media Agent properties.

 

https://documentation.commvault.com/11.24/expert/19663_configuring_firewall_between_file_server_and_mediaagent.html and

https://documentation.commvault.com/11.24/expert/7141_setting_up_data_interface_pair_between_two_clients.html
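
If a fixed port range is configured as described above, a quick check of that range can help separate the two scenarios. The Python sketch below classifies each port as `open`, `refused` or `no_answer`. The host name and the range in the usage comment are hypothetical, and the script is assumed to run from a host in the same subnet as the Filer. A `refused` result means the packet reached the MediaAgent but nothing was listening (which may be normal when no job is running), while `no_answer` points more towards filtering or a wrong interface.

```python
import socket


def classify_port(host, port, timeout=2.0):
    """Return 'open', 'refused' or 'no_answer' for one TCP port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered with a reset: reachable, but no listener.
        return "refused"
    except OSError:
        # Timeouts, unreachable networks and similar errors.
        return "no_answer"


def scan_range(host, first_port, last_port):
    """Classify every port from first_port to last_port inclusive."""
    return {
        port: classify_port(host, port)
        for port in range(first_port, last_port + 1)
    }


# Example (hypothetical host name and port range):
# for port, state in scan_range("mediaagent01.example.local", 50000, 50020).items():
#     print(port, state)
```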

 

Thank you,
Collin
