VMware SAN transfer backups about 25% slower after 11.36 upgrade

  • January 27, 2025
  • 6 replies
  • 102 views


About three weeks ago we upgraded from 11.28 to 11.36. Since then, our VMware SAN mode (Fibre Channel) backups have been about 25% slower on average. Has anyone else run into this?
I can see the slowdown in the job run times and throughput, as well as on my Fibre Channel SAN ports. Here is a screenshot of the SAN port performance for the last year. The ports in the chart are used only to pull the VMware backups from our ESX environment. The spikes are the full backups each weekend. The last three weekends have been much slower than the earlier trend. Were there changes to the VDDK or some other related piece of the backup?

 


Hi @Farmer92,

Good day!

New VDDKs are added in SP36. From vsbkp.log, verify which VDDK is being used now to make sure it's not the issue.
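A minimal sketch of that check in Python, assuming the version appears in the log as a token such as `VDDK702` (the form reported later in this thread). The real vsbkp.log and VixDiskLib.log line layout may differ, and `vddk_versions` is a hypothetical helper, not a Commvault tool.

```python
import re
from collections import Counter

# Assumption: the VDDK version shows up as a token like "VDDK702".
# Adjust the pattern if the actual log lines look different.
VDDK_TOKEN = re.compile(r"VDDK[\s_-]?(\d+(?:\.\d+)*)", re.IGNORECASE)


def vddk_versions(log_text: str) -> Counter:
    """Count each distinct VDDK version token found in the log text."""
    return Counter(m.group(1) for m in VDDK_TOKEN.finditer(log_text))


# Made-up sample lines, only to show the call; not real log output.
sample = "job 1 using VDDK702\njob 2 using VDDK702\n"
print(dict(vddk_versions(sample)))
```

Running it over log text from before and after the upgrade and comparing the two counters would show whether the VDDK in use changed.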

Also check whether there was any version upgrade on the vCenter/ESXi end.

From the logs or history of the old and new jobs, we need to verify which part is the bottleneck, that is, where the jobs are spending the most time.
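As a rough illustration of that before/after comparison, the sketch below computes the percent change in mean job throughput around an upgrade date. The tuple layout and the sample figures are assumptions made for illustration; real values would come from the job history.

```python
from datetime import date
from statistics import mean


def throughput_change(jobs, upgrade_day):
    """Return the percent change in mean throughput after upgrade_day.

    jobs: iterable of (run_date, gigabytes_moved, hours_elapsed) tuples.
    """
    jobs = list(jobs)
    before = [gb / hrs for day, gb, hrs in jobs if day < upgrade_day]
    after = [gb / hrs for day, gb, hrs in jobs if day >= upgrade_day]
    if not before or not after:
        raise ValueError("need at least one job on each side of the upgrade")
    return (mean(after) - mean(before)) / mean(before) * 100.0


# Made-up weekend full backups, not measurements from this thread.
history = [
    (date(2024, 12, 28), 4000, 10),
    (date(2025, 1, 4), 4000, 10),
    (date(2025, 1, 11), 3000, 10),
    (date(2025, 1, 18), 3000, 10),
]
print(throughput_change(history, date(2025, 1, 9)))
```

A negative result means throughput dropped after the upgrade date.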

Regards,

Sureshkumar S


  • Byte
  • January 29, 2025

Sureshkumar,

Thank you. I did look, and the VixDiskLib logs show VDDK702 both before and after the 11.36 upgrade.

I am trying to get support to help find out what changed between 11.28 and 11.36 that may slow things down.

I also see that at least one of my AUX copies, from our main datacenter to a remote datacenter, has slowed down since we upgraded to 11.36 on Jan 9th. The blue transfers are not moving as much data as they did before.

Something is not quite right with 11.36.

 



I am seeing the same issue after upgrading from 11.32 to 11.36 and have an open ticket with Commvault (250317-573). They are recommending that I check for bottlenecks in the network, which we have already ruled out as the cause. We were also given a manual backup to run from the command line, to isolate the backup from the software, but I feel like all the same settings would be used:

C:\Program Files\Commvault\ContentStore\Base> .\TestVMInfo.exe -vmware -server 'your vcenter fqdn' -user 'username' -password 'password' -vmname 'YourVM' -desthost 'host fqdn' -destDatastore 'datastore name' -transport SAN -disksize 73 -readdisks

But this has not given us any additional information.



The results came back. I ran it using both SAN and NBD: NBD completed, but SAN is still running, so I feel like the problem is tied to backups running in SAN mode. Did anyone get this resolved?


  • Vaulter
  • March 19, 2025

Hi Greg,

Kindly compare the results as follows:

  1. Determine the job ID of the backup performed via SAN mode and look into vsbkp.log and VixDiskLib.log on the VSA proxy, and CVPerfmgr.log and CvPerfLogAnalyze.log on the MA.
  2. Determine the job ID of the backup performed via NBD mode and look into vsbkp.log and VixDiskLib.log on the VSA proxy, and CVPerfmgr.log and CvPerfLogAnalyze.log on the MA.

The comparison needs to be performed for both scenarios in order to find the bottleneck. CvPerfLogAnalyze.log provides a summary of the performance and some insight into the bottleneck, so these results will need to be reviewed. As you have also mentioned an Auxcopy performance issue, the MA side will need to be looked at as well (CVPerfmgr.log, CvPerfLogAnalyze.log).
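Once per-stage timings for the two jobs have been read out of those logs by hand, the comparison itself can be sketched as below. The stage names and numbers are hypothetical and are not the actual fields of CvPerfLogAnalyze.log; the point is only the side-by-side logic.

```python
def slowest_stage(stage_seconds):
    """Return (stage, share_of_total) for the stage that took the most time."""
    total = sum(stage_seconds.values())
    if total <= 0:
        raise ValueError("no time recorded")
    stage = max(stage_seconds, key=stage_seconds.get)
    return stage, stage_seconds[stage] / total


def biggest_regression(baseline, current):
    """Return (stage, extra_seconds) for the stage whose time grew the most."""
    common = baseline.keys() & current.keys()
    if not common:
        raise ValueError("no stages in common")
    stage = max(common, key=lambda s: current[s] - baseline[s])
    return stage, current[stage] - baseline[stage]


# Hypothetical per-stage seconds for the same VM; stage names are placeholders.
nbd_job = {"read": 1200, "dedup_lookup": 300, "write": 500}
san_job = {"read": 600, "dedup_lookup": 1500, "write": 500}

print(slowest_stage(san_job))
print(biggest_regression(nbd_job, san_job))
```

The same two functions work for comparing a pre-upgrade SAN job against a post-upgrade SAN job.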

The following link provides a high-level overview of the changes to virtualization in general:

Changes in Commvault Platform Release 2024E  

From the Fabric/HBA stats, slow performance was also recorded during the May and July time frame (though the performance stats are stable and optimal from mid-July to early January), hence the upgrade to 11.36 may not be the cause of the issue.
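One way to check whether earlier dips resemble the current one is to flag every weekend whose peak port throughput falls well below the typical value. This is a sketch under assumptions: the 80% threshold and the sample values are arbitrary illustrations, not figures from the chart.

```python
from statistics import median


def slow_weeks(weekly_peaks, threshold=0.8):
    """Return the weeks whose peak is below threshold * median peak."""
    baseline = median(weekly_peaks.values())
    return [week for week, peak in weekly_peaks.items()
            if peak < threshold * baseline]


# Made-up weekend peaks in arbitrary units, keyed by a week label.
peaks = {"w1": 100, "w2": 98, "w3": 70, "w4": 102, "w5": 75}
print(slow_weeks(peaks))
```

If the flagged weeks are scattered through the year, the upgrade is a weaker suspect; if they all fall after the upgrade date, it is a stronger one.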

If this needs further diagnosis, it would be best to log a support ticket with the observations and results recorded as per the recommendations above.

 

Thanks & Regards 



Hi Greg,

I still have an open support ticket regarding the issue. So far, they suspect it may be related to DDB performance. Since we upgraded to 11.36, the Q&I time has increased, but they haven’t been able to pinpoint any specific reason for the slowdown after the update.

The upgrade was the only change in our environment at the time. While we do have a very large partitioned DDB, and they believe we might be nearing a threshold, it’s unclear why the performance would degrade so significantly right after the upgrade.

The investigation continues.



