Question

Single stream performance tuning

  • February 17, 2025
  • 6 replies
  • 109 views


Hello,

I am testing backup restores on version 11.32 and have found that single-stream speed is quite limited. With the SAP for Oracle agent I get around 700-900 GB/h per stream, and with the VMware VSA agent around 700 GB/h per stream.
At the same time, the backup server's hardware resources are not very busy: the hardware is quite powerful, the DDB is on NVMe drives, and the library is on SSD drives. The restore destination's hardware is not heavily loaded either, so I don't think the hardware or the network is the bottleneck.

Tested on pools with 128 KB and 256 KB DDB block sizes, with media encryption enabled (AES128).

Are there any software limits on stream speed implemented programmatically by Commvault? If so, can I control this limit manually with an advanced setting?



Br,
Andrejs

6 replies

Rajiv
Vaulter
  • Vaulter
  • 323 replies
  • February 17, 2025

Hello @AndresL, what transport mode are you using for the restore? Can you check the stat- ID parameters in vsrst.log on the access node and see what takes the most time? You can also use the testvminfo tool, as documented here, to measure performance outside of Commvault.

How to optimize the performance of Virtual Server Agent data protection

Best,

Rajiv Singal


  • Author
  • Byte
  • 39 replies
  • February 17, 2025
Hello ​@Rajiv ,

I use HotAdd for the restore. A synthetic test with testvminfo shows much higher speeds: depending on the VMDK disk type, 2200-4000 GB/h. The problem is definitely not related to the destination.
vsrst.log:

4792  1f00  02/17 14:27:18 1869265 stat- ID [writedisk], Bytes [642901868544], Time [2153.856571] Sec(s), Average Speed [284.661016] MB/Sec

4792  1f00  02/17 14:27:20 1869265 stat- ID [readmedia], Bytes [470008183486], Time [1983.386133] Sec(s), Average Speed [225.994689] MB/Sec

4792  1f00  02/17 14:27:21 1869265 VSRstArchive::UpdateJobProgress() - Sending job stream update message for stream [1]

4792  1bb4  02/17 14:27:28 1869265 VSRstCoordinator::UpdateVMStatusToJobMgr() - Updating JobManager VM Status for 1 VMs

4792  1bb4  02/17 14:27:28 1869265 vsJobMgr::updateVMBkpJobStatus() - Sending VM status for [1] virtual machines

4792  15b0  02/17 14:27:30 1869265 VSRstArchiveReader::SendSeekMessage() - Waiting for Low watermark - [38719] queued

4792  15b0  02/17 14:28:00 1869265 VSRstArchiveReader::CheckSeekMessageQueue() - Generated Low Watermark - [31458] queued

4792  15b0  02/17 14:28:00 1869265 VSRstArchiveReader::SendSeekMessage() - Resuming seek messages

4792  15b0  02/17 14:28:00 1869265 stat- ID [ProcessSeekMsgs], Samples [581], Time [2090.565967] Sec(s), Average [3.598220] Sec/Sample

4792  15b0  02/17 14:28:00 1869265 stat- ID [ReceiveSeekMsgs], Samples [583], Time [104.798280] Sec(s), Average [0.179757] Sec/Sample

4792  15b0  02/17 14:28:00 1869265 VSRstArchiveReader::CheckSeekMessageQueue() - Generated High Watermark - [131073] queued


It looks like the problem is not related to a specific agent type; the data flow from the MA is probably not fast enough.
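
The stat- lines above can be cross-checked with a few lines of Python. This is a minimal sketch, not a Commvault tool: the regular expression is an assumption based only on the two throughput lines quoted in this post, and the helper name `parse_stat_lines` is hypothetical. It recomputes each average from Bytes and Time, which suggests that the logged "MB/Sec" figure is bytes / seconds / 2^20.

```python
import re

# Pattern inferred from the two "stat- ID" lines quoted above (an assumption;
# other vsrst.log versions may format these lines differently).
STAT_RE = re.compile(
    r"stat- ID \[(?P<stat_id>\w+)\], Bytes \[(?P<nbytes>\d+)\], "
    r"Time \[(?P<secs>[\d.]+)\] Sec\(s\), "
    r"Average Speed \[(?P<logged>[\d.]+)\] MB/Sec"
)


def parse_stat_lines(lines):
    """Return {stat_id: (recomputed MiB/s, logged MB/Sec)} for matching lines."""
    stats = {}
    for line in lines:
        match = STAT_RE.search(line)
        if match is None:
            continue  # skip progress and watermark messages
        recomputed = int(match["nbytes"]) / float(match["secs"]) / 2**20
        stats[match["stat_id"]] = (recomputed, float(match["logged"]))
    return stats


# The two throughput lines from the log excerpt in this post.
SAMPLE = [
    "4792  1f00  02/17 14:27:18 1869265 stat- ID [writedisk], "
    "Bytes [642901868544], Time [2153.856571] Sec(s), "
    "Average Speed [284.661016] MB/Sec",
    "4792  1f00  02/17 14:27:20 1869265 stat- ID [readmedia], "
    "Bytes [470008183486], Time [1983.386133] Sec(s), "
    "Average Speed [225.994689] MB/Sec",
]

for stat_id, (recomputed, logged) in parse_stat_lines(SAMPLE).items():
    print(f"{stat_id}: recomputed {recomputed:.1f} MiB/s, logged {logged:.1f} MB/Sec")
```

For these two lines the recomputed and logged values agree, and readmedia is the slower of the two stages.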


Rajiv
Vaulter
  • Vaulter
  • 323 replies
  • February 18, 2025

Hello @AndresL, please make sure there is no AV interference on the MA. If it is a Windows machine, I would suggest using procmon to check whether any AV or Defender is scanning Commvault processes. Antivirus Exclusions for Windows

Best,

Rajiv Singal

 


Onno van den Berg
Commvault Certified Expert
  • Commvault Certified Expert
  • 1232 replies
  • February 19, 2025

@Rajiv what makes you believe it is AV that is impacting the performance? What other possibilities are there to quickly find a possible root cause? What is your reference in terms of real performance? What kind of performance do you see in the lab?


  • Author
  • Byte
  • 39 replies
  • February 21, 2025

Hello guys,

Sorry for the delay. I did more testing and updated Commvault to the latest maintenance release; we were quite far behind. After the update I noticed that SAP Oracle restore speed improved from ~800 GB/h per stream to ~1200 GB/h per stream (256 KB dedup block). For the VSA agent I didn't notice any improvement; it stayed at approximately 270-300 MB/s, or around 1000 GB/h.
Antivirus is disabled during testing.
Although 1200 GB/h per stream looks better, it still doesn't produce noticeable load on the MediaAgent hardware, so we are still not using its full potential. I'm sure the hardware is capable of much more.
I also made an interesting and strange observation: if I run an Auxiliary Copy job in the same storage pool during the restore, the restore speed improves and can reach up to 1500 GB/h per stream, which is weird. (This was also observed before the update.)
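
For reference, the MB/s and GB/h figures in this post can be converted with the arithmetic below. This is only a minimal sketch: the function names are hypothetical, and the factor of 1024 MB per GB is an assumption (with 1000 the results shift by a little over 2%).

```python
def mb_per_s_to_gb_per_h(mb_per_s, mb_per_gb=1024):
    """Convert MB/s to GB/h; mb_per_gb=1024 is an assumed unit convention."""
    return mb_per_s * 3600 / mb_per_gb


def gb_per_h_to_mb_per_s(gb_per_h, mb_per_gb=1024):
    """Convert GB/h to MB/s using the same assumed convention."""
    return gb_per_h * mb_per_gb / 3600


# VSA figures quoted in MB/s
for rate in (270, 300):
    print(f"{rate} MB/s = {mb_per_s_to_gb_per_h(rate):.0f} GB/h")

# SAP Oracle figures quoted in GB/h per stream
for rate in (800, 1200, 1500):
    print(f"{rate} GB/h = {gb_per_h_to_mb_per_s(rate):.0f} MB/s")
```

Under that assumption, 270-300 MB/s works out to roughly 950-1055 GB/h, consistent with the "around 1000 GB/h" quoted for VSA.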


Damian Andre
Vaulter
  • Vaulter
  • 1297 replies
  • February 21, 2025

It's difficult to compare application types in terms of restore speed. Are you getting similar deduplication ratios between the Oracle and VM backups?

Dedupe ratio will have some impact on restore performance. Higher deduplication ratio could result in more fragmentation of blocks across the disk, and less chance of contiguous data. Whereas a lower deduplication ratio is likely to have multiple blocks stored in a contiguous fashion. The impact of that should be less for SSD given that random access is not a major issue, but there could still be latency traversing to different SSDs to retrieve blocks of data.
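
The fragmentation argument can be illustrated with a toy model. This is purely a sketch under stated assumptions and says nothing about how Commvault actually lays out data on disk: each restored block is either newly written at the next free physical slot or, with probability `dedup_fraction`, a reference to a randomly chosen earlier slot. The function names and the model itself are hypothetical.

```python
import random


def simulate_layout(num_blocks, dedup_fraction, seed=0):
    """Toy model: physical slot read for each logical block of a restore."""
    rng = random.Random(seed)
    layout = []
    next_free = 0  # next unused physical slot for unique blocks
    for _ in range(num_blocks):
        if next_free > 0 and rng.random() < dedup_fraction:
            # Deduplicated block: points at some slot written earlier.
            layout.append(rng.randrange(next_free))
        else:
            # Unique block: stored contiguously after the last unique block.
            layout.append(next_free)
            next_free += 1
    return layout


def count_jumps(layout):
    """Count reads that do not directly follow the previous physical slot."""
    return sum(1 for prev, cur in zip(layout, layout[1:]) if cur != prev + 1)


for fraction in (0.0, 0.5, 0.9):
    jumps = count_jumps(simulate_layout(10_000, fraction))
    print(f"dedup fraction {fraction}: {jumps} non-contiguous reads")
```

In this toy model a fraction of 0.0 yields no jumps at all, and the jump count grows as the fraction rises, which is the effect described above. How much each jump costs depends on the storage, and on SSD it should be small.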

HotAdd also introduces an extra hop in the data path compared with a direct Oracle restore. If you can use NBD for a direct MA-to-hypervisor test, that might be an interesting experiment, assuming your network will allow it.

Also double-check the CPU load on the HotAdd proxy during the restore. That is a pretty common bottleneck, and it's not uncommon to see the CPU maxed out, especially if the proxy is hosted on a congested ESXi host.

