Solved

NDMP Backup Performance very poor with a Netapp CIFS Volume with millions of Files

  • 29 January 2021
  • 6 replies
  • 2360 views

Hi,

Are there any best practices for backing up NetApp volumes with lots of files in a single volume? (Our use cases cannot use multiple volumes.)

 

Two of my customers are running into similar performance issues (the first customer has a throughput of around 10 GB/h and the other around 150 GB/h). For both customers we are using IntelliSnap, and afterwards we create a backup copy to a disk library.

 

Here is the CVPerfLog of an NDMP job:

2748  1858  01/27 12:05:16 45098 *CVPERFLOG*|13|273978|38|1611742376|1611738776|1611745516|1611741916|1611742376|1611738776|1611745516|1611741916|49|1|*CVCOUNTER*|550|19|1611741916|3139|0|0|0|0|*|7479872512|0|*CVCOUNTER*|551|550|1611741916|3120|0|0|0|0|percent:99.41|0|0|*CVCOUNTER*|552|551|1611741916|3114|0|0|0|0|percent:99.23|0|0|*CVCOUNTER*|553|551|1611741916|6|0|0|0|0|percent:0.19|0|0|*CVCOUNTER*|505|550|1611741916|18|0|0|0|0|percent:0.58|0|0|*CVCOUNTER*|503|505|1611741916|4|0|0|0|0|percent:0.16|0|0|*CVCOUNTER*|502|505|1611741916|12|0|0|0|0|percent:0.39|0|0|*CVCOUNTER*|1004|505|1611741916|3135|0|0|0|0|*|0|0|*CVCOUNTER*|1001|505|1611741916|0|0|0|0|0|*|0|0|*CVCOUNTER*|2001|505|1611741916|766|0|568683|0|0|*|7557386768|0|*CVCOUNTER*|2002|505|1611741916|150|0|568682|0|0|*|7557329232|0|*CVCOUNTER*|2003|505|1611741916|32|0|568682|0|0|*|4561799024|0|*CVCOUNTER*|2053|2003|1611741916|11|0|0|0|0|*|4440366347|0|*CVCOUNTER*|3001|2003|1611741916|20|0|0|0|0|*|0|0|*CVCOUNTER*|3006|3001|1611741916|1|0|0|0|0|*|0|0|*CVCOUNTER*|3007|3001|1611741916|1|0|0|0|0|*|0|0|*CVCOUNTER*|3003|3001|1611741916|14|0|0|0|0|*|0|0|*CVCOUNTER*|3002|3003|1611741916|13|0|43733|0|0|*|0|0|*CVCOUNTER*|6009|3002|1611741916|6|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|6009|3002|1611741916|6|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|400|3003|1611741916|0|0|0|0|0|*|0|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Signature Processed|43733|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|New Signatures|36912|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Signatures Found in DDB|6821|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Application Data size|4794095616|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Processed Data size|4236506352|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|New Data size|3801076660|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Dropped Data size (percentc10.28])|435429692|0|*CVCOUNTER*|10002|400|0|0|0|0|0|0|Src-dedupe signature count|43733|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Non-dedupable data 
size|308200952|0|*CVCOUNTER*|2008|505|1611741916|3|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|2006|505|1611741916|12|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|2056|508|1611741916|3137|0|318676|0|0|*|4129252216|0|*CVCOUNTER*|2057|508|1611741916|12|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|5002|2057|0|0|0|0|0|0|*|0|0|*CVCOUNTER*|5005|2057|0|10|0|0|0|0|*|4115470926|0|*CVCOUNTER*|6010|5005|1611741916|0|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|6014|5005|1611741916|2|0|0|0|0|*|4120279180|0|

2748  1858  01/27 12:05:16 45098

|*273978*|*Perf*|45098| =======================================================================================

|*273978*|*Perf*|45098| Job-ID: 45098            [Pipe-ID: 273978]            [App-Type: 13]            [Data-Type: 1]

|*273978*|*Perf*|45098| Stream Source:   bbksw16k-219

|*273978*|*Perf*|45098| Network medium:   SDT

|*273978*|*Perf*|45098| Head duration (Local):  [27,January,21 11:12:56  ~  27,January,21 12:05:16] 00:52:20 (3140)

|*273978*|*Perf*|45098| Tail duration (Local):  [27,January,21 11:12:56  ~  27,January,21 12:05:16] 00:52:20 (3140)

|*273978*|*Perf*|45098| ----------------------------------------------------------------------------------------------------------------------------------------

|*273978*|*Perf*|45098|     Perf-Counter                                                                     Time(seconds)              Size

|*273978*|*Perf*|45098| ----------------------------------------------------------------------------------------------------------------------------------------

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| NDMP Remote Server

|*273978*|*Perf*|45098|  |_NDMP Data Receiver..............................................................      3139                7479872512  [6.97 GB] [7.99 GBPH]

|*273978*|*Perf*|45098|    |_Reader Data Socket[percent:99.41].............................................      3120

|*273978*|*Perf*|45098|      |_Data Server Wait Time[percent:99.23]........................................      3114

|*273978*|*Perf*|45098|      |_Data Server Read Time[percent:0.19].........................................         6

|*273978*|*Perf*|45098|    |_Reader Pipeline Modules[percent:0.58].........................................        18

|*273978*|*Perf*|45098|      |_Pipeline write[percent:0.16]................................................         4

|*273978*|*Perf*|45098|      |_Buffer allocation[percent:0.39].............................................        12

|*273978*|*Perf*|45098|      |_CVA Wait to received data from reader.......................................      3135                         

|*273978*|*Perf*|45098|      |_CVA Buffer allocation.......................................................         -                         

|*273978*|*Perf*|45098|      |_SDT: Receive Data...........................................................       766                7557386768  [7.04 GB]  [Samples - 568683] [Avg - 0.001347] [33.08 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Compression.......................................................       150                7557329232  [7.04 GB]  [Samples - 568682] [Avg - 0.000264] [168.92 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Signature module..................................................        32                4561799024  [4.25 GB]  [Samples - 568682] [Avg - 0.000056] [477.96 GBPH]

|*273978*|*Perf*|45098|        |_SDT-Head: Signature Compute...............................................        11                4440366347  [4.14 GB] [1353.41 GBPH]

|*273978*|*Perf*|45098|        |_Src-side Dedup............................................................        20                         

|*273978*|*Perf*|45098|          |_Buffer allocation.......................................................         1                         

|*273978*|*Perf*|45098|          |_Passing to next module..................................................         1                         

|*273978*|*Perf*|45098|          |_Sig-lookup..............................................................        14                         

|*273978*|*Perf*|45098|            |_SIDB-Lookup...........................................................        13                           

|*273978*|*Perf*|45098|              |_SIDB:CL-QueryInsert[bbksw16k-219]...................................         6

|*273978*|*Perf*|45098|              |_SIDB:CL-QueryInsert[bbksw16k-219]...................................         6

|*273978*|*Perf*|45098|            |_Source Side Dedupe stats..............................................         -                         

|*273978*|*Perf*|45098|              |_[Signature Processed]...............................................         -                     43733  [42.71 KB]

|*273978*|*Perf*|45098|              |_[New Signatures]....................................................         -                     36912  [36.05 KB]

|*273978*|*Perf*|45098|              |_[Signatures Found in DDB]...........................................         -                      6821  [6.66 KB]

|*273978*|*Perf*|45098|              |_[Application Data size].............................................         -                4794095616  [4.46 GB]

|*273978*|*Perf*|45098|              |_[Processed Data size]...............................................         -                4236506352  [3.95 GB]

|*273978*|*Perf*|45098|              |_[New Data size].....................................................         -                3801076660  [3.54 GB]

|*273978*|*Perf*|45098|              |_[Dropped Data size (percent[10.28])]................................         -                 435429692  [415.26 MB]

|*273978*|*Perf*|45098|              |_[Src-dedupe signature count]........................................         -                     43733  [42.71 KB]

|*273978*|*Perf*|45098|              |_[Non-dedupable data size]...........................................         -                 308200952  [293.92 MB]

|*273978*|*Perf*|45098|      |_SDT-Head: CRC32 update......................................................         3                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000009] [4614.73 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Network transfer..................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| Writer Pipeline Modules[MediaAgent]

|*273978*|*Perf*|45098|  |_SDT-Tail: Wait to receive data from source......................................      3137                4129252216  [3.85 GB]  [Samples - 318676] [Avg - 0.009844] [4.41 GBPH]

|*273978*|*Perf*|45098|  |_SDT-Tail: Writer Tasks..........................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|    |_DSBackup: Update Restart Info.................................................         -                         

|*273978*|*Perf*|45098|    |_DSBackup: Media Write.........................................................        10                4115470926  [3.83 GB] [1379.82 GBPH]

|*273978*|*Perf*|45098|      |_SIDB:CommitAndUpdateRecs[bbksw16k-219]......................................         -

|*273978*|*Perf*|45098|      |_Writer: DM: Physical Write..................................................         2                4120279180  [3.84 GB] [6907.16 GBPH]

|*273978*|*Perf*|45098|

 

If I read this correctly, the NDMP server is always waiting for data, but I am not completely sure. I would be thankful for a little help with this.

 

Kind Regards

Florian

Hi Florian

Thanks for the question, please take a look at this page for configuring multiple streams:
https://documentation.commvault.com/commvault/v11_sp20/article?p=19631.htm
 

We would need to check CVNDMPRemoteServer.log for NDMP performance, but best bet beyond that is to raise a support case and we can take a detailed look and then advise on how we may be able to improve performance.

Thanks,

Stuart


Hi Florian,

Antivirus on the NetApp CIFS side is worth checking.

Also, in schedule → advanced options, you can adjust some NAS options

 


Hi Florian,

 

The problem here is the data coming from NDMP (NetApp).

 

|*273978*|*Perf*|45098|      |_SDT: Receive Data...........................................................       766                7557386768  [7.04 GB]  [Samples - 568683] [Avg - 0.001347] [33.08 GBPH]

 

You would need to check the NDMP DUMP speed on the NetApp for the volume you are backing up via CV.
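For anyone reading the counters: the GBPH figure on a perf line appears to be simply the size divided by that counter's own time. A minimal sketch, assuming GB here means 2**30 bytes (which matches the values printed in the log above); the function name is only illustrative:

```python
GIB = 1024 ** 3  # assumption: the log's "GB" is 2**30 bytes

def gbph(size_bytes: int, seconds: float) -> float:
    """Throughput in GB per hour for one perf counter."""
    return size_bytes / GIB * 3600 / seconds

# Sizes and times copied from the perf log in this thread.
receive = gbph(7557386768, 766)    # SDT: Receive Data
overall = gbph(7479872512, 3139)   # NDMP Data Receiver

print(f"SDT: Receive Data   {receive:8.2f} GBPH")
print(f"NDMP Data Receiver  {overall:8.2f} GBPH")
# Share of the receiver's time spent waiting on the data server.
print(f"Data Server Wait Time share: {3114 / 3139:.1%}")
```

Note that "SDT: Receive Data" only counts 766 of the roughly 3140 seconds of the job, so the end-to-end rate is closer to the "NDMP Data Receiver" value.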


Hi

What you are seeing is common in extremely dense volumes packed with millions of small files. By default, NDMP is oriented at the volume, so the volume contents are streamed by the NAS/NDMP service, and that can be the bottleneck. Using a snapshot to offload the NDMP dump generally helps improve the speed, but when dealing with dense file volumes you may need to try another option.

 

Please take a look at this newer feature we added for NetApp C-mode that was designed for cases like yours. Under the covers, we take the policy and break it into smaller subclients that filter the volume's file structure, so we can separate it into different backup calls and effectively run multiple concurrent NDMP jobs (each aligned to a group) against the same volume. It produces multiple streams from the same logical volume to increase the overall performance.

Brock

 

Perform a Multi-Streaming Backup Within a Content Path

For NDMP NetApp C-mode clients, you can allow multiple data readers to back up an individual content path on a subclient. The new multi-streaming support on individual content paths works in conjunction with the existing multi-streaming support for backing up multiple content paths.

Multi-streaming within individual content paths improves the backup performance of large volumes.

For more information, see Configuring Multiple Streams for Backups.
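How Commvault actually forms the groups is not described in this thread. Purely as an illustration of the idea of splitting one volume into several roughly equal streams, here is a sketch that assigns top-level directories to a fixed number of groups by file count. The directory names, file counts, and the greedy strategy are all assumptions made for the example, not the product's behaviour.

```python
import heapq

def split_into_groups(file_counts: dict[str, int], groups: int) -> list[list[str]]:
    """Greedy split: largest directory first, always into the lightest group."""
    heap = [(0, i) for i in range(groups)]  # (current load, group index)
    heapq.heapify(heap)
    result: list[list[str]] = [[] for _ in range(groups)]
    for path, count in sorted(file_counts.items(), key=lambda kv: kv[1], reverse=True):
        load, idx = heapq.heappop(heap)
        result[idx].append(path)
        heapq.heappush(heap, (load + count, idx))
    return result

# Hypothetical directories under one volume, with made-up file counts.
counts = {
    "/vol/projects/a": 900_000,
    "/vol/projects/b": 700_000,
    "/vol/projects/c": 400_000,
    "/vol/projects/d": 300_000,
    "/vol/projects/e": 100_000,
}
plan = split_into_groups(counts, groups=2)
for number, paths in enumerate(plan, start=1):
    total = sum(counts[p] for p in paths)
    print(f"group {number}: {total:>9,} files  {paths}")
```

Each group would then correspond to one concurrent stream against the same volume.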


@shailu89 This option is not usable for me because we are using the NDMP Agent and not the NAS Agent.

 

@Brock We are using IntelliSnap and so we are using Snapshots on the Storage System. 

 

Last Friday I checked all the logs again and saw that the various dump phases take a lot of time, but the transfer itself is also slow.

I checked the option for multiple stream backups, but I guess it did not work at all. Here is a quick view of the Perfanalysis log:

Total Data Write: 2872362622959 [2675.10 GB] [735.76 GBPH]
 Stream Count: 1
 
 
 Remediation(s): 
 --------------
 
 Stream 1:
 Source: bbksw16k-219
 Destination: bbksw16k-219:BBKSW16K-219.kirchheim.bickhardt-bau.de
 
 

----------------------------------
| READS FROM THE SOURCE ARE SLOW |
----------------------------------
    - Increase the number of data readers from the subclient. Suggested values are 8, 12.
    - Change Application/Read size from the subclient. Suggested values are 512KB,1MB for FS. Refer documentation for Oracle, SQL, VSA.
    - Run CVDiskPerf tool on the source to verify the Disk Performance.

DOCUMENTATION
-------------
http://documentation.commvault.com/commvault/v11/article?p=8580.htm
http://documentation.commvault.com/commvault/v11/article?p=8596.htm
http://documentation.commvault.com/commvault/v11/article?p=8855_1.htm

CONSIDERATION(S)
----------------
Increasing streams to a high value may cause disk thrashing and also use more system resources. 
Changing read app size will cause re-baseline. So increase the value gradually.

 

I will check the recent log files during the day and give you feedback.


Hi Florian,

 

Yes, the network is a contributing factor, as the MA is receiving data at 4.4 GBPH but the source is sending at 1+ TBPH.

 

|*273978*|*Perf*|45098|      |_SDT-Head: Network transfer..................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| Writer Pipeline Modules[MediaAgent]

|*273978*|*Perf*|45098|  |_SDT-Tail: Wait to receive data from source......................................      3137                4129252216  [3.85 GB]  [Samples - 318676] [Avg - 0.009844] [4.41 GBPH]

