Solved

NDMP Backup Performance very poor with a Netapp CIFS Volume with millions of Files

  • 29 January 2021
  • 6 replies
  • 2285 views

Hi,

Are there any best practices for backing up NetApp volumes with lots of files in a single volume? (The use cases cannot be split across multiple volumes.)

 

Two of my customers are running into similar performance issues: the first sees a throughput of around 10 GB/h, the other around 150 GB/h. Both customers use IntelliSnap, and afterwards we create a backup copy to a disk library.

 

Here is the CVPerfLog of an NDMP job:

2748  1858  01/27 12:05:16 45098 *CVPERFLOG*|13|273978|38|1611742376|1611738776|1611745516|1611741916|1611742376|1611738776|1611745516|1611741916|49|1|*CVCOUNTER*|550|19|1611741916|3139|0|0|0|0|*|7479872512|0|*CVCOUNTER*|551|550|1611741916|3120|0|0|0|0|percent:99.41|0|0|*CVCOUNTER*|552|551|1611741916|3114|0|0|0|0|percent:99.23|0|0|*CVCOUNTER*|553|551|1611741916|6|0|0|0|0|percent:0.19|0|0|*CVCOUNTER*|505|550|1611741916|18|0|0|0|0|percent:0.58|0|0|*CVCOUNTER*|503|505|1611741916|4|0|0|0|0|percent:0.16|0|0|*CVCOUNTER*|502|505|1611741916|12|0|0|0|0|percent:0.39|0|0|*CVCOUNTER*|1004|505|1611741916|3135|0|0|0|0|*|0|0|*CVCOUNTER*|1001|505|1611741916|0|0|0|0|0|*|0|0|*CVCOUNTER*|2001|505|1611741916|766|0|568683|0|0|*|7557386768|0|*CVCOUNTER*|2002|505|1611741916|150|0|568682|0|0|*|7557329232|0|*CVCOUNTER*|2003|505|1611741916|32|0|568682|0|0|*|4561799024|0|*CVCOUNTER*|2053|2003|1611741916|11|0|0|0|0|*|4440366347|0|*CVCOUNTER*|3001|2003|1611741916|20|0|0|0|0|*|0|0|*CVCOUNTER*|3006|3001|1611741916|1|0|0|0|0|*|0|0|*CVCOUNTER*|3007|3001|1611741916|1|0|0|0|0|*|0|0|*CVCOUNTER*|3003|3001|1611741916|14|0|0|0|0|*|0|0|*CVCOUNTER*|3002|3003|1611741916|13|0|43733|0|0|*|0|0|*CVCOUNTER*|6009|3002|1611741916|6|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|6009|3002|1611741916|6|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|400|3003|1611741916|0|0|0|0|0|*|0|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Signature Processed|43733|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|New Signatures|36912|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Signatures Found in DDB|6821|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Application Data size|4794095616|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Processed Data size|4236506352|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|New Data size|3801076660|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Dropped Data size (percent[10.28])|435429692|0|*CVCOUNTER*|10002|400|0|0|0|0|0|0|Src-dedupe signature count|43733|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Non-dedupable data 
size|308200952|0|*CVCOUNTER*|2008|505|1611741916|3|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|2006|505|1611741916|12|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|2056|508|1611741916|3137|0|318676|0|0|*|4129252216|0|*CVCOUNTER*|2057|508|1611741916|12|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|5002|2057|0|0|0|0|0|0|*|0|0|*CVCOUNTER*|5005|2057|0|10|0|0|0|0|*|4115470926|0|*CVCOUNTER*|6010|5005|1611741916|0|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|6014|5005|1611741916|2|0|0|0|0|*|4120279180|0|

2748  1858  01/27 12:05:16 45098

|*273978*|*Perf*|45098| =======================================================================================

|*273978*|*Perf*|45098| Job-ID: 45098            [Pipe-ID: 273978]            [App-Type: 13]            [Data-Type: 1]

|*273978*|*Perf*|45098| Stream Source:   bbksw16k-219

|*273978*|*Perf*|45098| Network medium:   SDT

|*273978*|*Perf*|45098| Head duration (Local):  [27,January,21 11:12:56  ~  27,January,21 12:05:16] 00:52:20 (3140)

|*273978*|*Perf*|45098| Tail duration (Local):  [27,January,21 11:12:56  ~  27,January,21 12:05:16] 00:52:20 (3140)

|*273978*|*Perf*|45098| ----------------------------------------------------------------------------------------------------------------------------------------

|*273978*|*Perf*|45098|     Perf-Counter                                                                     Time(seconds)              Size

|*273978*|*Perf*|45098| ----------------------------------------------------------------------------------------------------------------------------------------

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| NDMP Remote Server

|*273978*|*Perf*|45098|  |_NDMP Data Receiver..............................................................      3139                7479872512  [6.97 GB] [7.99 GBPH]

|*273978*|*Perf*|45098|    |_Reader Data Socket[percent:99.41].............................................      3120                         

|*273978*|*Perf*|45098|      |_Data Server Wait Time[percent:99.23]........................................      3114                         

|*273978*|*Perf*|45098|      |_Data Server Read Time[percent:0.19].........................................         6                         

|*273978*|*Perf*|45098|    |_Reader Pipeline Modules[percent:0.58].........................................        18                         

|*273978*|*Perf*|45098|      |_Pipeline write[percent:0.16]................................................         4                         

|*273978*|*Perf*|45098|      |_Buffer allocation[percent:0.39].............................................        12                         

|*273978*|*Perf*|45098|      |_CVA Wait to received data from reader.......................................      3135                         

|*273978*|*Perf*|45098|      |_CVA Buffer allocation.......................................................         -                         

|*273978*|*Perf*|45098|      |_SDT: Receive Data...........................................................       766                7557386768  [7.04 GB]  [Samples - 568683] [Avg - 0.001347] [33.08 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Compression.......................................................       150                7557329232  [7.04 GB]  [Samples - 568682] [Avg - 0.000264] [168.92 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Signature module..................................................        32                4561799024  [4.25 GB]  [Samples - 568682] [Avg - 0.000056] [477.96 GBPH]

|*273978*|*Perf*|45098|        |_SDT-Head: Signature Compute...............................................        11                4440366347  [4.14 GB] [1353.41 GBPH]

|*273978*|*Perf*|45098|        |_Src-side Dedup............................................................        20                         

|*273978*|*Perf*|45098|          |_Buffer allocation.......................................................         1                         

|*273978*|*Perf*|45098|          |_Passing to next module..................................................         1                         

|*273978*|*Perf*|45098|          |_Sig-lookup..............................................................        14                         

|*273978*|*Perf*|45098|            |_SIDB-Lookup...........................................................        13                            [Samples - 43733] [Avg - 0.000297]

|*273978*|*Perf*|45098|              |_SIDB:CL-QueryInsert[bbksw16k-219]...................................         6                         

|*273978*|*Perf*|45098|              |_SIDB:CL-QueryInsert[bbksw16k-219]...................................         6                         

|*273978*|*Perf*|45098|            |_Source Side Dedupe stats..............................................         -                         

|*273978*|*Perf*|45098|              |_[Signature Processed]...............................................         -                     43733  [42.71 KB]

|*273978*|*Perf*|45098|              |_[New Signatures]....................................................         -                     36912  [36.05 KB]

|*273978*|*Perf*|45098|              |_[Signatures Found in DDB]...........................................         -                      6821  [6.66 KB]

|*273978*|*Perf*|45098|              |_[Application Data size].............................................         -                4794095616  [4.46 GB]

|*273978*|*Perf*|45098|              |_[Processed Data size]...............................................         -                4236506352  [3.95 GB]

|*273978*|*Perf*|45098|              |_[New Data size].....................................................         -                3801076660  [3.54 GB]

|*273978*|*Perf*|45098|              |_[Dropped Data size (percent[10.28])]................................         -                 435429692  [415.26 MB]

|*273978*|*Perf*|45098|              |_[Src-dedupe signature count]........................................         -                     43733  [42.71 KB]

|*273978*|*Perf*|45098|              |_[Non-dedupable data size]...........................................         -                 308200952  [293.92 MB]

|*273978*|*Perf*|45098|      |_SDT-Head: CRC32 update......................................................         3                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000009] [4614.73 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Network transfer..................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| Writer Pipeline Modules[MediaAgent]

|*273978*|*Perf*|45098|  |_SDT-Tail: Wait to receive data from source......................................      3137                4129252216  [3.85 GB]  [Samples - 318676] [Avg - 0.009844] [4.41 GBPH]

|*273978*|*Perf*|45098|  |_SDT-Tail: Writer Tasks..........................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|    |_DSBackup: Update Restart Info.................................................         -                         

|*273978*|*Perf*|45098|    |_DSBackup: Media Write.........................................................        10                4115470926  [3.83 GB] [1379.82 GBPH]

|*273978*|*Perf*|45098|      |_SIDB:CommitAndUpdateRecs[bbksw16k-219]......................................         -                         

|*273978*|*Perf*|45098|      |_Writer: DM: Physical Write..................................................         2                4120279180  [3.84 GB] [6907.16 GBPH]

|*273978*|*Perf*|45098|

 

If I read it correctly, the NDMP server is always waiting for data, but I am not completely sure. I would be thankful for a little help with this.
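To check that reading against the raw numbers, here is a minimal Python sketch that parses a few `*CVCOUNTER*` records copied from the CVPERFLOG line above and computes how much of the receiver's time was spent waiting. The field layout and the mapping of counter IDs to names (550 as NDMP Data Receiver, 552 as Data Server Wait Time, 2001 as SDT: Receive Data) are assumptions inferred by matching values against the formatted report, not taken from vendor documentation.

```python
# Assumed field layout, inferred from the sample line only:
# id | parent | timestamp | seconds | ? | samples | ? | ? | label | size | ?
RAW = (
    "*CVCOUNTER*|550|19|1611741916|3139|0|0|0|0|*|7479872512|0|"
    "*CVCOUNTER*|552|551|1611741916|3114|0|0|0|0|percent:99.23|0|0|"
    "*CVCOUNTER*|2001|505|1611741916|766|0|568683|0|0|*|7557386768|0|"
)


def parse_counters(raw):
    """Return {counter_id: record} for every *CVCOUNTER* chunk in raw."""
    counters = {}
    for chunk in raw.split("*CVCOUNTER*|")[1:]:
        fields = chunk.rstrip("|").split("|")
        counters[int(fields[0])] = {
            "parent": int(fields[1]),
            "seconds": int(fields[3]),
            "samples": int(fields[5]),
            "label": fields[8],
            "size": int(fields[9]),
        }
    return counters


counters = parse_counters(RAW)
receiver = counters[550]  # assumed: NDMP Data Receiver
wait = counters[552]      # assumed: Data Server Wait Time

# Share of the receiver's elapsed time spent waiting on the data server
wait_share = 100 * wait["seconds"] / receiver["seconds"]
print(f"Data Server Wait Time: {wait_share:.1f}% of {receiver['seconds']} s")
```

With whole seconds this comes out at roughly 99%, in line with the percentages printed in the report (which may use finer-grained timings).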

 

Kind Regards

Florian

Best answer by R Anwar 1 February 2021, 10:02

Hi Florian

Thanks for the question. Please take a look at this page on configuring multiple streams:
https://documentation.commvault.com/commvault/v11_sp20/article?p=19631.htm
 

We would need to check CVNDMPRemoteServer.log for NDMP performance, but beyond that the best bet is to raise a support case so we can take a detailed look and advise on how we may be able to improve performance.

Thanks,

Stuart

Hi Florian,

Antivirus scanning on the NetApp CIFS share is worth checking.

Also, under schedule → advanced options, you can adjust some NAS options.

 

Hi Florian,

 

The problem here is how slowly data is arriving from the NDMP source (NetApp):

 

|*273978*|*Perf*|45098|      |_SDT: Receive Data...........................................................       766                7557386768  [7.04 GB]  [Samples - 568683] [Avg - 0.001347] [33.08 GBPH]

 

You would need to check the NDMP DUMP speed on the NetApp for the volume you are backing up via CV.

Hi

What you are seeing is common with extremely dense volumes packed with millions of small files. By default, NDMP is volume-oriented, so the volume contents are streamed by the NAS/NDMP service, and that can be the bottleneck. Using a snapshot to offload the NDMP dump generally helps improve the speed, but with dense file volumes you may need to try another option.

 

Please take a look at this newer feature we added for NetApp C-mode, which was designed for cases like yours. Under the covers, we take the policy and break it into smaller subclients that filter the volume's file structure, so the backup can be split into separate calls, effectively running multiple concurrent NDMP jobs (each aligned to a group) against the same volume. This produces multiple streams from the same logical volume to increase the overall performance.

Brock

 

Perform a Multi-Streaming Backup Within a Content Path

For NDMP NetApp C-mode clients, you can allow multiple data readers to back up an individual content path on a subclient. The new multi-streaming support on individual content paths works in conjunction with the existing multi-streaming support for backing up multiple content paths.

Multi-streaming within individual content paths improves the backup performance of large volumes.

For more information, see Configuring Multiple Streams for Backups.
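To make the grouping idea concrete, here is a small Python sketch of one way a volume's top-level directories could be split into balanced groups, one per stream. This only illustrates the general technique (greedy assignment by file count); it is not how Commvault implements the feature, and the paths, file counts, and the function name `split_into_streams` are all hypothetical.

```python
import heapq


def split_into_streams(dir_file_counts, streams):
    """Greedily assign directories to `streams` groups, largest first,
    always adding to the group that currently holds the fewest files."""
    heap = [(0, i) for i in range(streams)]  # (files so far, group index)
    groups = [[] for _ in range(streams)]
    by_size = sorted(dir_file_counts.items(), key=lambda kv: -kv[1])
    for path, count in by_size:
        total, i = heapq.heappop(heap)
        groups[i].append(path)
        heapq.heappush(heap, (total + count, i))
    return groups


# Hypothetical top-level directories and their file counts
example = {
    "/vol/data/projects": 4_000_000,
    "/vol/data/scans": 2_500_000,
    "/vol/data/archive": 2_000_000,
    "/vol/data/home": 1_500_000,
}
groups = split_into_streams(example, 2)
for n, paths in enumerate(groups, start=1):
    print(f"stream {n}: {paths}")
```

Balancing by file count rather than by size reflects the point made above: on dense volumes the number of files, not the amount of data, tends to dominate the dump time.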

@shailu89 This option is not usable for me because we are using the NDMP Agent and not the NAS Agent.

 

@Brock We are using IntelliSnap and so we are using Snapshots on the Storage System. 

 

Last Friday I went through all the logs again and saw that the various dump phases take a lot of time, but the transfer itself is also slow.

I tried the option for multiple-stream backups, but I guess it did not work at all. Here is a quick view of the Perfanalysis log:

Total Data Write: 2872362622959 [2675.10 GB] [5735.76 GBPH]
 Stream Count: 1
 
 
 Remediation(s): 
 --------------
 
 Stream 1:
 Source: bbksw16k-219
 Destination: bbksw16k-219:BBKSW16K-219.kirchheim.bickhardt-bau.de
 
 

----------------------------------
| READS FROM THE SOURCE ARE SLOW |
----------------------------------
    - Increase the number of data readers from the subclient. Suggested values are 8, 12.
    - Change Application/Read size from the subclient. Suggested values are 512KB,1MB for FS. Refer documentation for Oracle, SQL, VSA.
    - Run CVDiskPerf tool on the source to verify the Disk Performance.

DOCUMENTATION
-------------
http://documentation.commvault.com/commvault/v11/article?p=8580.htm
http://documentation.commvault.com/commvault/v11/article?p=8596.htm
http://documentation.commvault.com/commvault/v11/article?p=8855_1.htm

CONSIDERATION(S)
----------------
Increasing streams to a high value may cause disk thrashing and also use more system resources. 
Changing read app size will cause re-baseline. So increase the value gradually.

 

I will check the recent log files during the day and give you feedback.

Hi Florian,

 

Yes, the network is a contributing factor: the MediaAgent (MA) is receiving data at 4.4 GBPH, but the source is sending at 1+ TBPH.

 

|*273978*|*Perf*|45098|      |_SDT-Head: Network transfer..................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| Writer Pipeline Modules[MediaAgent]

|*273978*|*Perf*|45098|  |_SDT-Tail: Wait to receive data from source......................................      3137                4129252216  [3.85 GB]  [Samples - 318676] [Avg - 0.009844] [4.41 GBPH]
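For reference, the GBPH figures on these lines appear to be each counter's byte count divided by that counter's own elapsed seconds. The short Python sketch below reproduces the two values under that assumption, taking 1 GB as 2^30 bytes (which is what matches the report); the helper name `gbph` is made up for illustration.

```python
GIB = 1024 ** 3  # the report's "GB" appears to mean 2**30 bytes


def gbph(size_bytes, seconds):
    """Throughput in GB per hour for one counter (bytes over its own seconds)."""
    return size_bytes / GIB * 3600 / seconds


# Values copied from the two report lines quoted above
tail_wait = gbph(4129252216, 3137)  # SDT-Tail: Wait to receive data from source
head_net = gbph(4129194680, 12)     # SDT-Head: Network transfer

print(f"SDT-Tail wait:    {tail_wait:.2f} GBPH")
print(f"SDT-Head network: {head_net:.2f} GBPH")
```

Read this way, each figure is relative to the time that counter itself accumulated (3137 s versus 12 s here), which is worth keeping in mind when comparing them.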
