Solved

NDMP backup performance very poor with a NetApp CIFS volume with millions of files

  • 29 January 2021


Hi,

Are there any best practices for backing up NetApp volumes that hold a very large number of files in a single volume? (In these use cases the data cannot be split across multiple volumes.)

 

Two of my customers are running into similar performance issues: the first gets a throughput of around 10 GB/h and the other around 150 GB/h. Both customers use IntelliSnap, and afterwards we create a backup copy to a disk library.

 

Here is the CVPerfLog of an NDMP job:

2748  1858  01/27 12:05:16 45098 *CVPERFLOG*|13|273978|38|1611742376|1611738776|1611745516|1611741916|1611742376|1611738776|1611745516|1611741916|49|1|*CVCOUNTER*|550|19|1611741916|3139|0|0|0|0|*|7479872512|0|*CVCOUNTER*|551|550|1611741916|3120|0|0|0|0|percent:99.41|0|0|*CVCOUNTER*|552|551|1611741916|3114|0|0|0|0|percent:99.23|0|0|*CVCOUNTER*|553|551|1611741916|6|0|0|0|0|percent:0.19|0|0|*CVCOUNTER*|505|550|1611741916|18|0|0|0|0|percent:0.58|0|0|*CVCOUNTER*|503|505|1611741916|4|0|0|0|0|percent:0.16|0|0|*CVCOUNTER*|502|505|1611741916|12|0|0|0|0|percent:0.39|0|0|*CVCOUNTER*|1004|505|1611741916|3135|0|0|0|0|*|0|0|*CVCOUNTER*|1001|505|1611741916|0|0|0|0|0|*|0|0|*CVCOUNTER*|2001|505|1611741916|766|0|568683|0|0|*|7557386768|0|*CVCOUNTER*|2002|505|1611741916|150|0|568682|0|0|*|7557329232|0|*CVCOUNTER*|2003|505|1611741916|32|0|568682|0|0|*|4561799024|0|*CVCOUNTER*|2053|2003|1611741916|11|0|0|0|0|*|4440366347|0|*CVCOUNTER*|3001|2003|1611741916|20|0|0|0|0|*|0|0|*CVCOUNTER*|3006|3001|1611741916|1|0|0|0|0|*|0|0|*CVCOUNTER*|3007|3001|1611741916|1|0|0|0|0|*|0|0|*CVCOUNTER*|3003|3001|1611741916|14|0|0|0|0|*|0|0|*CVCOUNTER*|3002|3003|1611741916|13|0|43733|0|0|*|0|0|*CVCOUNTER*|6009|3002|1611741916|6|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|6009|3002|1611741916|6|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|400|3003|1611741916|0|0|0|0|0|*|0|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Signature Processed|43733|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|New Signatures|36912|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Signatures Found in DDB|6821|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Application Data size|4794095616|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Processed Data size|4236506352|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|New Data size|3801076660|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Dropped Data size (percent[10.28])|435429692|0|*CVCOUNTER*|10002|400|0|0|0|0|0|0|Src-dedupe signature count|43733|0|*CVCOUNTER*|10002|400|1611741916|0|0|0|0|0|Non-dedupable data 
size|308200952|0|*CVCOUNTER*|2008|505|1611741916|3|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|2006|505|1611741916|12|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|2056|508|1611741916|3137|0|318676|0|0|*|4129252216|0|*CVCOUNTER*|2057|508|1611741916|12|0|318675|0|0|*|4129194680|0|*CVCOUNTER*|5002|2057|0|0|0|0|0|0|*|0|0|*CVCOUNTER*|5005|2057|0|10|0|0|0|0|*|4115470926|0|*CVCOUNTER*|6010|5005|1611741916|0|0|0|0|0|bbksw16k-219|0|0|*CVCOUNTER*|6014|5005|1611741916|2|0|0|0|0|*|4120279180|0|

2748  1858  01/27 12:05:16 45098

|*273978*|*Perf*|45098| =======================================================================================

|*273978*|*Perf*|45098| Job-ID: 45098            [Pipe-ID: 273978]            [App-Type: 13]            [Data-Type: 1]

|*273978*|*Perf*|45098| Stream Source:   bbksw16k-219

|*273978*|*Perf*|45098| Network medium:   SDT

|*273978*|*Perf*|45098| Head duration (Local):  [27,January,21 11:12:56  ~  27,January,21 12:05:16] 00:52:20 (3140)

|*273978*|*Perf*|45098| Tail duration (Local):  [27,January,21 11:12:56  ~  27,January,21 12:05:16] 00:52:20 (3140)

|*273978*|*Perf*|45098| ----------------------------------------------------------------------------------------------------------------------------------------

|*273978*|*Perf*|45098|     Perf-Counter                                                                     Time(seconds)              Size

|*273978*|*Perf*|45098| ----------------------------------------------------------------------------------------------------------------------------------------

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| NDMP Remote Server

|*273978*|*Perf*|45098|  |_NDMP Data Receiver..............................................................      3139                7479872512  [6.97 GB] [7.99 GBPH]

|*273978*|*Perf*|45098|    |_Reader Data Socket[percent:99.41].............................................      3120                         

|*273978*|*Perf*|45098|      |_Data Server Wait Time[percent:99.23]........................................      3114                         

|*273978*|*Perf*|45098|      |_Data Server Read Time[percent:0.19].........................................         6                         

|*273978*|*Perf*|45098|    |_Reader Pipeline Modules[percent:0.58].........................................        18                         

|*273978*|*Perf*|45098|      |_Pipeline write[percent:0.16]................................................         4                         

|*273978*|*Perf*|45098|      |_Buffer allocation[percent:0.39].............................................        12                         

|*273978*|*Perf*|45098|      |_CVA Wait to received data from reader.......................................      3135                         

|*273978*|*Perf*|45098|      |_CVA Buffer allocation.......................................................         -                         

|*273978*|*Perf*|45098|      |_SDT: Receive Data...........................................................       766                7557386768  [7.04 GB]  [Samples - 568683] [Avg - 0.001347] [33.08 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Compression.......................................................       150                7557329232  [7.04 GB]  [Samples - 568682] [Avg - 0.000264] [168.92 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Signature module..................................................        32                4561799024  [4.25 GB]  [Samples - 568682] [Avg - 0.000056] [477.96 GBPH]

|*273978*|*Perf*|45098|        |_SDT-Head: Signature Compute...............................................        11                4440366347  [4.14 GB] [1353.41 GBPH]

|*273978*|*Perf*|45098|        |_Src-side Dedup............................................................        20                         

|*273978*|*Perf*|45098|          |_Buffer allocation.......................................................         1                         

|*273978*|*Perf*|45098|          |_Passing to next module..................................................         1                         

|*273978*|*Perf*|45098|          |_Sig-lookup..............................................................        14                         

|*273978*|*Perf*|45098|            |_SIDB-Lookup...........................................................        13                            [Samples - 43733] [Avg - 0.000297]

|*273978*|*Perf*|45098|              |_SIDB:CL-QueryInsert[bbksw16k-219]...................................         6                         

|*273978*|*Perf*|45098|              |_SIDB:CL-QueryInsert[bbksw16k-219]...................................         6                         

|*273978*|*Perf*|45098|            |_Source Side Dedupe stats..............................................         -                         

|*273978*|*Perf*|45098|              |_[Signature Processed]...............................................         -                     43733  [42.71 KB]

|*273978*|*Perf*|45098|              |_[New Signatures]....................................................         -                     36912  [36.05 KB]

|*273978*|*Perf*|45098|              |_[Signatures Found in DDB]...........................................         -                      6821  [6.66 KB]

|*273978*|*Perf*|45098|              |_[Application Data size].............................................         -                4794095616  [4.46 GB]

|*273978*|*Perf*|45098|              |_[Processed Data size]...............................................         -                4236506352  [3.95 GB]

|*273978*|*Perf*|45098|              |_[New Data size].....................................................         -                3801076660  [3.54 GB]

|*273978*|*Perf*|45098|              |_[Dropped Data size (percent[10.28])]................................         -                 435429692  [415.26 MB]

|*273978*|*Perf*|45098|              |_[Src-dedupe signature count]........................................         -                     43733  [42.71 KB]

|*273978*|*Perf*|45098|              |_[Non-dedupable data size]...........................................         -                 308200952  [293.92 MB]

|*273978*|*Perf*|45098|      |_SDT-Head: CRC32 update......................................................         3                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000009] [4614.73 GBPH]

|*273978*|*Perf*|45098|      |_SDT-Head: Network transfer..................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| Writer Pipeline Modules[MediaAgent]

|*273978*|*Perf*|45098|  |_SDT-Tail: Wait to receive data from source......................................      3137                4129252216  [3.85 GB]  [Samples - 318676] [Avg - 0.009844] [4.41 GBPH]

|*273978*|*Perf*|45098|  |_SDT-Tail: Writer Tasks..........................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|    |_DSBackup: Update Restart Info.................................................         -                         

|*273978*|*Perf*|45098|    |_DSBackup: Media Write.........................................................        10                4115470926  [3.83 GB] [1379.82 GBPH]

|*273978*|*Perf*|45098|      |_SIDB:CommitAndUpdateRecs[bbksw16k-219]......................................         -                         

|*273978*|*Perf*|45098|      |_Writer: DM: Physical Write..................................................         2                4120279180  [3.84 GB] [6907.16 GBPH]

|*273978*|*Perf*|45098|

 

If I read it correctly, the NDMP server spends almost all of its time waiting for data, but I am not completely sure. I would be thankful for a little help with this.
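As a rough sanity check of that reading, the figures can be recomputed from the counters in the log. This is only a sketch: the values are copied by hand from the CVPerfLog above, the helper names are my own, and I am assuming that GBPH means GiB (1024^3 bytes) per hour, which is consistent with the 7.99 GBPH the log prints for the NDMP Data Receiver.

```python
# Values copied by hand from the CVPerfLog above (seconds and bytes).
RECEIVER_SECONDS = 3139            # NDMP Data Receiver
RECEIVER_BYTES = 7_479_872_512
WAIT_SECONDS = 3114                # Data Server Wait Time
READ_SECONDS = 6                   # Data Server Read Time
PIPELINE_SECONDS = 18              # Reader Pipeline Modules

GIB = 1024 ** 3  # assumption: the log reports GB as 1024^3 bytes


def gb_per_hour(num_bytes, seconds):
    """Convert a byte count and a duration in seconds to GB per hour."""
    return num_bytes / GIB / seconds * 3600


def share(part_seconds, total_seconds):
    """Percentage of the total time spent in one counter."""
    return 100.0 * part_seconds / total_seconds


print(f"Receiver throughput:   {gb_per_hour(RECEIVER_BYTES, RECEIVER_SECONDS):.2f} GBPH")
print(f"Waiting for the filer: {share(WAIT_SECONDS, RECEIVER_SECONDS):.1f} %")
print(f"Reading from socket:   {share(READ_SECONDS, RECEIVER_SECONDS):.1f} %")
print(f"Pipeline modules:      {share(PIPELINE_SECONDS, RECEIVER_SECONDS):.1f} %")
```

The percentages differ slightly from the ones in the log because the log only shows whole seconds, but the picture is the same: well over 99 % of the job is spent waiting for the data server.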

 

Kind Regards

Florian


Best answer by R Anwar 1 February 2021, 10:02


6 replies


Hi Florian

Thanks for the question. Please take a look at this page on configuring multiple streams:
https://documentation.commvault.com/commvault/v11_sp20/article?p=19631.htm
 

We would need to check CVNDMPRemoteServer.log for NDMP performance. Beyond that, the best bet is to raise a support case so that we can take a detailed look and advise on how to improve performance.

Thanks,

Stuart


Hi Florian,

It is worth checking the antivirus configuration on the NetApp CIFS volume.

Also, under Schedule → Advanced Options, you can adjust some NAS options.

 


Hi Florian,

 

The bottleneck here is the data coming from the NDMP source (the NetApp):

 

|*273978*|*Perf*|45098|      |_SDT: Receive Data...........................................................       766                7557386768  [7.04 GB]  [Samples - 568683] [Avg - 0.001347] [33.08 GBPH]

 

You need to check the NDMP dump speed on the NetApp for the volume you are backing up with Commvault.
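To put a number on the line quoted above, the human-readable perf lines can be parsed and the rate recomputed. This is a minimal sketch that assumes the layout shown in the log (counter name, dotted filler, seconds, optional byte count). The regular expression and function names are my own and not part of any Commvault tool, and GBPH is assumed to mean GiB per hour.

```python
import re

# Counter name, a run of filler dots, seconds (or "-"), then an optional byte count.
PERF_LINE = re.compile(
    r"\|_(?P<name>.+?)\.{3,}\s+(?P<secs>\d+|-)(?:\s+(?P<size>\d+))?"
)

GIB = 1024 ** 3  # assumption: the log reports GB as 1024^3 bytes


def parse_perf_line(line):
    """Return (name, seconds, size_in_bytes) or None if the line is not a counter."""
    match = PERF_LINE.search(line)
    if match is None:
        return None
    secs = None if match["secs"] == "-" else int(match["secs"])
    size = int(match["size"]) if match["size"] else None
    return match["name"], secs, size


def gb_per_hour(num_bytes, seconds):
    """Convert a byte count and a duration in seconds to GB per hour."""
    return num_bytes / GIB / seconds * 3600


# The line quoted above, with the dotted filler shortened for readability.
sample = (
    "|*273978*|*Perf*|45098|      |_SDT: Receive Data..........       766"
    "                7557386768  [7.04 GB]  [Samples - 568683]"
    " [Avg - 0.001347] [33.08 GBPH]"
)

name, secs, size = parse_perf_line(sample)
print(f"{name}: {gb_per_hour(size, secs):.2f} GBPH")
```

Running the same parser over every line of the perf table and sorting by seconds is a quick way to see which counter dominates a job.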


Hi

What you are seeing is common in extremely dense volumes packed with millions of small files. By default, NDMP is volume-oriented, so the volume contents are streamed by the NAS/NDMP service, and that can be the bottleneck. Using a snapshot to offload the NDMP dump generally helps to improve the speed, but with dense file volumes you may need to try another option.

 

Please take a look at this newer feature we added for NetApp C-mode, which was designed for cases like yours. Under the covers, we take the policy and break it into smaller subclients that filter the volume file structure, so the backup is separated into different backup calls: effectively multiple concurrent NDMP jobs (each aligned to a group) against the same volume. This produces multiple streams from the same logical volume to increase the overall performance.

Brock

 

Perform a Multi-Streaming Backup Within a Content Path

For NDMP NetApp C-mode clients, you can allow multiple data readers to back up an individual content path on a subclient. The new multi-streaming support on individual content paths works in conjunction with the existing multi-streaming support for backing up multiple content paths.

Multi-streaming within individual content paths improves the backup performance of large volumes.

For more information, see Configuring Multiple Streams for Backups.
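The product does the splitting itself, so the following is only an illustration of the general idea and not Commvault's implementation. It is a sketch that assumes the top-level directories of a volume and their file counts are known, and it balances them across a fixed number of streams by always giving the next-largest directory to the lightest group. The directory names and counts are made up.

```python
import heapq


def split_into_groups(file_counts, streams):
    """Greedily assign directories to `streams` groups, largest directory first."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    # Min-heap of (files assigned so far, group index).
    heap = [(0, index) for index in range(streams)]
    heapq.heapify(heap)
    groups = [[] for _ in range(streams)]
    by_size = sorted(file_counts.items(), key=lambda item: item[1], reverse=True)
    for name, count in by_size:
        load, index = heapq.heappop(heap)       # lightest group so far
        groups[index].append(name)
        heapq.heappush(heap, (load + count, index))
    return groups


# Hypothetical top-level directories of one dense volume.
example_counts = {
    "projects": 4_000_000,
    "archive": 2_500_000,
    "home": 1_500_000,
    "scans": 1_000_000,
    "tmp": 200_000,
}

for number, members in enumerate(split_into_groups(example_counts, 2), start=1):
    total = sum(example_counts[name] for name in members)
    print(f"group {number}: {members} ({total} files)")
```

Each group would then correspond to one concurrent NDMP backup call against the same volume, which is why the file count per group matters more than the data size for volumes like this one.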


@shailu89 This option is not usable for me because we are using the NDMP Agent and not the NAS Agent.

 

@Brock We are using IntelliSnap, so we are already using snapshots on the storage system.

 

Last Friday I went through all the logs again and saw that the various dump phases take a lot of time, but the transfer itself is also slow.

I tried the option for multi-stream backups, but I guess it did not work at all. Here is a quick look at the Perfanalysis log:

Total Data Write: 2872362622959 [2675.10 GB] [5735.76 GBPH]
 Stream Count: 1
 
 
 Remediation(s): 
 --------------
 
 Stream 1:
 Source: bbksw16k-219
 Destination: bbksw16k-219:BBKSW16K-219.kirchheim.bickhardt-bau.de
 
 

----------------------------------
| READS FROM THE SOURCE ARE SLOW |
----------------------------------
    - Increase the number of data readers from the subclient. Suggested values are 8, 12.
    - Change Application/Read size from the subclient. Suggested values are 512KB,1MB for FS. Refer documentation for Oracle, SQL, VSA.
    - Run CVDiskPerf tool on the source to verify the Disk Performance.

DOCUMENTATION
-------------
http://documentation.commvault.com/commvault/v11/article?p=8580.htm
http://documentation.commvault.com/commvault/v11/article?p=8596.htm
http://documentation.commvault.com/commvault/v11/article?p=8855_1.htm

CONSIDERATION(S)
----------------
Increasing streams to a high value may cause disk thrashing and also use more system resources. 
Changing read app size will cause re-baseline. So increase the value gradually.

 

I will check the most recent log files during the day and give you feedback.


Hi Florian,

 

Yes, the network is a contributing factor: the MediaAgent (MA) is receiving data at 4.4 GBPH, but the source is sending at 1+ TBPH.

 

|*273978*|*Perf*|45098|      |_SDT-Head: Network transfer..................................................        12                4129194680  [3.85 GB]  [Samples - 318675] [Avg - 0.000038] [1153.68 GBPH]

|*273978*|*Perf*|45098|

|*273978*|*Perf*|45098| Writer Pipeline Modules[MediaAgent]

|*273978*|*Perf*|45098|  |_SDT-Tail: Wait to receive data from source......................................      3137                4129252216  [3.85 GB]  [Samples - 318676] [Avg - 0.009844] [4.41 GBPH]
