Understanding Job Details / Progress / Load Read statistics

Question

Hi,

As my name states, I’m quite a beginner, so please bear with me.

I’m writing this after finding a similar post answered more than a year ago that didn’t quite answer my own question.

https://community.commvault.com/self-hosted-q-a-2/average-throughput-information-read-write-network-ddb-lookup-meaning-2854

I have a job, running on my media agent, that backs up an SMB file share. The file share contains RDS profiles and home directories, so it holds millions of relatively small files.

The job has been running for 3 days and its ETA is at least 7 days.

The Job details / Progress tab / Load section says: Read 98%, Write 0.07%, Network 0.47%, DDB Lookup 1.30%, with a current throughput of 0.001 GB/hr. (I have no idea where it got the Average Throughput from, because I’ve never seen it over 1 GB/hr.)

The Subclient job setting / Advanced settings / Performance tab / Number of Data Readers is fixed to 10 data readers.

My questions are:

With Read that high and the others so low, does that mean the bottleneck is the read part taking too much time?
Does that mean I do not have enough Data Readers and increasing the readers will speed things up, or does it mean CV is already overloaded on the reading part and increasing readers will make things worse?
If increasing the number of Data Readers is the solution to speed things up:
- Should I set it to Automatically use optimal number of data readers?
- If it’s best to keep it at a fixed number, what increment would you suggest I use next? Skip to 100 readers and see how it goes from there?

Thank you to everyone who takes the time to answer my questions.

Have a good day.

Damian Andre · Answer

Hey @TheCVNoob,

You are right that the 98% read indicates that read is the bottleneck. The problem with backing up millions of files (especially over SMB) is the opening and closing of each file, which adds tremendous overhead to a backup operation. The source storage is easily overwhelmed, since it's managing the locks and unlocks of file resources.

What device are the files on? Would it be possible to protect it at the source rather than via the SMB share?

Changing the number of readers could help or hinder, as you say. Since the current throughput has dropped to so little, it almost seems like the transfer has stopped and the source storage may be overwhelmed. In your case it may be beneficial to lower the number of readers, as a guess, but this is not an exact science and you may need to experiment to find the optimal setting. I can say that 100 would be bad...

In either case, if it's possible to protect the data from the source rather than via an SMB export, that would be ideal, especially when using block-level backups, which bypass the file open/close overhead altogether.
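To make the open/close overhead point concrete, here is a rough back-of-the-envelope sketch in Python. It is not anything Commvault computes. The function names, the fixed 20 ms per-file cost, and the 100 MB/s stream speed are all made-up assumptions for illustration, and the model covers a single reader only. It just shows how a fixed per-file cost can dominate the time spent on small files while being negligible on large ones.

```python
# Toy model (assumption, not Commvault's internals): every file costs a fixed
# open/lock/close overhead plus the time needed to stream its bytes.

def file_times(avg_file_kb, per_file_overhead_ms, stream_mb_per_s):
    """Return (overhead_seconds, transfer_seconds) for one file."""
    overhead_s = per_file_overhead_ms / 1000.0
    transfer_s = (avg_file_kb / 1024.0) / stream_mb_per_s
    return overhead_s, transfer_s


def overhead_fraction(avg_file_kb, per_file_overhead_ms, stream_mb_per_s):
    """Share of per-file time spent on open/close rather than moving data."""
    overhead_s, transfer_s = file_times(
        avg_file_kb, per_file_overhead_ms, stream_mb_per_s
    )
    return overhead_s / (overhead_s + transfer_s)


def throughput_gb_per_hr(avg_file_kb, per_file_overhead_ms, stream_mb_per_s):
    """Effective single-reader throughput once the overhead is included."""
    overhead_s, transfer_s = file_times(
        avg_file_kb, per_file_overhead_ms, stream_mb_per_s
    )
    mb_per_s = (avg_file_kb / 1024.0) / (overhead_s + transfer_s)
    return mb_per_s * 3600.0 / 1024.0


if __name__ == "__main__":
    # Hypothetical numbers: 20 ms per-file overhead, 100 MB/s raw stream speed.
    for label, size_kb in [("50 KB profile file", 50), ("1 GB archive", 1024 * 1024)]:
        frac = overhead_fraction(size_kb, 20, 100)
        rate = throughput_gb_per_hr(size_kb, 20, 100)
        print(f"{label}: {frac:.1%} of time is overhead, ~{rate:.1f} GB/hr")
```

With these assumed numbers almost all of the time on a small file goes to overhead, which is consistent with a Load breakdown where Read dwarfs everything else. The model deliberately ignores what happens when more readers are added, because contention on the source storage is exactly the part that has to be found by experiment.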
