
One of the things that is going to blow up at my company soon is that some jobs are taking over 24 hours to back up their data. The clients with the longest durations are usually Windows or Linux file systems, with the amount of data being backed up varying from 2 TB to 8 TB+. My systems and networking departments don't like to help unless all the evidence points to an actual problem with either the network or the server itself that is causing the long durations.

Is there an easy way on Commvault’s end to point out that “this client is running at X GB/hr, compared to when it was Y GB/hr” and prove that it's not the backup software causing the issue? I know that when an active job runs it shows average throughput, and that double-clicking on the job ID shows percentages (Read: X%, Network: X%), but I don't know whether that shows how much of the job's duration each one is responsible for, or something else.

I want to avoid the tedious work of going back and forth with my departments on each individual server that has higher-than-usual backup durations, without having to make an Excel sheet of the backup history showing where it started to become slow, and without opening a ticket with Commvault every time to prove it's not Commvault.
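The "X GB/hr now versus X GB/hr before" comparison asked about above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV layout, the column names (`client`, `job_id`, `size_gb`, `duration_hours`) and the 50% threshold are hypothetical and do not correspond to any particular Commvault report or export.

```python
import csv
import io
from statistics import median

def throughput_gb_per_hr(size_gb, duration_hours):
    """Average throughput of one job in GB/hr."""
    return size_gb / duration_hours

def flag_slow_jobs(rows, baseline_jobs=5, threshold=0.5):
    """Flag jobs whose throughput fell below threshold * baseline.

    rows: dicts with hypothetical keys client, job_id, size_gb,
    duration_hours, ordered oldest first. The baseline is the median
    throughput of the previous `baseline_jobs` jobs for that client.
    """
    history = {}
    flagged = []
    for row in rows:
        client = row["client"]
        rate = throughput_gb_per_hr(float(row["size_gb"]),
                                    float(row["duration_hours"]))
        past = history.setdefault(client, [])
        if len(past) >= baseline_jobs:
            baseline = median(past[-baseline_jobs:])
            if rate < threshold * baseline:
                flagged.append((client, row["job_id"],
                                round(rate, 1), round(baseline, 1)))
        past.append(rate)
    return flagged

# Made-up sample data in the assumed layout.
SAMPLE = """client,job_id,size_gb,duration_hours
fs01,101,4000,10
fs01,102,4000,10
fs01,103,4100,10
fs01,104,4000,26
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
flagged = flag_slow_jobs(rows, baseline_jobs=3, threshold=0.5)
for client, job_id, rate, baseline in flagged:
    print(f"{client} job {job_id}: {rate} GB/hr vs baseline {baseline} GB/hr")
```

A median baseline is used so that a single unusually fast or slow job does not shift the comparison much. Note that this only shows when a client slowed down, not why.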

Hi @TP_Erickson ,

In the Job Details you may see percentages displayed (depending on software version) for Read, Network, and Deduplication. This is usually a good indicator of job performance.

Checking the Job Attempts tab shows the duration of the “Scan” and “Backup” phases; it can also highlight whether these phases failed and ran multiple times.

 

Have you checked the logs for any statistics counters? You may see them in some logs if you filter for: Stat-
The CvPerfMgr.log on the MediaAgent is also a useful resource for checking job performance stats.
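Filtering a log for the `Stat-` marker can be done with any text tool; a minimal Python sketch is shown here. The sample lines are placeholders, not real Commvault log output, and the file path passed to `stat_lines_from_file` would be whatever log you are inspecting.

```python
def stat_lines(lines, marker="Stat-"):
    """Return only the lines that contain the marker text."""
    return [line.rstrip("\n") for line in lines if marker in line]

def stat_lines_from_file(path, marker="Stat-"):
    """Same filter applied to a log file on disk.

    errors="replace" keeps the scan going if the log contains
    bytes that do not decode cleanly.
    """
    with open(path, errors="replace") as handle:
        return stat_lines(handle, marker)

# Placeholder lines only; real log lines will look different.
sample = [
    "line without counters\n",
    "placeholder text Stat- placeholder counter\n",
]
matches = stat_lines(sample)
for line in matches:
    print(line)
```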

 

For File System backups I’d suggest checking how many Data Readers are configured for the subclient, and also making sure that the option “Allow multiple data readers within a drive or mount point” is enabled. See: Configuring File System Multi-Streaming (commvault.com)

 

If you need to check network speeds between the client and the MediaAgent (MA), you can use this workflow to test them: Testing Network Performance Between a Client and Its Associated MediaAgents (commvault.com)

 

Let us know any findings you have on this.

 

Best Regards,

Michael


@MichaelCapon The CvNetworkTestTool GUI is great; I just downloaded it this morning and tested it out. The only thing that concerns me is that the transfer speeds it reports are not the speeds we are seeing for certain clients.

Now, I am going to be honest: I don't get all the jargon Commvault uses, and I am ashamed because it is something I should know by now (which tells you how fun training is within my company).

Quoting your reply: “For FileSystem Backups I’d suggest to check how many Data Readers are configured for the Subclient, also ensure that the option ‘Allow multiple data readers within a drive or mount point’ is enabled. - Configuring File System Multi-Streaming (commvault.com)”

I don't know the actual purpose of a data reader. Commvault's definition of a data reader, and the internet in general, still leave me confused as to what it does compared to device streams. When using the CvNetworkTestTool this morning, we tried altering the number of data readers on a subclient to see whether increasing or decreasing it from the automatic value Commvault assigns would give different results. We saw little to no change. And I get it: the tool only sends a small sample of data, so it is not the best way to test what the data readers do.

Does a data reader act like a floodgate, allowing more data traffic to come through from the client? Is the number of data readers needed determined by the drives the client has? What happens if I increase the data readers to, let's say, 9999: would that allow the maximum amount of data to be read, or would it kill the client? What happens when “Allow multiple data readers within a drive or mount point” is checked or unchecked? Can one drive have more than one data reader?

Device streams I get. In our situation, where we use tape libraries, we can assign one device stream to one drive, a 1:1 ratio. I'm still rough on the multiplexing factor, but if I remember correctly it allows multiple data streams to share a device stream. So say I have a subclient assigned to a storage policy that has 5 device streams (taking 5 tape heads) with a multiplexing factor of 10. If the average throughput and duration of its backups have deteriorated and we haven't changed the device streams, is it the data readers that can't let data through fast enough, or is it the write speed of the drives? I don't believe it is a write issue because, unless Commvault somehow controls the write speed of the drives itself, the only place drive read/write speed can be altered is on the tape library, which we have not touched.

As for Stat-, thanks for the information; that will be handy to look at during live backups. I'm assuming that what we want to focus on in those logs is the Stat- ID [Pipeline Write time]?

I think you misunderstood my point about the job details. When it shows me the Read, Network, and Deduplication percentages while a job is backing up, is it showing me what is causing the backup to take so long, or is it showing something different? For example:

(Read: 86.97%, Network: 13.03%)

Is the Read figure saying that 86.97% of this backup's duration is down to Commvault taking its time to read/write the data? Is the 13.03% saying that the network has insufficient bandwidth?


@TP_Erickson, think of data readers as connections from the client, and device streams as connections on the library.

If you have a Windows File System client with 2 attached hard drives, you can allocate multiple readers to the subclient (or split them up) to speed up the backups.

Here’s a doc that covers every aspect of streams, etc.  

https://documentation.commvault.com/commvault/v11/article?p=10969.htm

You have the rest figured out. Throwing more streams/readers at the problem is not a solution; it could make things worse. Sending too many riders to a bus with few seats doesn’t mean the passengers arrive quicker (or happier). You want to fit the readers to match the physical data locations and the streams to match the library (generally, you match to the number of spindles).
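The bus analogy can be put into numbers with a toy model. This is a deliberate simplification, not how Commvault actually schedules streams: it assumes each device stream accepts up to the multiplexing factor in data streams, and that readers beyond that capacity simply cannot run concurrently. The function names are made up for illustration.

```python
def max_data_streams(device_streams, multiplexing_factor):
    """Seats on the bus: data streams the library side can accept at once
    (assumed to be device streams times multiplexing factor)."""
    return device_streams * multiplexing_factor

def effective_streams(data_readers, device_streams, multiplexing_factor):
    """Riders who actually get a seat: readers beyond capacity do not
    add concurrency in this simplified model."""
    return min(data_readers,
               max_data_streams(device_streams, multiplexing_factor))

# The example from the question: 5 device streams, multiplexing factor 10.
capacity = max_data_streams(5, 10)
print(capacity)                         # 50
print(effective_streams(9999, 5, 10))   # 50
print(effective_streams(4, 5, 10))      # 4
```

In this model, setting 9999 readers gains nothing over 50, and the real-world cost of extra readers contending for the same disks is not captured at all, which is why matching readers to the physical data layout matters.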

For the job details, I believe you are correct again, though I’d like to see the full output to be sure, if you could share it.
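If you collect those job-detail strings for several clients, a small parser makes it easy to see which stage dominates each job. This sketch assumes, as tentatively agreed above, that the percentages describe how the job's time is split between stages; it only parses text in the shape quoted in this thread, and the function names are hypothetical.

```python
import re

def parse_breakdown(text):
    """Parse text like '(Read: 86.97%, Network: 13.03%)' into a dict
    mapping stage name to percentage."""
    pairs = re.findall(r"([A-Za-z ]+?):\s*([0-9]+(?:\.[0-9]+)?)%", text)
    return {name.strip(): float(value) for name, value in pairs}

def dominant_stage(breakdown):
    """Name of the stage with the largest share."""
    return max(breakdown, key=breakdown.get)

breakdown = parse_breakdown("(Read: 86.97%, Network: 13.03%)")
print(breakdown)
print(dominant_stage(breakdown))
```

For the example quoted in this thread, the dominant stage is Read, which under this assumption would point at the read side of the job rather than at the network.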

