Solved

Aux copy behaviour - scalable resource allocation



Hello all, I have a question about my aux copy behaviour.

We use S3-type storage on site for our backups and aux these to a private cloud provider. I’ve noticed that certain clients’ jobs are consistently skipped in the aux copies. Initially I thought the issue was bandwidth, as the aux copy never completed. We have improved the bandwidth and throughput is much better now, but the aux copies are still not completing. If I go to the secondary copy and show jobs (unticking the time range and excluding “Available”), I see a number of jobs going back weeks, with none of the jobs from an affected client showing as partially available (which would imply it is part way through copying). Interestingly, today I started the aux copy with “Use scalable resource allocation” UNTICKED, and those old jobs were immediately picked up and started copying.

Anyone have any ideas why this would be? I’m curious what impact this will have on my environment. I just don’t get why most jobs were copying and it was somehow not queueing these ones even though it knew they were not copied.

Many thanks!
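The check described above (list the jobs on the secondary copy that are not yet available, then look for clients with nothing part way through) can be sketched as a small script. This is only an illustration: the tuple layout and the status labels "Available", "Partially Available" and "To Be Copied" are assumptions based on the wording in this thread, not a Commvault export format or API.

```python
from collections import defaultdict
from datetime import date


def untouched_clients(jobs):
    """Return {client: oldest pending backup date} for clients whose
    not-yet-available jobs are ALL still "To Be Copied".

    jobs is an iterable of (client, status, backup_date) tuples.
    The layout and status labels are hypothetical.
    """
    pending = defaultdict(list)
    for client, status, backup_date in jobs:
        if status != "Available":  # same as excluding "Available" in the view
            pending[client].append((status, backup_date))

    result = {}
    for client, entries in pending.items():
        # A client counts as skipped only if none of its jobs have started copying.
        if all(status == "To Be Copied" for status, _ in entries):
            result[client] = min(d for _, d in entries)
    return result


if __name__ == "__main__":
    sample = [
        ("sql01", "To Be Copied", date(2022, 4, 1)),
        ("sql01", "To Be Copied", date(2022, 4, 8)),
        ("web01", "Partially Available", date(2022, 4, 20)),
        ("app01", "Available", date(2022, 4, 2)),
    ]
    for client, oldest in untouched_clients(sample).items():
        print(client, "oldest pending job:", oldest)
```

A client that shows up here with an oldest pending date weeks in the past is one the aux copy appears never to have queued, which is the pattern described in the question.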


Best answer by Lucy 3 May 2022, 10:08

11 replies


@Lucy what did you have set for the number of jobs to process?

Auxiliary Copy Advanced - Scalable Resource Allocation Online Help

Job Selection

On upgrading to the latest service pack, this option is selected by default. You can specify the number of backup jobs to be processed during the auxiliary copy operation. For example, if 100 jobs are specified and there are 1000 jobs that need to be copied, then the auxiliary copy job will process 10 batches of 100 jobs each.
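The arithmetic in that example can be written out as a short sketch. The function names are made up for illustration, and the code only reproduces the batch calculation from the help text; it says nothing about how the product schedules batches internally.

```python
import math


def batch_count(jobs_to_copy: int, jobs_per_batch: int) -> int:
    """Number of batches needed to cover jobs_to_copy."""
    if jobs_per_batch <= 0:
        raise ValueError("jobs_per_batch must be positive")
    return math.ceil(jobs_to_copy / jobs_per_batch)


def batch_sizes(jobs_to_copy: int, jobs_per_batch: int) -> list[int]:
    """Size of each batch; the last batch holds any remainder."""
    if jobs_per_batch <= 0:
        raise ValueError("jobs_per_batch must be positive")
    full, rest = divmod(jobs_to_copy, jobs_per_batch)
    return [jobs_per_batch] * full + ([rest] if rest else [])


if __name__ == "__main__":
    print(batch_count(1000, 100))   # the help text example: 10 batches
    print(batch_sizes(3000, 1000))  # three full batches
    print(batch_sizes(700, 1000))   # fewer jobs than the limit: one batch
```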

Total jobs to process

If you have upgraded to the latest service pack, then by default 1000 jobs are processed. Use the arrows to select a desired number of jobs.

Process jobs between

Select this check box to specify a date and time range. Jobs within that range, up to the number specified for the Total jobs to process option, will be processed during the auxiliary copy operation.

Start Time

Specify the scheduled start date and time for the time range.

End Time

Specify the scheduled end date and time for the time range.

Important:

To run an auxiliary copy operation after the jobs in the current time frame are copied, you must create a new schedule or edit the time frame of the existing schedule.

https://documentation.commvault.com/11.24/expert/93901_online_help.html#b93902_auxiliary_copy_advanced_scalable_resource_allocation_online_help
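To make the interaction between "Total jobs to process" and "Process jobs between" concrete, here is a rough model of the selection those options describe. Everything in it is an assumption for illustration: the `BackupJob` record, the `select_jobs` function and the oldest-first ordering are hypothetical and do not come from the Commvault documentation or any Commvault API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class BackupJob:
    # Hypothetical record; not a Commvault data structure.
    job_id: int
    client: str
    finished: datetime


def select_jobs(
    pending: list[BackupJob],
    total_jobs_to_process: int = 1000,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> list[BackupJob]:
    """Pick the jobs one run would process under this simplified model."""
    in_range = [
        job
        for job in pending
        if (start is None or job.finished >= start)
        and (end is None or job.finished <= end)
    ]
    # Assumption: oldest jobs are taken first.
    in_range.sort(key=lambda job: job.finished)
    return in_range[:total_jobs_to_process]
```

Under this model, with no time range set and fewer pending jobs than the limit, every pending job would be selected, so a limit of 1000 on its own would not explain jobs being left behind.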


Thanks @Mike Struening, it’s set to 1000 and no time range is selected. We’re on 11.25 but I’ve been having the issue for months. One aux copy has around 700 jobs to process and the other has just over 3000, but I am seeing the same behaviour on both: some clients are just completely ignored.


@Lucy, how many jobs did it have before it started (and then stopped)? Did it have 1700 or so (as in, it copied the first 1000 and left the remaining 700)?

The setting as you have it will only copy 1000 jobs in a run (date range not factored in), though we need to know how many jobs there WERE to be copied to confirm this was the cause 😎


Hi @Mike Struening, yes, I get what you are saying. When the aux job started there were fewer than 1000 backup jobs waiting to copy, so it’s not the case that it has done 1000 already and is now on the second batch. What I am seeing is that the aux job will run through and get to 99-100%, then drop back down to, say, 80% as it picks up things that have been backed up since the job started. However, if at that point I go and look at what is waiting to be copied, it still has a load of older jobs listed as “to be copied”. I can’t choose recopy against them as it’s greyed out. I’m tearing my hair out.
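The drop from 100% back to around 80% is what you would expect if the percentage is simply copied work divided by total work and the total grows while the job runs. A minimal sketch of that arithmetic follows; it assumes progress is measured in job counts, which is a simplification for illustration (the product may well measure data size instead).

```python
def percent_complete(copied: int, total: int) -> float:
    """Progress as a percentage of the work known to the job right now."""
    if total == 0:
        return 100.0
    return 100.0 * copied / total


if __name__ == "__main__":
    # Everything that was pending at job start has been copied.
    print(percent_complete(800, 800))
    # 200 new backups are picked up mid-run: the total grows, the percentage falls.
    print(percent_complete(800, 1000))
```

Note that this only explains the percentage moving backwards. It does not explain older jobs staying in the to-be-copied list while newer ones are picked up.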

Since yesterday, when I started the job without scalable resource allocation ticked, I can see those older jobs now have “partially copied” beside them (before it was just “to be copied”), but it doesn’t seem to be progressing very fast. I will have another try, limiting the number of streams to try to force it to complete those streams before starting another one.


@Lucy , appreciate you looking into this thoroughly!

The feature definitely does not sound like it is working properly.

Can you open a support case and share the incident number with me so I can track its resolution?


Thanks Mike, I will open a case, but we have our support via a 3rd party provider, so I will see what they say.


That works!  Just be sure to get the Commvault Support case number (I can’t track the 3rd party cases).


Hi @Lucy , following up on this one.

Did you end up getting a case opened?  If so, can you share the resolution/case number?

Thanks!


Hi @Lucy , following up again, hoping you were able to end up getting a case opened.

Can you share the solution or case number?

Thanks!


@Mike Struening Apologies Mike, I did not open a case in the end. On further investigation with our provider, we realised that some of our SQL servers had been configured to take their own local backups in addition to the ones we take with Commvault, which was causing broken backup chains and inflating the size of the backups. We’ve had some success working with the suppliers of those systems to convince them to stop doing that, and we are seeing much better performance. So it hasn’t solved the exact question I asked here, but it has solved the underlying issue that prompted me to ask it, if that makes sense! Thanks for all your help.


That’s great!  Glad you were able to get to the bottom of it 🤓