
Hi Team,

 

I have an interesting scenario where occasional backups report an error like this one:

 

“Backup Pending. Failed to update metadata information on the CommServe”

 

Unfortunately the clients involved vary, so it looks like a generic issue, either with the CommServe, maybe the network, or simply with load.

 

We are a pretty busy environment (3000 jobs per night), so what I really want is some kind of Workload Balancing report. Ideally I would be able to see how many jobs are running at any given time, or at least a rough indication. I could also do with the schedule name, if that is possible.

My theory is that if I'm seeing too many jobs submitted at 7pm, for example, and very few at 9pm, then I can rebalance the schedules more evenly.

 

However, for all the information and reports that are out there, I haven’t yet found a report, or a graph, or whatever which shows this information. We also have a shedload of schedules, so although I can manually and painstakingly wade through everything, some kind of Workload Balancing report would be very helpful.

So, to cut to the chase: is anyone out there aware of a report which shows job and/or schedule activity over, say, a 24-hour period? Anything showing peaks and troughs would be even better, from a visual perspective.

 

Thanks

Hi @MountainGoat,

 

Thank you for your question!  I want to address this question with a multi-part answer:

 

  1. Regarding the Reporting request:
    1. I’ve gone through the existing reports that we have on offer, and it doesn’t appear that we have an existing report that fits the criteria you’re looking for exactly.
    2. Although we do have several ways of reporting on the jobs within a 24-hour period and their start times, none of them expose the configured Schedule Policy that started the job.
    3. This information is of course in our database, so it would be possible to create a report (or amend an existing report) to include this, although I don’t see something that I can provide to you today that has this level of granularity.
  2. Regarding the issue of excessive load:
    1. To reduce the number of job failures during periods of higher load, it may help to set a high watermark on job streams.
    2. This can be set in the CommCell Console under Control Panel > Job Management, which lets you define a maximum number of streams that can run concurrently.
    3. If additional jobs are launched while that stream threshold is met, the jobs will remain in a ‘Waiting’ state until some of the existing streams are completed, at which point they will begin running automatically.
      1. This may significantly cut down the number of job failures you receive as a result of high load.
  3. A potentially useful tip for visualizing schedule policy balance:
    1. We can use a combination of Filtering and Sorting in the CommCell Console to see how many associations we have to each Schedule Policy.
    2. This could be useful assuming your Schedule Policies are defined by their job start time.
    3. For example, you can navigate to the Schedule Policies section of the UI and Filter the Type column for [Data Protection].
    4. Next, you can Sort by the [Associated Objects] column, which is the total number of associations to that schedule.
      1. This may not be perfect, but it will at least show at-a-glance how many backups are going to be triggered by each schedule, which may highlight which schedules are being more heavily utilized than others.
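
Until a built-in report exists, the start-time view from point 1 can be roughly approximated outside the product. The sketch below is only an illustration: it assumes you have already exported a list of jobs with a schedule name and a start time, and the `jobs` data, the field layout, and the timestamp format are all hypothetical rather than an actual Commvault export format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical export: (schedule name, job start time) pairs.
# The layout and timestamp format are assumptions, not a Commvault schema.
jobs = [
    ("Nightly-7pm", "2024-01-10 19:00:05"),
    ("Nightly-7pm", "2024-01-10 19:02:41"),
    ("Nightly-7pm", "2024-01-10 19:30:00"),
    ("Nightly-9pm", "2024-01-10 21:00:10"),
]

def starts_per_hour(rows):
    """Count job starts per (hour of day, schedule name)."""
    counts = Counter()
    for schedule, started in rows:
        hour = datetime.strptime(started, "%Y-%m-%d %H:%M:%S").hour
        counts[(hour, schedule)] += 1
    return counts

def print_histogram(counts):
    """Print one text bar per hour so peaks and troughs stand out."""
    per_hour = Counter()
    for (hour, _schedule), n in counts.items():
        per_hour[hour] += n
    for hour in range(24):
        print(f"{hour:02d}:00 {'#' * per_hour[hour]} {per_hour[hour]}")

counts = starts_per_hour(jobs)
print_histogram(counts)
```

Keeping the schedule name in the key means the same counts can be broken down per schedule when one hour looks overloaded.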

 

I think your initial ask for an ability to report on this is a valid one.  As it seems we don’t have an offering for this today, I will discuss this internally to see if it’s something we can add to the product in the future.
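
In the meantime, the "how many jobs are running at any given time" part of the ask could be estimated from job start and end times with a simple sweep over the events. This is a minimal sketch under stated assumptions: the `job_windows` list and its timestamp format are made up for illustration, and how you obtain the start and end times from your own environment is left open.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# Hypothetical (start, end) times for finished jobs; the format is an assumption.
job_windows = [
    ("2024-01-10 19:00", "2024-01-10 20:30"),
    ("2024-01-10 19:15", "2024-01-10 19:45"),
    ("2024-01-10 19:30", "2024-01-10 21:00"),
    ("2024-01-10 21:00", "2024-01-10 22:00"),
]

def peak_concurrency(windows):
    """Return (peak count, time the peak was first reached)."""
    events = []
    for start, end in windows:
        events.append((datetime.strptime(start, FMT), 1))
        events.append((datetime.strptime(end, FMT), -1))
    # Sorting tuples puts -1 before +1 at equal timestamps, so a job that
    # ends at 21:00 is not counted as overlapping one that starts at 21:00.
    events.sort()
    running = peak = 0
    peak_time = None
    for when, delta in events:
        running += delta
        if running > peak:
            peak, peak_time = running, when
    return peak, peak_time

peak, when = peak_concurrency(job_windows)
print(f"Peak of {peak} concurrent jobs first reached at {when:%H:%M}")
```

Comparing the peak against the configured stream high watermark would give a rough idea of how often jobs are likely to queue.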

 

-Brian Bruno


Thanks for the thorough feedback, Brian.

 

I have already tinkered with the high watermark, and the same for streams per MA.

There are quite a few ways of working through congestion, but a report showing busy periods would be a good complement to these troubleshooting exercises.

 

I will look at the schedule policies and their associations as per point 3.

 

Regards.

 

 

