@sauvegarde , do you mind sharing the hardware specs for the Commserve? Also, how many other jobs are running at this time?
This will help us narrow things down.
Hello Mike,
Below are the hardware specifications of the CommServ:
OS: Windows 2012 R2
CPU: 4 CPUs
RAM: 24 GB
To give the full picture of my test environment, I also use a VSA proxy and 4 MediaAgents.
Proxy VSA: Windows 2012 R2
MA: Red Hat 7.6
During my tests, no other Commvault jobs are running.
This behavior with the REST API is strange, because triggering 50 subclients simultaneously should not be a problem.
Hi @sauvegarde,
Just to make sure we all understand (mostly myself), can you confirm or correct each statement:
- you use Python REST API scripts to initiate/control backup jobs?
- you initiate this from the CommServe itself, from each client, or from elsewhere?
- when you have 5 jobs in parallel, it works fine?
- when you try to run many more jobs in parallel, it becomes sluggish?
Hello @Laurent
I will try to answer each point as best I can.
- Are you using Python REST API scripts to launch/control backup tasks?
To clarify this point, the script used is based on the following examples: https://documentation.commvault.com/v11/essential/45532_samples_for_developer_sdk_for_python.html#polling-job-status
It triggers a job based on the parameters it is given; for my tests, the script is called X times with different parameters to run the backups.
- You initiate this from the CommServe itself, from each client, or from elsewhere?
The script is triggered from an external server (a scheduler server). The goal of my test is to replace the QCommand backups with REST API backups.
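In essence, the call the script builds boils down to something like this. This is a simplified sketch, not my actual script: the `POST /Subclient/{subclientId}/action/backup` endpoint, the `backupLevel` parameter, and the `Authtoken` header (which carries the token returned by `/Login`) come from Commvault's v11 REST API documentation, while `backup_request` and the sample values are placeholders of my own:

```python
def backup_request(base_url, token, subclient_id, backup_level="Incremental"):
    """Build the pieces of a Commvault v11 REST backup trigger:
    POST {base_url}/Subclient/{subclientId}/action/backup?backupLevel=...
    The Authtoken header carries the QSDK token obtained from /Login."""
    url = f"{base_url}/Subclient/{subclient_id}/action/backup"
    headers = {"Authtoken": token, "Accept": "application/json"}
    params = {"backupLevel": backup_level}
    return url, headers, params

# The actual call would then be something like:
#   import requests
#   url, headers, params = backup_request("https://cs/webconsole/api", token, 1234)
#   resp = requests.post(url, headers=headers, params=params)
```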
- When you have 5 jobs in parallel, it’s working fine?
It works great!
- When you try to have lots more jobs in parallel, then it’s sluggish?
I defined 3 steps to validate the REST API with Python:
- 50 concurrent subclients
- 100 concurrent subclients
- 200 concurrent subclients
When I trigger 50 concurrent subclients with the REST API, I notice on my CommServe that the jobs appear 6 by 6, and it takes 7 minutes for all 50 backups to be running.
That's why I'd like feedback from anyone using the REST API with Python who has seen a similar anomaly. :)
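The fan-out itself is just a bounded pool of worker threads, roughly this shape (again a simplified sketch, not my real script; `trigger_backup` stands in for the actual SDK/REST call):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def trigger_backup(subclient_name):
    # Stand-in for the real call, e.g. with the Python SDK:
    #   backupset.subclients.get(subclient_name).backup("Incremental")
    return f"job-for-{subclient_name}"

def trigger_all(subclient_names, max_workers=10):
    """Submit one backup request per subclient and collect the results.
    Client-side concurrency is bounded by max_workers; any '6 by 6'
    batching beyond that would have to come from the server side."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(trigger_backup, name): name
                   for name in subclient_names}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    return results
```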
Thanks for your explanations. I now understand your concern a bit better.
In my environment, we also have an external scheduler, so we needed to interact with the CommServe to initiate and control jobs.
Our scripting experts had created custom Perl scripts to generate REST API queries and interact properly: initiate backup jobs for subclients, poll their status, and so on.
Since most of the backups ran overnight, we began to suffer from issues during high-load time slots, with many backups still running (and therefore being polled for their status every 60s) while new jobs were being added. We got a lot of 500 and 503 errors…
The web server logs were full of such queries at peak activity times.
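For anyone hitting the same wall: spreading the polling out with exponential backoff on 5xx responses helps take pressure off the web server. A rough sketch, not our actual script (`check_status` stands in for the real status query, and the delays are illustrative):

```python
import random

def poll_until_done(check_status, sleep, base_delay=60, cap=600, max_retries=5):
    """Poll a job-status callable; when the web server answers 500/503,
    back off exponentially with jitter instead of retrying every 60s flat."""
    delay = base_delay
    retries = 0
    while True:
        code, body = check_status()
        if code not in (500, 503):
            return code, body
        if retries >= max_retries:
            raise RuntimeError("web server still returning 5xx, giving up")
        sleep(min(cap, delay) * random.uniform(1.0, 1.5))  # jittered wait
        delay *= 2
        retries += 1
```

Passing `sleep` in as a parameter keeps the function testable and lets you swap in an async-friendly wait.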
We then tried to use Commvault’s internal scheduler to offload the web server, and to make sure we really schedule our backups (all servers _are_ protected; otherwise they’ll show up in the SLAs/reports…).
The strange detail you’re pointing out is that the jobs appear ‘6 by 6’, not 5 or 10…
Well, I’ll follow this topic to see where this could be coming from…
Appreciate the details (and the help from @Laurent as always)!
Might be worth seeing if anyone else has input, though I’m leaning towards creating a support case. I would not expect things to slow to a crawl like that. Granted, there is some extra activity from the overhead, but enough to cause the slowness you are seeing? I don’t believe so.
If you end up creating a support case, share the incident number here so I can track it.
Hi @sauvegarde , hope all is well!
Did you end up opening a support case for this? Were you able to find a solution?
@sauvegarde I don’t see a support case in your name. Were you able to resolve the issue on your own?
One thing to consider is upgrading your OS and SQL Server version; SQL Server 2019 includes improvements to database efficiency.