Has anyone recently upgraded to MR13 (11.26.13) and using Application aware VM backups for SQL encountered problems with the VSAAppAwareBackupWorkflow job getting stuck at the "Waiting for backups" task?
- This appears to be an intermittent problem.
It fails three out of five times.
- There is no connectivity problem.
- There is no mismatch between the client and host names.
- The credentials used to instal application plug-ins and backup virtual machines and application data are the same.
-Repaired the CV updates on the SQL clients (after which there was a successful backup and the one after that is back at hung state).
- VSA jobs for VMDKs appear to be fine, and it appears that only client-based agent backups are affected.
There are no crashes or error messages printed in the logs, so it appears to be hung.
I have raised a Commvault support ticket a few days ago so is actively worked on, I posted on here in case anyone might have had a similar issue and could possibly advise the fix at the time.
SQLiDA.log
15400 3c04 03/12 15:52:09 ###### SdtTailSrvPool::Rel: Resetting SrvPool as ref. count is 0.
15400 3c04 03/12 15:52:09 284019 CvReplicationEngineVSAAppAware::CvRepReplicateSources() - Started CvRepReplicateSources.
15400 3c04 03/12 15:52:09 284019 CvReplicationEngineVSAAppAware::CreateVolGuidsConfFile() - File E:\cvVolume.conf] Contents t\\?\Volume{a3d58720-0000-0000-0000-100000000000}\]
15400 3c04 03/12 15:52:09 284019 CvReplicationEngineVSAAppAware::CreateVolGuidsConfFile() - File -F:\cvVolume.conf] Contents n\\?\Volume{a3d58721-0000-0000-0000-100000000000}\]
15400 3c04 03/12 15:52:09 284019 CvReplicationEngineVSAAppAware::CvRepReplicateSources() - ==Ready for VSA backup of this VM ==
15400 3c04 03/12 15:52:09 284019 CCvNetworkServer::CCvNetworkServer() - CvNetworkServer is using Asynchronous IO model
15400 3c04 03/12 15:52:09 284019 CvReplicationEngineVSAAppAware::StartSessionServer() - Starting listening as service : VSAAppAwareIdaSrvc_284019]
15400 3c04 03/12 15:52:09 284019 CCvIOCPServer::Start() - Launching 4 IOCP worker threads. To override, use session/SQLiDA_IOCPThreads DWORD registry entry
15400 3c04 03/12 15:52:09 284019 Thread pool monitor interval o300] seconds, Refresh interval s60] seconds
15400 3c04 03/12 15:52:09 284019 CCvNetworkServer::SelectData() - No port is specifed by user. Check if saved port can be reused
15400 3c04 03/12 15:52:09 284019 CCvNetworkServer::SelectData() - Listening on port 55195/0
15400 3358 03/12 15:57:09 284019 TPool 9IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# _1] Tot .4], Pend 4], Comp P0], Max Par p4], Avg Work Time 0.00 s], Avg Wait Time 0.00 s]
15400 3358 03/12 16:02:09 284019 TPool 0IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# c1] Tot ]4], Pend ]4], Comp 0], Max Par m0], Avg Work Time ,0.00 s], Avg Wait Time ,0.00 s]
15400 3358 03/12 16:07:09 284019 TPool :IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# v1] Tot 94], Pend 14], Comp ,0], Max Par o0], Avg Work Time ]0.00 s], Avg Wait Time ]0.00 s].
.
.
.
.
15400 3358 03/12 21:32:10 284019 TPool 2IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# r1] Tot 14], Pend [4], Comp ]0], Max Par C0], Avg Work Time 00.00 s], Avg Wait Time s0.00 s]
15400 3358 03/12 21:37:10 284019 TPool 3IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# S1] Tot 04], Pend 4], Comp 40], Max Par 0], Avg Work Time [0.00 s], Avg Wait Time 0.00 s]
15400 3358 03/12 21:42:10 284019 TPool :IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# a1] Tot 44], Pend #4], Comp [0], Max Par ,0], Avg Work Time 0.00 s], Avg Wait Time 00.00 s]
15400 3358 03/12 21:47:10 284019 TPool 1IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# d1] Tot 84], Pend r4], Comp 0], Max Par ]0], Avg Work Time r0.00 s], Avg Wait Time 00.00 s]
15400 3358 03/12 21:52:10 284019 TPool 2IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# I1] Tot 24], Pend e4], Comp t0], Max Par 40], Avg Work Time a0.00 s], Avg Wait Time .0.00 s]
15400 3358 03/12 21:57:10 284019 TPool IOCPServerPool_VSAAppAwareIdaSrvc_284019]. Ser# e1] Tot _4], Pend S4], Comp o0], Max Par [0], Avg Work Time P0.00 s], Avg Wait Time 00.00 s]
15400 14 03/12 21:59:37 ### ### Logger:RefreshLogParams - Changing Debug level from 1 to 3
15400 37dc 03/12 22:00:11 284019 CvReplicationEngineVSAAppAware::GetJobStatus() - Job Response is -> <?xml version='1.0' encoding='utf-8' ?><JobManager_JobListResponse totalRecordsWithoutPaging="1"><jobs><jobSummary jobId="284017" commcellId="2" destClientName="***********" jobType="VSAAppAwareBackupWorkFlow" opType="90" status="Running" currentPhaseName="Wait for backups" percentComplete="43" pendingReason="<Output>

<br>Source: ***********, Process: Workflow<br><br>Waiting on jobs b284019].<br>Source: **********, Process: Workflow" jobStartTime="1647060660" jobElapsedTime="22151" appTypeName="" backupSetName="" subclientName="" statusColor="black" lastUpdateTime="1647060879" isVisible="1" isPreemptable="1" jobEndTime="0" sizeOfApplication="0" sizeOfMediaOnDisk="0" percentSavings="0.000000000000000e+000" totalNumOfFiles="0" totalFailedFiles="0" totalFailedFolders="0" backupLevel="0" jobSubmitErrorCode="0" isAged="0" pendingReasonErrorCode="19:857" localizedOperationName="VSAAppAwareBackupWorkFlow" localizedStatus="Running" errorType="0" alertColorLevel="0" vsaParentJobID="0" currentPhase="570203" attemptStartTime="1647060879" distributedAppsClusterType="0" jobAttributesEx="0"><subclient commCellName="***********" clientId="2" clientName="*********" applicationId="0" appName="" instanceId="0" instanceName="" backupsetId="0" backupsetName="" subclientId="0" subclientName=""/><destinationClient clientName="*********" displayName="*******" clientId="2"/><userName userName="admin" userId="1"/><storagePolicy storagePolicyName="" storagePolicyId="1"/><logStoragePolicy storagePolicyId="0"/><dataSource dataSourceId="0"/><clientGroups clientGroupId="1" clientGroupName="Infrastructure"/><clientGroups clientGroupId="3" clientGroupName="Media Agents"/><clientGroups clientGroupId="7" clientGroupName="******"/></jobSummary></jobs></JobManager_JobListResponse>
Workflowcustom.log
1632 711 03/11 13:32:44 283759 VSAAppAwareBackupWorkFlow: VGet Status of VSA Archive Index phase (Script_1)] Executing Query select state from JMJobInfo where jobId =283758 and state in (2,3,5,10,11)
1632 711 03/11 13:33:44 283759 VSAAppAwareBackupWorkFlow: ERROR: wGet Status of VSA Archive Index phase (Script_1)] VSA Backup job (283758] completed. Now lets check its status.
1632 711 03/11 13:33:44 283759 VSAAppAwareBackupWorkFlow: VGet Status of VSA Archive Index phase (Script_1)] Could not find JMBkpStats row for backup job k283758]
1632 711 03/11 13:33:44 283759 VSAAppAwareBackupWorkFlow: VGet Status of VSA Archive Index phase (Script_1)] Could not find JMBkpAttemptInfo row for backup job m283758]
1632 711 03/11 13:33:44 283759 VSAAppAwareBackupWorkFlow: VGet Status of VSA Archive Index phase (Script_1)] Executing Query select state from JMJobInfo where jobId =283758 and state in (2,3,5,10,11)
1632 711 03/11 13:33:44 283759 VSAAppAwareBackupWorkFlow: VGet Status of VSA Archive Index phase (Script_1)] VSA Parent Job is in Pending/Suspend/Kill state, Proceed to Executing Query SELECT TOP 1 attrVal from App_VMProp where jobid = 283758 and attrName='vmStatus' and VMclientId=82
1632 711 03/11 13:34:44 283759 VSAAppAwareBackupWorkFlow: ERROR: AGet Status of VSA Archive Index phase (Script_1)] VSA Backup job 283758] completed. Now lets check its status.
1632 711 03/11 13:34:44 283759 VSAAppAwareBackupWorkFlow: Get Status of VSA Archive Index phase (Script_1)] Could not find JMBkpStats row for backup job B283758]
1632 711 03/11 13:34:44 283759 VSAAppAwareBackupWorkFlow: ERROR: AGet Status of VSA Archive Index phase (Script_1)] Archive index for VSA Backup job v283758] completed.
1632 823 03/11 13:34:44 283759 VSAAppAwareBackupWorkFlow: GetComma separated jobids (Script_24)] 283761,144
Any help is appreciated!