Solved

Email Alert notifications for VM Backup Failure



Hi,

 

We are running v11.22.13 and the alerts have changed behaviour from v11.20.x. We used to get an email formatted as html with the general job status and an attachment as text with the failed vm name and failure reason. This is no longer the case.

Now we get an attachment in html form which is just a repeat (copy) of the main email. So not helpful. Also I would prefer the attachment be in plaintext not html.

I’ve played around with a variety of alert settings such as sending as text instead of html / with or without “detail as attachment”. I have included the various “VM xxxx” tokens. Nothing has produced the list of failed vm’s. Is this working for anyone running the same version as I am?

Cheers,

Ron


Best answer by Javier 17 June 2021, 12:09


23 replies


Hi @bRonDoh 

Have you tried the “Send Individual notification for backed up Discovered Virtual machine” option?

 

This will of course send a separate email for individual VM backup failure with details.

Or are you trying to receive a list of failed VMs from the VSA parent job?

Thanks,

Stuart


Hi Stuart,

  1. Yes I have tried that. Also tried the “Send individual notification for this alert”. And tried both options enabled.
  2. Yes I do want a list of just vm’s that failed and/or had errors, such as a failure to quiesce the vm. I would like it as either a text file attached to the email alert (this is how it worked previously) or as a list within the body text of the email itself.

Another observation is that the tokens for VM information do not seem to be working. For example, the alerts in text format show “Not Available” for these tokens, regardless of the success/failure state of any vm in the job:

 <VIRTUAL MACHINE NAME>, <VIRTUAL MACHINE HOST NAME>, <VM STATUS>, <VM FAILURE REASON>, <ADDITIONAL VM INFO>

Also the alert can have a status of “Job Succeeded with errors” and yet the “Failure count” is set to 0 (zero) on an incremental backup of a vm without the vmtools running, which would force a quiesce error. If the job is a full then the “Failure count” is set to 1. But in no case is there anything about which vm’s failed or why.

Cheers,

Ron


Hi

 

May I know whether this is VSA V1 or V2? Can you please also confirm that the VMs being backed up are part of the entity selection list in the alert?

 

Thanks

Madhu


Hi Madhu,

 

How do I find whether it is VSA v1 or v2? I can tell you that it is running v11.22.13 on Windows Server 2012R2 (both for the CommCell and the VSA actually). The entity selection list (Monitored Nodes) is set to the subclient I created “Ad-Hoc”. There is no option to select VMs, but the subclient has two vm’s selected by name. One of the vm’s has the vmware tools disabled to force a quiesce error during backup. The alert says the job succeeded with errors and has a failure count of 1 when I do a full backup. But the emailed alert does not provide the details as to which vm failed nor why. This state of reporting is not what existed previously and is affecting all my VM backup jobs, not just this ad hoc test subclient.

Cheers,

Ron


Hi Ron

 

I wanted to know the indexing version of the VSA hypervisor, whether it’s V1 or V2.

  1. Are you looking for VM-specific alerts by selecting the option “Send Individual notification for backed up Discovered Virtual machine” on the alert?
  2. Or do you have the <FAILED OBJECTS> token added as part of your email and expect all failed VMs to be attached to your email?

If you want VM-specific alerts (option 1), then the VMs also need to be selected. To show VMs in the client computer list, you can follow the documentation link https://documentation.commvault.com/commvault/v11_sp20/article?p=36317.htm. Go to User Preferences from the Control Panel, open the Client Computers tab and select the option “Show Virtual machines”. Log in to the CommCell Console again and you should be able to select VMs from the entity selection list of the alert. This requirement has always been there and is not a recent change.


Hi Madhu,

How do I determine if a VSA subclient is using indexing version 1 or 2?

I am looking for the previous behaviour where a list of failed vm’s are attached to the email alert. Alternatively it would be fine if the failed vm’s were listed in the body of the email, whether the email is in html or text format.

The options I have tried are outlined above in my reply to Stuart.

I do have the <Failed Objects> token along with all the <VM xxx> tokens - these are part of the default message.

It seems that your suggestion to select specific vm’s would not work in our case where the vm’s in the subclient are selected by tags - this is how our production vm backup subclients are set up. I will investigate your suggestion for the “Ad-Hoc” subclient which is not using tags, but again this lack of reporting is a change in behaviour. Note also that previously the attachment used to be in plain text, whether the email alert was sent in html or text format. However now the attachment is in html format. As well, the attachment is nothing more than a copy of the email body text - there is no ‘detail’ in the attachment - this is also a change in behaviour from 11.20.

Can the desired behaviour - failed vm’s listed in the body text of the alert or in an attachment - be achieved? If so what are the required settings?

Thanks,

Ron


Test condition #1 :

Entities Selection : individual client vm’s (Test-SLES, SexiGraf)

Alert Criteria : Job Failed,  Job Skipped,  Job Succeeded,  Delayed by 1 Hrs,  Job Succeeded with Errors, Send Individual notification for each backed up Discovered Virtual machine

Reporting Criteria : Immediate Notification, Notify only when job contains failed objects.

Result : No emails sent for any client. Job actually has one successful client (SexiGraf) and one failed client (Test-SLES).

 

Test condition #2 :

Entities Selection : individual client vm’s (Test-SLES, SexiGraf)

Alert Criteria : Job Failed,  Job Skipped,  Job Succeeded,  Delayed by 1 Hrs,  Job Succeeded with Errors, Send Individual notification for each backed up Discovered Virtual machine

Reporting Criteria : Immediate Notification

Result : One email sent with ALL clients listed, both successful and failed. Not exactly the desired behaviour.

 

Test condition #3 :

Entities Selection : VSA SubClient “Ad-Hoc” which has two manually selected/named vm’s.

Alert Criteria : Job Failed,  Job Skipped,  Job Succeeded,  Delayed by 1 Hrs,  Job Succeeded with Errors, Send Individual notification for each backed up Discovered Virtual machine

Reporting Criteria : Immediate Notification

Result : One email sent with NO client vm’s listed, neither successful nor failed. Attachment is an html duplicate of the html email. Previous versions of the alert would have attached a text file of the failed vm’s.


Hi Ron

 

Since what you are looking for is Test condition #3, we will try it internally to see if the issue is reproduced here. Please give me a day or two to check internally.

Thanks

Madhu


Hi Madhu,

Actually Test Condition #3 is NOT what we are looking for. These are just me reporting on the outcome of various configurations in an attempt to verify the behaviour of various settings. Then that behaviour can be tested by, and compared to, others. Also maybe Commvault techs can see if the reported outcome is congruent with the expected outcome.

So what are we looking for? The following :

Desired Condition #0 :

Entities Selection : VSA SubClient “Windows VM’s” which uses vmware Tags to select vm’s for backup, eg “Windows Monthly”.

Alert Criteria : Job Failed,  Job Skipped,  Job Succeeded,  Delayed by 1 Hrs,  Job Succeeded with Errors

    NOTE : I do not have a “Desired” setting for the options “Send individual notification for each backed up Discovered Virtual machine” as I don’t know what this setting will actually, or is supposed to, produce. Furthermore there is the matter of the subject criteria when this option is enabled - ie the “Notification Criteria” such as “Notify only when job contains failed objects”, should that be enabled or disabled to get the result we want and used to get. That is part of this reporting, testing and discovery process we are in :-).

Reporting Criteria : Immediate Notification

DESIRED Result : One email sent with a list of FAILED/ERRORED client vm’s. List can be either in the body text of the email (regardless of whether the body text is formatted as html or plain text) or as a text (not html) attachment.

I hope that this makes clear the goal we are trying to achieve. Also I really appreciate the time and effort you and others have put into this investigation.

Cheers,

Ron
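
The desired result described above (one list containing only the failed or errored vm’s, as plain text) can be illustrated with a minimal Python sketch. This is not Commvault code: the `VmResult` record, the `PROBLEM_STATUSES` labels, the tab-separated layout and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Status labels treated as "needs attention". These strings are
# illustrative assumptions, not values confirmed by Commvault.
PROBLEM_STATUSES = {"Failed", "Completed w/ one or more errors"}


@dataclass
class VmResult:
    """Hypothetical per-VM outcome of a backup job."""
    name: str
    status: str
    reason: str = ""


def failed_vm_report(results):
    """Return a plain-text list with one line per failed or errored VM."""
    lines = []
    for r in results:
        if r.status in PROBLEM_STATUSES:
            lines.append(f"{r.name}\t{r.status}\t{r.reason}")
    return "\n".join(lines)


# Made-up sample data: one failed VM and one successful VM.
sample = [
    VmResult("Test-SLES", "Failed",
             "Unable to quiesce guest file system during snapshot creation"),
    VmResult("SexiGraf", "Completed"),
]
report = failed_vm_report(sample)
print(report)
```

Only the failed vm appears in the output; the successful one is filtered out, which is the difference from the result of Test condition #2.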

 


Hi Ron

 

Can you please open a support incident? That would make it easier to track.

 

Thanks

Madhu


@bRonDoh to check the version of the VSA indexing, select your VSA client, right-click on it to get its properties, and you should see the Indexing version in the Client Information part.

Below is an example from the Commcell console of a v1 VSA client.

 

Regarding the mail result, do you expect this one below ?


Backup Job Summary Report
Backup Jobs Failure - Last 24h

Report generated on 04/06/2021 9:00:03
Version: 11
CommCell ID: xxxxx
CommCell:  yyyyyyyyyyyyyyy

-- Report Criteria --

  • Job ID:All
  • Group By:Client
  • Backup Types:Full, Incremental, Differential, Synthetic Full
  • Job Status:Failed
  • Include:Failure Reason, Job Description, Include Disabled Activity, Include Deconfigured, Protected Virtual Machines, Global Storage Policies
  • MediaAgents:All
  • Computers:All
  • Agent Types:All
  • Storage Policy (Copy):All
  • Throughput Unit:GB/Hour
  • Locale:English
  • Show Time In TimeZone: CommServe
  • Last 24 Hours

 

Summary
Client | Host Name | Total Jobs | Unsuccessful | Size of Application (Compression Rate) | Data Written (Space Saving Percentage) | Start Time | End Time | Protected Objects | Failed Objects | Failed Folders
All | | 3 | 3 | 0 (0%) | 0 (0%) | 04/05/2021 21:00:24 | 04/06/2021 3:03:22 | 0 | 0 | 0
client | hostname | 3 | 3 | 0 (0%) | 0 (0%) | 04/05/2021 21:00:24 | 04/06/2021 3:03:22 | 0 | 0 | 0


 

  • Client: Client
  • Agent / Instance: Virtual Server / VMware
  • Backup Set / Subclient: backupset / yourmachine
  • Job ID (CommCell) (Status): 3245214* (F)
  • Type: INCR
  • Scan Type: N/A
  • Start Time (Write Start Time): 04/06/2021 3:00:26
  • End Time or Current Phase (Write End Time): 04/06/2021 3:03:22
  • Size of Application (Compression Rate): 0 (0%)
  • Data Transferred: 0
  • Data Written (Space Saving Percentage): 0 (N/A)
  • Data Size Change: 0%
  • Transfer Time (Current): 0:00:00
  • Throughput (GB/Hour) (Current): 0
  • Protected Objects: 0
  • Failed Objects: 0
  • Failed Folders: 0
Failure Reason:
  • ERROR CODE [91:53]: No virtual machines were discovered for this Subclient. Please check the Subclient content.
    Source: client, Process: vsdiscovery
Protected Virtual Machines:
No protected Virtual Machines found for this job

 

 


Hi All,

Sorry for the delayed reply - I was hit by a flu-like bug (not Covid-19 thankfully) last week.

To answer Madhu’s question regarding indexing version of the VSA - it is v1. Thanks to Laurent for supplying the clue to find this tidbit of info. Also Laurent : no that is not what I am looking for. I think the difference is that you are generating a “report” whereas I am using the job alert via email function. I expect to get an email with a list of vm’s that failed and the reason they failed. Thanks for your help!

Ron


Glad you are better!

Let me know if I have your desired outcome correct:

  • You want an Alert (not a report) emailed with a list of vm’s that failed and the reason they failed.
  • Whereas reports may exist, that’s not what you are looking for: you just want a plaintext alert that lists out each vm that failed to get backed up and why it failed

Is that correct?  We’re likely going to have to request a CMR from dev, so the better detailed a write up I can pass along, the more likely it will be permitted to enter.


Hi Mike,

Yes, I would like the Job Notification to send an email. I would like the email to contain information about vm’s that “failed” or “succeeded with errors”. The list of these vm’s could be in the body text of the email or as a text attachment to the email. The email itself can be either html or text formatted. This is what the prior version (ie v11.20) produced - an html formatted email with a plaintext formatted attachment.

This should not require a CMR since the desired output was already being produced under v11.20. Also there are tokens that suggest the ability to produce this content as given in the example below:

 Alert: <ALERT NAME>
 Type: <ALERT CATEGORY - ALERT TYPE>
     Detected Criteria: <DETECTED CRITERIA>
     Detected Time: <TIME>
     CommCell: <COMMCELL NAME>
     User: <USER NAME>
 
      Job ID: <JOB ID>
     Status: <STATUS>
     Client: <CLIENT NAME>
     Agent Type: <AGENT TYPE NAME>
     Instance: <INSTANCE NAME>
     Backup Set: <BACKUPSET NAME>
     Subclient: <SUBCLIENT NAME>
     Backup Level: <LEVEL>
     Storage Policies Used: <STORAGE POLICIES USED>

     Scheduled Time: <SCHEDULE TIME>
     Start Time: <START TIME>
     End Time: <END TIME>
     XFER Time: <XFER TIME>
     Bytes Written: <DATA WRITTEN>

     Error Code: <ERR CODE>
     Failure Reason: <FAILURE REASON>
     Protected Counts: <PROTECTED COUNT>
     Failed Counts: <FAILED COUNT>

     <ADDITIONAL VM INFO>
                       VM Name:   <VIRTUAL MACHINE NAME>
                       VM Status:  <VM STATUS>
                       Failure Reason: <VM FAILURE REASON>

 

And here is the actual email sent based on the above. Notice the “Not Applicable” entries despite the “Failed Counts:” being non-zero:

Alert: Data Protection - ESX - Windows 
 Type: Job Management - Data Protection 
          Detected Criteria: Job Succeeded with Errors 
          Detected Time: Sun Apr 11 20:16:52 2021 
          CommCell: cv01 
          User: Ron Neilly 
 
          Job ID: 37217 
          Status: Completed w/ one or more errors 
          Client: vCenter-w36 
          Agent Type: Virtual Server 
          Instance: VMware 
          Backup Set: defaultBackupSet 
          Subclient: Windows 
          Backup Level: Incremental 
          Storage Policies Used: Monthly-Full-to-Disk-Keep-12 
 
          Scheduled Time: Sun Apr 11 19:10:03 2021 
          Start Time: Sun Apr 11 19:10:06 2021 
          End Time: Sun Apr 11 20:16:51 2021 
          XFER Time: 0 Hour(s), 50 Minute(s), 42 Second(s) 
          Bytes Written: 228.92 GB 
 
          Error Code: Not Applicable 
          Failure Reason: Not Applicable 
          Protected Counts: 661190 
          Failed Counts: 2 
 
          Not Applicable
                       VM Name:   Not Applicable
                       VM Status:  Not Applicable
                       Failure Reason: Not Applicable
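
The mismatch in the email above (a non-zero “Failed Counts” while the VM fields still read “Not Applicable”) can be checked mechanically on a received plain-text alert. The following is a minimal Python sketch, not part of Commvault: it assumes the alert body consists of “Key: value” lines as in the email above, and the helper names `parse_alert_body` and `unresolved_vm_fields` are hypothetical.

```python
def parse_alert_body(text):
    """Split the 'Key: value' lines of a plain-text alert into a dict.

    Splits on the first colon only, so timestamps in the value survive.
    A repeated key (for example 'Failure Reason', which appears at job
    level and at VM level) keeps the LAST value seen.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key:
            fields[key.strip()] = value.strip()
    return fields


def unresolved_vm_fields(fields, placeholder="Not Applicable"):
    """List VM fields still holding the placeholder although failures were counted."""
    failed = fields.get("Failed Counts", "0")
    if not failed.isdigit() or int(failed) == 0:
        return []
    vm_keys = ("VM Name", "VM Status", "Failure Reason")
    return [k for k in vm_keys if fields.get(k) == placeholder]


# Excerpt of the alert text shown above.
sample = """\
Status: Completed w/ one or more errors
Failed Counts: 2
VM Name:   Not Applicable
VM Status:  Not Applicable
Failure Reason: Not Applicable
"""
fields = parse_alert_body(sample)
missing = unresolved_vm_fields(fields)
print(missing)
```

For the excerpt this reports all three VM fields as unresolved, which matches the symptom described in this post.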

 

And for comparison purposes here is an email alert that has been working for years - sent as an html email with a plaintext attachment of the failed objects :

 
 
 
 
 
Data Protection - ESX - Windows
  • Type: Job Management - Data Protection
  • Detected Criteria: Job Succeeded with Errors
  • Detected Time: Sat Jan 9 19:55:14 2021
  • User: Ron Neilly
  • Job ID: 35194
  • Status: Completed w/ one or more errors
  • Client: vCenter-w36
  • Agent Type: Virtual Server
  • Instance: VMware
  • Backup Set: defaultBackupSet
  • Subclient: Windows
  • Backup Level: Incremental
  • Storage Policies Used: Monthly-Full-to-Disk-Keep-12
  • Scheduled Time: Sat Jan 9 19:00:09 2021
  • Start Time: Sat Jan 9 19:00:13 2021
  • End Time: Sat Jan 9 19:55:12 2021
  • XFER Time : 0 Hour(s), 28 Minute(s), 41 Second(s)
  • Bytes Written : 225.59 GB
  • Protected Counts: 573053
  • Failed Counts: 1
  • Failed Objects (attached) : See attachment -> BackupFailed ( Job Id - 35194 ).txt
 

 The attachment contains :

ubco-sjcip03 Failed Unable to quiesce guest file system during snapshot creation
Edited to add that the attachment is named “BackupFailed (Job ID - #####).txt”
Hope that this makes things a bit clearer!
Cheers,
Ron
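
For anyone post-processing the old-style “BackupFailed ( Job Id - ##### ).txt” attachment, a line such as the one quoted above can be split into vm name, status and reason. This is a minimal Python sketch under stated assumptions: the real delimiter in the attachment is not known from this thread, so the code assumes whitespace-separated fields, a vm name without spaces and a one-word status. `parse_failed_objects` is a hypothetical helper name.

```python
def parse_failed_objects(text):
    """Parse failed-object lines into (vm_name, status, reason) tuples.

    Assumption: fields are whitespace-separated, the VM name and the
    status contain no spaces, and the rest of the line is the reason.
    """
    records = []
    for line in text.splitlines():
        parts = line.split(None, 2)  # at most three fields
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        reason = parts[2] if len(parts) > 2 else ""
        records.append((parts[0], parts[1], reason))
    return records


# The attachment content quoted in the post above.
attachment = (
    "ubco-sjcip03 Failed Unable to quiesce guest file system "
    "during snapshot creation\n"
)
records = parse_failed_objects(attachment)
for name, status, reason in records:
    print(f"{name}: {status} ({reason})")
```

If the attachment turns out to be tab-delimited, `line.split("\t")` would be the safer choice, since failure reasons and vm names may then contain spaces.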

 


I couldn’t ask for a better reply and write up!

I’ll pass this up and see what we can do for you.


@bRonDoh , following up that there is some conversation here.  I’ll keep you posted.


Hey @bRonDoh !  I heard back from dev and they cannot reproduce.

They asked if you could open a support case (share the case with me) and get a “customer DB with alert configured which was working in SP20, run a sample backup on 22 with failure to reproduce issue and get us logs?”

Can you create a case (refer to this thread) and provide the above db/logs?


Hey @bRonDoh , following up to see if you had a chance to create an incident with the dev requested items?


Hey @bRonDoh , I checked your cases and am not seeing anything, yet.  Let me know the incident number once you open a case and I’ll get it handled accordingly.

Thanks!


Following up on this @bRonDoh .  Let me know if this is still an issue, and if you were able to get a case opened for the issue so I can follow up accordingly.

Thanks!!

@Mike Struening
I’m working on something similar. Should have news tomorrow morning, UK time.


Thanks @Jacek Piechucki !!


Hi guys,

 

For this to work we need to create the Alert from the main Alert Management wizard and associate it with the VM pseudoclients. The procedure is as follows:

 

  • Ensure VMs are showing within the “Client Computer” List - Right Click within “Client Computers” and select “Customize View”. Then tick the option to “Show Virtual Machines”
  • Open the Alert Management and create a new Alert.
  • Within its associations, tick the VMs you would like to get Alerts for
  • Under “Threshold and Notification Criteria” tick the option “Send Individual notification for each backed up Discovered Virtual Machine”

You should start receiving alerts for the individual VMs, including the VM details related tokens.

 

This information is also available in the documentation

https://documentation.commvault.com/commvault/v11/article?p=5300.htm

 

We had a case created for this and got successful results by following the steps above.
