Solved

"Alert Data Verification Failure Detected" Alert

  • 19 October 2021
  • 14 replies
  • 325 views

Badge +13

I noticed a new alert was added to my CommServe and I believe this was after an upgrade we did recently to a new FR.

The alert in question is "Alert Data Verification Failure Detected", and it is briefly discussed at the link below.

https://documentation.commvault.com/11.20/expert/12395_data_verification_faq.html

A few things come to mind that are not completely clear to me:

  1. The alert says that it detected corrupted data on backup disk. It does not tell me the job ID or the path. Looking at the alert in the GUI, it also has no additional options to select, such as job ID or storage policy. How come?
  2. Job history for the subclient or storage policy shows everything green, with no issues on the jobs themselves. My understanding is that I could potentially have issues restoring that particular subclient up to a specific transaction log. Why would job history still show the job as successfully completed?
  3. The alert gives you two options: one to convert the subclient to full during the next backup, and one to automatically convert all verification failures to full. The link points to a workflow to be executed, but when I search the Commvault Store under workflows I can't find anything related. What are the two workflow names, and how can I find them?
  4. Data Verification is disabled on my storage policy. Why would I get this alert when the option is disabled?
  5. Was this alert introduced in a recent FR? Which one specifically?

Appreciate the time.


Best answer by Mike Struening 2 November 2021, 21:32


14 replies

Userlevel 7
Badge +23

Appreciate the post, @dude!  I agree, without the Job ID it’s not exactly useful (without some digging work).

Let me discuss internally and get some of our team to respond here.

Thanks!

Badge

We experienced the same here after upgrading from FR 11.20 to 11.24. A few backup jobs that use synthetic full backups are affected. We decided to run a full backup for these jobs.

Userlevel 1
Badge +3

Hello @dude,

As far as I know, this alert has been around since at least FR20. Do you still have the system-created schedule for DDB verification running? I believe the alert triggers based on that. The two workflows do not need to be downloaded from the Store. They are “Toggle Automatic Conversion To Full Backup” and “Mark Selected Subclients To Run Full Backup”, but they are hidden in the GUI.

Badge +13

I do see two system-created jobs, though the verification one is disabled and has been for a while, as far as I know.

I still do not see the logic behind the alert, and my questions remain open.

Userlevel 7
Badge +23

Hi @dude, based on what @Tim H shared, and the context, I’ll start with your 5 bullets:

  1. The alert says that it detected corrupted data on backup disk. It does not tell me the job ID or the path. Looking at the alert in the GUI, it also has no additional options to select, such as job ID or storage policy. How come? - This is likely due to dedupe: several jobs would be affected, and both the list and the time to pull it would be lengthy.
  2. Job history for the subclient or storage policy shows everything green, with no issues on the jobs themselves. My understanding is that I could potentially have issues restoring that particular subclient up to a specific transaction log. Why would job history still show the job as successfully completed? - This sounds like a good CMR: go back and flag these jobs visually somehow.
  3. The alert gives you two options: one to convert the subclient to full during the next backup, and one to automatically convert all verification failures to full. The link points to a workflow to be executed, but when I search the Commvault Store under workflows I can't find anything related. What are the two workflow names, and how can I find them? - Similarly, this should be a CMR to update the alert.
  4. Data Verification is disabled on my storage policy. Why would I get this alert when the option is disabled? - The alert is clearly not checking for that, which should be a CMR as well. That sounds like an easy check to make, or at least we could add context to the alert.
  5. Was this alert introduced in a recent FR? Which one specifically? - IIRC, 11.22.

That leaves 3 improvements:

  1. Can we visually mark completed jobs impacted by the corrupted files?
  2. Update the alert to contain the workflow names
  3. If DV is disabled, then either don’t show the alert or add context

Does that cover your thoughts?

Badge +13

@Mike Struening honestly, I do not see how this is a "solution". It seems to me that the alert was not thought out at all. Like I said, it reports a few things that the alert itself can't explain, and I can't find the info through reports or find the workflows. This is pretty inefficient from my perspective, and it leaves the customer with a lot of questions.

Anyways looks like this is it for now. Thanks for reviewing this. Hope to see improvements in the future.

Userlevel 7
Badge +23

I agree with you 100%.  My hope is that I capture your concerns, bring them to our Alerts development team, and get your changes considered and hopefully implemented.

I just want to be sure I capture your concerns and ideas accurately so I capture everything for my message to them!

Let me know if I missed anything:

  1. Can we visually mark completed jobs impacted by the corrupted files?
  2. Update the alert to contain the workflow names
  3. If DV is disabled, then either don’t show the alert or add context

Badge +13

Looks good. The only thing I'd add to item 2: make the workflow visible, since today it is apparently hidden.

Thank you - 

Userlevel 7
Badge +23

OK, great!  I’ll do that.

Userlevel 7
Badge +23

Hi @dude, I have 2 answers for you (I'm asking dev for better details on the visual identifier request):

  1. Update the alert to contain the workflow names and make them visible

[dev] Why are you looking for the workflow names and their availability on the Store? The link itself should take you to the workflow and run it.

  2. If DV is disabled, then either don’t show the alert or add context

[dev] The jobs are marked as verification failed by any read operation, i.e. Synth Full, aux copy, or restores. It is not just DV jobs. Even when DV is disabled, if any read operation detects data corruption, we mark those chunks/jobs as data verification failed, and the affected subclients are listed in the alert if the affected job(s) are part of the latest cycle. The alert helps protect the affected subclients by running a new Full on them.
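
To make the dev explanation above concrete, here is a minimal Python sketch of the logic as described in this thread. It is a toy model, not Commvault's implementation or API: the `Job` class, the `READ_OPERATIONS` set, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical names throughout; this only models the behaviour described
# by dev in this thread and does not call any Commvault API.
READ_OPERATIONS = {"data_verification", "synthetic_full", "aux_copy", "restore"}


@dataclass
class Job:
    job_id: int
    subclient: str
    in_latest_cycle: bool
    verification_failed: bool = False


def record_read(job, operation, corruption_detected):
    """Flag the job if any read operation (not only DV) hits corrupt data."""
    if operation not in READ_OPERATIONS:
        raise ValueError(f"unknown read operation: {operation}")
    if corruption_detected:
        job.verification_failed = True


def subclients_for_alert(jobs):
    """Return subclients that have a flagged job in the latest cycle."""
    return sorted(
        {j.subclient for j in jobs if j.verification_failed and j.in_latest_cycle}
    )


jobs = [
    Job(101, "sql_logs", in_latest_cycle=True),
    Job(87, "fileserver", in_latest_cycle=False),
    Job(102, "vm_group", in_latest_cycle=True),
]
record_read(jobs[0], "synthetic_full", corruption_detected=True)  # DV never ran
record_read(jobs[1], "aux_copy", corruption_detected=True)        # older cycle
record_read(jobs[2], "restore", corruption_detected=False)        # clean read
print(subclients_for_alert(jobs))  # ['sql_logs']
```

In this model a subclient shows up in the alert even though no DV job ever ran, which matches the point that disabling DV on the storage policy does not suppress the alert.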

For the Workflow link, can you confirm that the alert links you right to the WF itself?  You should have it already installed by default.

Thanks,

Userlevel 7
Badge +23

And now I have an answer on the visual request.

If you go to View>Jobs on copy, check the Data Verification status column.

Let me know if this works for your needs!

 

Badge +13

Hi @dude, I have 2 answers for you (I'm asking dev for better details on the visual identifier request):

  1. Update the alert to contain the workflow names and make them visible

[dev] Why are you looking for the workflow names and their availability on the Store? The link itself should take you to the workflow and run it.

[dude] So, say I deleted the alert email that was automatically sent to me, and I go back and want to run the workflow manually. How do I do it? As a CV admin, if there is a link in the alert pointing to a form/workflow that gives me the option to click and allow the conversion to full, why isn't the same form/workflow visible so the admin can run it whenever they want?

It does not seem very effective to me to give the admin only one place, the email, from which to run a full conversion.

  2. If DV is disabled, then either don’t show the alert or add context

[dev] The jobs are marked as verification failed by any read operation, i.e. Synth Full, aux copy, or restores. It is not just DV jobs. Even when DV is disabled, if any read operation detects data corruption, we mark those chunks/jobs as data verification failed, and the affected subclients are listed in the alert if the affected job(s) are part of the latest cycle. The alert helps protect the affected subclients by running a new Full on them.

[dude] If this is not only for DV jobs, then the link I shared above needs to be updated and the alert needs to be better explained to reflect what you are saying. “The data verification job flags the backup jobs with a 'verification failed'  

 

As for the screenshot you sent, I'm aware of that. Like I said, DV is disabled for the storage policy in question, but given the statement that this alert isn't only for DV jobs, it “sort of” makes sense. Again, the alert is named Data Verification, which is one of the main reasons for my previous questions, and now dev says that this isn't only for DV. Very confusing. It is DV, but not.

Userlevel 7
Badge +23

Appreciate your well-thought-out reply!  I’ll continue talking to our dev team.

Userlevel 7
Badge +23

I have some more:

  1. Regarding the alert name: the alert does not mention DV job status. It just says “Data Verification Failure”.  I think you hinted at this earlier in that it kind of makes sense.
  2. Regarding the workflow details, I’m going to work with the docs team to make this easy to find.  However, dev provided this idea: if the user wants to enable automatic conversion to Full on data verification failure, they can enable the option highlighted below at any time. If the email is deleted, the user will receive the alert email again within the next 24 hours if a new Full has not run yet.
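
The two behaviours dev describes in item 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and treating "within the next 24 hours" as a fixed 24-hour resend interval is my assumption, not documented Commvault behaviour.

```python
from datetime import datetime, timedelta

# Assumption: the alert is re-sent on a fixed 24-hour interval.
# The thread only says "within the next 24 hours".
RESEND_INTERVAL = timedelta(hours=24)


def alert_due(last_sent, new_full_completed, now):
    """The alert keeps coming back until a new Full has run."""
    if new_full_completed:
        return False
    return now - last_sent >= RESEND_INTERVAL


def next_backup_type(requested, verification_failed, auto_convert_to_full):
    """With automatic conversion enabled, a flagged subclient runs a Full next."""
    if verification_failed and auto_convert_to_full:
        return "full"
    return requested


sent = datetime(2021, 11, 1, 9, 0)
print(alert_due(sent, False, sent + timedelta(hours=25)))  # True
print(alert_due(sent, True, sent + timedelta(hours=25)))   # False
print(next_backup_type("incremental", True, True))         # full
print(next_backup_type("incremental", True, False))        # incremental
```

So even if the original email is deleted, either the repeated alert or the automatic conversion option should lead to a new Full on the affected subclient.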

Let me know if you have questions/thoughts/concerns.
