Best practices around Data validation

  • 1 February 2021

Userlevel 1
Badge +5

Our storage policies are built as follows:

primary (StoreOnce)

sync copy 2 (StoreOnce)

sync copy 3 (StoreOnce)

copy 4 (tape)

 

Although each copy has a “validate jobs” option, I was wondering whether we could rely on the aux copy job report to catch data integrity issues instead, since running a separate validation job adds overhead to the backup infrastructure. I see three ways to reduce that overhead: define parameters to limit which data qualifies for validation; validate only the tape copy, since tape drives perform well when data is read serially (we do have “combine to streams” set, and I wonder whether this affects validation, as jobs could be split across multiple media); or use the aux copy reports and bypass validation jobs altogether. An aux copy reads all the chunks from the source copy, writes them to the target copy, and makes sure the indexes are also intact. So isn't an aux copy effectively a validation of the jobs?

Is anybody using job validation (storage policy copy properties → advanced)?

 


3 replies

Userlevel 2
Badge +6

Hi Ashutosh,

 

The tape copy can generally be considered a “validation” of your jobs, so how much data you have validated depends on whether it is a synchronous copy or a selective copy.

 

If the tape copy is a selective copy that only copies monthly or yearly jobs, then you may still want to validate one of the other StoreOnce disk copies.

 

With Commvault deduplication, validation jobs serve an additional purpose: they mark any bad blocks they detect so that the DDB will not reference those blocks again. If the DDB sees the same blocks in new backups, it writes them to disk again, allowing future jobs to reference the new copies.
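
To illustrate the idea, here is a minimal toy sketch in Python. It is not Commvault's DDB implementation; the class `ToyDedupStore` and its methods are hypothetical names made up for this example. It only shows the general pattern: validation flags a corrupt block, and the next backup that contains the same data writes a fresh copy instead of referencing the bad one.

```python
import hashlib


class ToyDedupStore:
    """Toy model of a deduplicating block store (hypothetical, not Commvault's DDB)."""

    def __init__(self):
        self.blocks = {}  # signature -> bytes currently on "disk"
        self.bad = set()  # signatures flagged as bad by validation

    def write(self, data: bytes) -> str:
        sig = hashlib.sha256(data).hexdigest()
        # Write the block if it is new, or if the stored copy was flagged bad.
        if sig not in self.blocks or sig in self.bad:
            self.blocks[sig] = data
            self.bad.discard(sig)
        return sig

    def validate(self) -> list:
        """Re-hash every stored block and flag those that no longer match."""
        newly_bad = []
        for sig, data in self.blocks.items():
            if sig not in self.bad and hashlib.sha256(data).hexdigest() != sig:
                self.bad.add(sig)
                newly_bad.append(sig)
        return newly_bad


store = ToyDedupStore()
sig = store.write(b"payload")
store.blocks[sig] = b"pay1oad"   # simulate on-disk corruption
flagged = store.validate()       # the corrupt block is flagged
store.write(b"payload")          # a new backup rewrites the block
```

Without the validation pass, the second `write` would have deduplicated against the corrupt copy, which is the risk described above.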

 

Since your disk copies are purely StoreOnce, I assume you are using StoreOnce deduplication? In that case there is not much difference between validation and a tape aux copy.

Userlevel 1
Badge +5

@Jordan thanks for your response. 

Also, this only validates the data physically. On a logical level we could leverage features like the virtual lab for VMs. For other data types like Exchange, SQL, and Oracle, that would mean restoring copies of the data and validating that they contain what is needed and that the DB can be accessed and used by the applications on top. Is anybody making use of workflows here?

Comparing with Veeam's ransomware features, I was wondering whether Commvault offers something similar:

Secure erase: Ability to mount backups and scan for viruses with new virus signatures.

Veeam Data Integration API: Can Data Cube be used to crack open backups and scan them for malware?

@Damian Andre Tagging you as well, as you mentioned that work is underway with regard to ransomware.

Userlevel 5
Badge +9

There are a bunch of layers here. Natively, the product checks source > destination with CRC checks, verify media options, etc. When we start talking about “will it work”, there are several methods we can leverage. App validation allows a VM to be staged, and custom scripts can be run to verify DB consistency and perform any other scriptable actions. Replication options with power-on can verify VM consistency, while file system or DB options can leverage localized verification for consistency.
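
As a rough illustration of the source > destination CRC idea (a generic sketch, not the product's actual code; `copy_with_crc` is a made-up helper), a copy operation can compare each chunk it reads against the checksum recorded at backup time and refuse to copy anything that does not match:

```python
import zlib


def copy_with_crc(chunks, recorded_crcs):
    """Copy chunks, checking each against the CRC recorded when it was written.

    Returns (copied_chunks, indexes_of_failed_chunks).
    """
    copied, failures = [], []
    for index, (chunk, expected) in enumerate(zip(chunks, recorded_crcs)):
        if zlib.crc32(chunk) != expected:
            failures.append(index)  # source chunk is damaged, report it
            continue
        copied.append(chunk)        # chunk verified, safe to write to the target
    return copied, failures


source = [b"chunk-a", b"chunk-b", b"chunk-c"]
crcs = [zlib.crc32(c) for c in source]  # checksums recorded at backup time
source[1] = b"chunk-X"                  # simulate corruption on the source copy
copied, failures = copy_with_crc(source, crcs)
```

This is why a copy job that reads every chunk acts as a physical check of the source. It says nothing about whether the restored application data is usable, which is the logical layer discussed above.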

 

Secure erase: Today we can remove threats from restore operations by browsing and deleting the item. In the last Virtual Connections, the security/ransomware lab ran you through this process. Mounting a backup and scanning it for viruses can be done via app validation. We are evolving this as I write this reply, and we should have some powerful tooling to take this to the next level relatively soon. Since this will be part of our platform, we can easily tap into all the downstream capabilities surrounding threats, meaning we can send actionable alerts, trigger restores, pass restore lists, isolate machines in vlabs for forensic investigation, etc.

 

Veeam Data Integration API: Hang tight on this bullet, we won't need Data Cube. (It's good to see folks talking about DC, as I use it all the time!)

 

 
