Question

Recovery from cloud storage: read optimizations

  • 5 January 2024
  • 10 replies
  • 240 views

Badge +7

Hello,

Could you please tell me: I have cloud storage in Azure, accessed by a local MediaAgent (MA) with a local DDB.
Are there any optimizations during a direct restore from the cloud? I mean, if a 10 TB application consumes only 3 TB of cloud storage thanks to deduplication, how much data will be read during a restore?

Thanks!


10 replies

Userlevel 5
Badge +12

Hello @AndresL 

Thanks for the great question! 


The first rule of Commvault to remember is that the DDB is not used at all for restores, only for backups. All you need to restore data is the CommServe and the library. This means that if you have a server in the cloud and give it read access to the library, it will be able to perform the restore.

 

When it comes to recovering deduplicated data, you need to look at the application size: that is, in the worst case, the amount of data that will be read. When you see the data written being much lower than the application size, a good way to think about it is that the rest of the data was already written under a different job. So the 10 TB of application size is in the cloud, but only 3 TB was written by this job; the other 7 TB was written by other jobs.

This is a very simple way to think about it, and other factors such as compression will also come into play, but when planning a restore you want to assume the worst case: if the application size is 10 TB, that is the amount of data that will be read and written during the restore. It has to come from somewhere :D
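To put numbers on it, here is a quick back-of-the-envelope sketch (plain Python arithmetic using the figures from the question, nothing Commvault-specific):

```python
# Hypothetical numbers from the example above: deduplication reduces what
# is *written* during backup, but a restore still *reads* the full
# application size in the worst case.
app_size_tb = 10.0          # logical application size
written_this_job_tb = 3.0   # unique data this job wrote after dedup

# Data that already existed in cloud storage, written by earlier jobs
written_by_other_jobs_tb = app_size_tb - written_this_job_tb

# Worst-case read volume during the restore: the full application size,
# pulled from this job's chunks plus chunks written by earlier jobs
restore_read_tb = app_size_tb

print(written_by_other_jobs_tb)  # 7.0
print(restore_read_tb)           # 10.0
```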

Hope this answers the question and helps you get the restore to run faster!

Kind regards

Albert Williams

Badge +7

Hello Albert,

Could you please describe the technical part: how does Commvault perform a restore without the DDB?
I mean, when the DDB is lost and you perform a restore, how does Commvault gather together all the blocks required to restore a particular job?

Userlevel 5
Badge +14

Hello @AndresL 

The CommServe database keeps track of which volumes, chunks, and files are associated with a backup job.

The DDB is only a CTree database of hash records. It uses a Primary Table to track unique hash signatures and a Secondary Table to manage references to the Primary Table.

When you submit a restore, the CommServe sends the list of required chunks to the MediaAgent, which restores them from the cloud storage.
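As a rough mental model of why the DDB is not consulted, here is a toy sketch (hypothetical names and plain Python dicts, not Commvault's actual schema): the CommServe's job-to-chunk map is all a restore needs.

```python
# Hypothetical illustration: the CommServe DB maps each backup job to its
# chunks, so a restore never needs to look at the DDB's hash tables.
commserve_db = {
    "job_1001": ["chunk_A", "chunk_B"],
    "job_1002": ["chunk_B", "chunk_C"],  # chunk_B is shared via dedup
}

cloud_library = {
    "chunk_A": b"data-a",
    "chunk_B": b"data-b",
    "chunk_C": b"data-c",
}

def restore(job_id):
    """Collect a job's data using only the CommServe job->chunk map."""
    return b"".join(cloud_library[c] for c in commserve_db[job_id])

print(restore("job_1002"))  # b'data-bdata-c'
```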

 

Thank you,
Collin

Badge +7


Hello Collin,

Could you please provide links to a Commvault KB article that describes the technical details of the restore process for the case when the DDB is lost?

Br,
Andrejs

Userlevel 7
Badge +19

I do not think you will be able to find anything about this online anymore. It is too detailed and in the past was only explained in the training courses. I was able to capture this from the Master course from some years ago.

 

Userlevel 5
Badge +12

Hello @AndresL 

 

The following is a high-level summary that helps explain why the DDB is not required during a restore.


When we write data into a chunk, we do one of two things: either write the new data that was submitted, or add an entry to the chunk's metadata file that references a different chunk which already contains the data that was not written in this chunk.

 

To perform a restore, the CS knows which chunks are associated with which backup job. When the restore collects data from a chunk, it also checks the chunk's metadata file and follows those references to eventually collect all the information required for the restore. This is why, as long as your CS and library are intact, we can restore data.
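A toy sketch of that reference-following idea (hypothetical structures, not the real on-disk chunk format): each chunk holds the blocks it wrote plus metadata entries pointing at chunks that already hold deduplicated blocks.

```python
# Hypothetical chunk layout: "data" holds blocks written in this chunk,
# "refs" are metadata entries pointing at the chunk that holds the block.
chunks = {
    "chunk_1": {"data": {"blk_a": b"A", "blk_b": b"B"}, "refs": {}},
    "chunk_2": {"data": {"blk_c": b"C"},
                # blk_a was deduplicated: metadata points back to chunk_1
                "refs": {"blk_a": "chunk_1"}},
}

def read_block(chunk_id, block_id):
    """Follow metadata references until the chunk holding the data is found."""
    chunk = chunks[chunk_id]
    if block_id in chunk["data"]:
        return chunk["data"][block_id]
    return read_block(chunk["refs"][block_id], block_id)

# Restoring a job backed by chunk_2 pulls blk_a from chunk_1 transparently.
print(read_block("chunk_2", "blk_a"))  # b'A'
```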


I hope this has helped you understand what is going on under the hood.
It is worth noting that the one-line answer to your question still stands: expect the application size to be read in the worst case and plan for that.

 

Kind regards

Albert Williams

 

Badge +7

Hi all,

Thank you for your replies.
My intention is to understand Commvault's behavior in the case where the DDB is lost and the data exists only in Azure. The main concern is the amount of data that will be read, because Azure charges significantly for egress internet traffic, and reading the whole library is also time-consuming.
For example: we have a secondary copy in an Azure library, with the DDB on a local MediaAgent, and the total library size is 70 TB. We lose the primary copy together with the DDB and the MediaAgent. I need to restore a single job for a 10 TB application as soon as possible. Will it require DDB reconstruction? Will it require reading the whole library from Azure? How quickly will I be able to start the restore, and will it be possible before DDB reconstruction completes?

Thank you!

BR,
Andrejs

Userlevel 5
Badge +14

Hello @AndresL 

As previously stated, the DDB is not used during restores; it is only used for space saving during backups. If the DDB is in an invalid state, you will still be able to restore your data, assuming there are no other issues at play. When doing a restore, you will need to read the entire application size of the data you are attempting to restore.

I hope not to confuse you here, but deduplication does not reduce the application size of the backup; it only prevents us from writing duplicate data to storage, which would cause bloat and unnecessary space consumption. When we back up deduplicated data, we read the data in (usually) 128 KB blocks, and a hash signature is generated for each block. We look up that hash in the DDB: if it exists, we discard the data and add a pointer record; if not, we write the data to disk and add a Primary record. During restores, the CommServe keeps track of the files needed for the restore, so we do not have to scan the DDB and replay all of the links just to get the data back.
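A toy sketch of that write path (hypothetical names, with a plain dict standing in for the DDB's Primary Table and MD5 standing in for whatever hash Commvault actually uses; this is an illustration of the flow, not Commvault's implementation):

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # data is hashed in (usually) 128 KB blocks

def dedup_backup(stream, ddb, storage):
    """Toy dedup write path: hash each block, look it up in the DDB,
    write only blocks whose signature has not been seen before."""
    written = 0
    for offset in range(0, len(stream), BLOCK_SIZE):
        block = stream[offset:offset + BLOCK_SIZE]
        signature = hashlib.md5(block).hexdigest()
        if signature in ddb:
            ddb[signature] += 1        # duplicate: add a pointer record
        else:
            ddb[signature] = 1         # new: Primary record, write to disk
            storage[signature] = block
            written += len(block)
    return written

ddb, storage = {}, {}
data = b"x" * BLOCK_SIZE * 3 + b"y" * BLOCK_SIZE  # 3 duplicate blocks + 1
written = dedup_backup(data, ddb, storage)
print(written)  # 262144: only the 2 unique blocks were written
```

Note that the application size (4 blocks) is still read in full; only the write volume shrinks, which is exactly why a restore has to read the full application size back.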

Regarding reconstruction, the simple answer is yes. If the DDB is in an invalid state, the software will automatically start reconstruction jobs until it is recovered or sealed.

 

Thank you,
Collin

Badge +7

Hello,
Thank you for answer.
Regarding DDB reconstruction: normally Commvault should try to recover the DDB from its backup before trying to rebuild it. But I noticed that the DDB backup is not copied to the secondary copy in Azure; everything is copied except the DDB backup jobs. Is that by design?

Br,
Andrejs

Userlevel 5
Badge +14

@AndresL 

Yes, by default DDB backups are not aux copied; they need to be readily available for reconstruction.

 

Thank you,

Collin
