Is versioning (on an S3-compatible bucket/namespace) even relevant for Commvault backup of VM, SQL, and Oracle DB workloads to S3-compatible storage?
Hi
I created a backup job for a VM with deduplication enabled at the subclient level (set to “on the client”). I enabled versioning in the bucket configuration and took a FULL backup twice, but it didn’t create versions of the same backup.
Later, I ingested some files inside the VM and retook the FULL backup. Still, no versions were created at the bucket level.
I then disabled deduplication at the subclient level and retook the full VM backup; now it creates more folders in the bucket every time I initiate a backup job.
Questions:
Does Commvault even leverage the versioning feature of S3-compatible storage? If yes, how? If no, is it safe to say we can keep it disabled with S3 overwrite enabled?
How are more folders getting created when I disable deduplication and take multiple FULL backups of the same VM (no modification to the file system or anything)?
Does Commvault overwrite the existing backup chunk folders on the namespace if a backup is taken twice?
Hi Jass Gill,
Commvault does not support S3 versioning, meaning we don’t integrate with it and effectively ignore it in the background. So, S3 versioning should be disabled, as recommended in the Public Cloud Architecture Guide for Amazon Web Services; otherwise, with S3 versioning enabled, Commvault will delete all versions of an S3 object once the data reaches its expiry age.
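If it helps, here is a minimal sketch (not Commvault functionality, just a boto3 example with placeholder endpoint, credentials, and bucket name) of how you could check the versioning state of the library bucket and suspend it if it was ever enabled:

```python
import boto3

# Placeholder values for illustration; adjust to your S3-compatible endpoint.
s3 = boto3.client(
    "s3",
    endpoint_url="https://hcp.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

bucket = "commvault-library-bucket"  # placeholder bucket name

# An empty response means versioning was never enabled on the bucket.
state = s3.get_bucket_versioning(Bucket=bucket).get("Status", "Never enabled")
print(f"Versioning state for {bucket}: {state}")

# Once enabled, versioning cannot be turned off again, only suspended.
if state == "Enabled":
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Suspended"},
    )
    print("Versioning suspended; object versions created so far are kept.")
```

Note that suspending versioning stops new versions from being created but does not remove versions that already exist.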
The additional folders for an un-deduped second full are created because the data is physically sent to the disk library again, while deduplicated data is not re-written. So, if the unique blocks of the first full backup are already on the disk library, a second deduplicated full backup will copy only new blocks. The files you copied to the VM probably contained only blocks that were already on the disk library, so there were no new unique blocks to back up during the second deduplicated full backup.
Commvault won’t overwrite any chunk that was written to the disk library, no matter whether the backup runs with or without deduplication. The only process that modifies or deletes chunks on the disk library is the pruning process that kicks in as part of Data Aging, which happens once data has passed retention.
Hope that answers your questions.
Thank you, yes, that answers my question. To conclude, it also doesn’t matter whether we keep the S3 overwrite feature enabled or disabled (if versioning is disabled on HCP). Commvault will write new chunks to the HCP S3 storage even for a repeat backup of the same VM. Correct?
As I said: with Commvault deduplication enabled, Commvault will only write new chunks if there are new blocks found during backup that were not yet written to the disk library, regardless of where on that disk library they are located. These new blocks are written as new chunks. So, from a Commvault perspective, there is no scenario that causes existing chunks to be overwritten. Your assumption is only true for backups that are created without Commvault deduplication, since these always write all the blocks.
Gotcha, thank you. Yes, that was asked with regard to backups without deduplication in Commvault.
Hi @MarkusBaumhardt,
Does all of this hold true for Commvault version 11.32 as well?
Hi Jass Gill,
yes, this applies to 11.32 as well.
@MarkusBaumhardt
When we enable the Commvault Storage WORM Lock feature for backups to our S3-compatible HCP / HCP-CS, we have observed that S3 Object Lock is mandatory on the bucket.
When we enable S3 Object Lock, versioning is always enabled by default. According to the snippet in the documentation shared above, a lifecycle policy needs to be configured when versioning is enabled. What exactly would the lifecycle policy be required for? Because we observed that our data is getting deleted according to the retention expiry even without a lifecycle policy.
Hi Jass Gill,
sorry for the late response, I was on annual leave and just returned.
As we don’t integrate with S3 versioning, we basically ignore it, so we are not able to consider different versions of files during data aging and/or pruning on the storage. We simply delete the data, assuming there is no versioning present on the storage. What exactly happens on the storage when we send the delete request is unknown to us. The advice to use Amazon S3 lifecycle policies with S3 versioning comes from the vendor, but as Commvault is unaware of S3 versioning, we cannot say why this is needed or how the behaviour on the storage changes. I’d suggest checking with the storage vendor for details.
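In case it helps to picture what the vendor guidance is referring to: on AWS S3, such a lifecycle rule typically expires noncurrent object versions and cleans up expired delete markers. A rough sketch with boto3 (placeholder bucket and endpoint; whether HCP / HCP-CS honours these rule elements should be confirmed with the storage vendor):

```python
import boto3

s3 = boto3.client("s3", endpoint_url="https://hcp.example.com")  # placeholder endpoint

# AWS-style lifecycle rule for a versioned bucket: permanently remove noncurrent
# versions shortly after they become noncurrent, and drop delete markers once no
# other versions of the object remain.
s3.put_bucket_lifecycle_configuration(
    Bucket="commvault-library-bucket",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "purge-noncurrent-versions-and-delete-markers",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to the whole bucket
                "NoncurrentVersionExpiration": {"NoncurrentDays": 1},
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            }
        ]
    },
)
```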
Hi. New user here, but am familiar with S3. Not with CV backups though.
When object versioning is enabled and you or CV deletes a file, the file is not actually deleted. A deletion marker is placed “on top” of the file to indicate that it has been deleted.
When you, or CV, ask for this file to be returned, S3 will return the latest “version” of the file, in this case the deletion marker, indicating that the file has been deleted.
It has not been deleted; it just looks like it has. CV will request that the file be deleted, but S3 will only place a deletion marker on it. Check your S3 usage, it won’t go down.
What the lifecycle policy requirement means is that, because CV can’t detect file versions, a third-party operation has to occur to delete the multiple versions of the file. This might be done by the S3 vendor, or by a script that you can write. I wrote a script. Don’t forget to delete not only the file versions in question but also the deletion marker.
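Not the script mentioned above, but a minimal sketch of what such a cleanup could look like (boto3 against a placeholder endpoint and bucket; it permanently removes old versions and delete markers, so test it carefully before pointing it at real data):

```python
import boto3

# Placeholder values for illustration; point these at your own S3-compatible storage.
s3 = boto3.client("s3", endpoint_url="https://hcp.example.com")
bucket = "commvault-library-bucket"
prefix = ""  # optionally restrict the cleanup to a folder/prefix

paginator = s3.get_paginator("list_object_versions")

for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    # Noncurrent versions left behind after delete requests or overwrites.
    stale = [
        {"Key": v["Key"], "VersionId": v["VersionId"]}
        for v in page.get("Versions", [])
        if not v["IsLatest"]
    ]
    # Delete markers sitting "on top" of objects that were asked to be deleted.
    markers = [
        {"Key": m["Key"], "VersionId": m["VersionId"]}
        for m in page.get("DeleteMarkers", [])
    ]
    batch = stale + markers
    # delete_objects accepts at most 1,000 keys per call.
    for i in range(0, len(batch), 1000):
        s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": batch[i : i + 1000], "Quiet": True},
        )
    if batch:
        print(f"Removed {len(stale)} old versions and {len(markers)} delete markers")
```

Keep in mind that on a WORM / Object Lock bucket like the HCP-CS setup above, versions still under a retention period cannot be removed, so a cleanup like this (or a lifecycle rule) only frees space for data that has already passed retention.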
@newCVuser99,
that’s good information, thanks for enlightening us here!