
Question: Why is it not recommended to use this setting if retention is above 90 days or extended retention is set? Would it not work if retention was 120 days, or if extended retention was set to 120 days and you set this option to 365? What is the danger when retention is higher than 90 days?

 

Background:

In this document: Deduplication FAQ (commvault.com), there is a statement about a setting under DDB → Properties → Settings called “Do not Deduplicate against objects older than n day(s)”.

Statement: “Set the Do not Deduplicate against objects older than n day(s) option to 4 times the retention days when the retention is below 90 days on the storage policy copy. If the retention is above 90 days or extended retention is set on the storage policy copy, then do not use this option.”
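
For concreteness, here is a minimal sketch of that sizing rule in Python. The function name and the return convention are my own illustration, not a Commvault API:

```python
from typing import Optional

def recommended_no_dedupe_age(retention_days: int,
                              extended_retention: bool = False) -> Optional[int]:
    """Apply the FAQ's sizing rule for the
    'Do not Deduplicate against objects older than n day(s)' setting.

    Returns the suggested value in days, or None when the FAQ says
    the option should be left disabled.
    """
    if extended_retention or retention_days > 90:
        return None               # FAQ: do not use the option in this case
    return 4 * retention_days     # FAQ: 4x the copy's retention

# Examples: 30-day retention -> 120; 90-day retention -> 360;
# 120-day retention, or any extended retention -> leave disabled.
```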

 

Also: it appears this setting can go up to 365 days (“365” is shown in the greyed-out field when the option is not checked).

Also: I have some old (read-only) mount points holding “referenced data” that is not cleaning up (after about a year of waiting). I saw this setting and wondered what would happen if I turned it on and set it to 365 days. I was hoping the referenced data on the old mount points would be deleted for good and I wouldn’t have to seal the DDBs.

@tigger2 

If the above setting is enabled, data will not be deduplicated against older objects, so reference creation to secondary blocks will be minimal, allowing jobs to age as soon as their retention is met.


Hello @tigger2,

 

This is a very interesting question and nice of you to bring it up!

 

We have this option enabled on all our deduplication databases, or at least the ones where the storage does not support “space reclamation after hole drilling”. To my knowledge only NTFS fully supports this to date.
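
For anyone unfamiliar with “hole drilling”: the filesystem primitive behind it is punching a hole in the middle of a file so the underlying blocks are freed while the file keeps its logical size. A minimal sketch using the Linux fallocate() call (assuming a filesystem such as ext4 or XFS that supports punch-hole; whether a given storage target actually honors the freed space is the support caveat above):

```python
import ctypes
import ctypes.util
import os

# Linux fallocate() flags (values from <linux/falloc.h>)
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02   # must be combined with KEEP_SIZE

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                           ctypes.c_int64, ctypes.c_int64]

def punch_hole(path: str, offset: int, length: int) -> None:
    """Deallocate a byte range inside a file without changing its size.

    The filesystem returns the underlying blocks to free space; reads of
    the punched range return zeros. This is the kind of primitive that
    frees space inside large dedup container files when blocks expire.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        if libc.fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                          offset, length) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(fd)
```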

Our base retention is 60 or 90 days, and we keep this option at 2 or 3 times that period (120 to 180 days).

 

When we started using this option there was no other way to free up blocks on storage if they kept references forever (like a certain Windows DLL ;)). Commvault has since introduced the space reclamation option on deduplication databases. This reads the storage block, shrinks it to size, and uploads the new file: essentially their own implementation of space reclamation after hole drilling. You can set the aggressiveness of this action based on the amount of perceived free space (20%, 40%, 60%, etc.). We tested this and got space back, but because we already use this option the gain was limited to 5-10% even on the most aggressive setting. Also keep in mind this feature costs money on storage types that charge for read actions and transfer, like AWS S3, Azure Blob, etc., and these charges can add up.
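
To put rough numbers on “these charges can add up”, here is a back-of-the-envelope estimate of the read side of one reclamation pass. The per-GET and per-GB rates are illustrative placeholders, not current AWS prices, so substitute your provider’s rate card:

```python
TB = 1024  # GB per TB

def reclamation_read_cost(data_read_tb: float,
                          object_size_mb: float = 64.0,
                          get_per_1000: float = 0.0004,  # assumed GET price
                          egress_per_gb: float = 0.09):  # assumed egress price
    """Estimate the read-side cost of one space-reclamation pass."""
    gb = data_read_tb * TB
    objects = gb * 1024 / object_size_mb        # number of GET requests
    request_cost = objects / 1000 * get_per_1000
    transfer_cost = gb * egress_per_gb          # only if data leaves the cloud;
    return request_cost, transfer_cost          # intra-region reads may be free

req, xfer = reclamation_read_cost(data_read_tb=50)
print(f"GET requests: ${req:,.2f}, egress: ${xfer:,.2f}")
# With these placeholder rates, reading 50 TB back out of the cloud is
# dominated by egress (~$4,600), which is why running reclamation only on
# self-owned storage (as described below) avoids the charge entirely.
```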

 

When we started using the “do not deduplicate” option with the above settings, we noticed a general overhead on our storage of 20 to 30%, depending on change rate over time, which we accept and logically expect. We run the “space reclamation” action only on storage we own ourselves, like locally attached or self-hosted cloud storage.

 

That is as far as my knowledge/background goes. If your storage supports space reclamation after hole drilling, I would only enable this option if you do not want to carry a certain unique block forward forever; otherwise there is no need in my book.

 

As to why Commvault discourages the option, I have no idea. We use it heavily and are very happy with it.


Hello @tigger2 

We recommend not using this option if the retention is above 90 days or extended retention is set on the storage policy copy, because with long and/or extended retention the blocks are expected to be kept longer. Normally we don’t prune a block until all subsequent jobs referencing that block have aged, even if the backup job that wrote the block has aged. This is because the block is part of the baseline data (the unique data written to storage which deduplication references to save space on subsequent backups).
This setting prevents bloating from blocks “sticking around forever” by letting old objects age out, after which new blocks are written on subsequent backups. If you are using longer or extended retention, bloating is expected anyway, and this setting would prune blocks and re-write the DDB baseline unnecessarily.
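
For readers who want the mechanism spelled out, here is a toy model of the reference and pruning behaviour described above. The class and function names are illustrative, not Commvault internals:

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class Block:
    """One unique deduplicated block in the store."""
    written_day: int
    refs: Set[str] = field(default_factory=set)   # jobs referencing it

def reference_or_rewrite(block: Block, day: int, job: str,
                         cutoff_days: Optional[int]) -> bool:
    """Model of a new backup encountering an existing block.

    Returns True if the job deduplicated against the block (added a
    reference), False if the cutoff forced a fresh copy to be written.
    """
    if cutoff_days is not None and day - block.written_day > cutoff_days:
        return False              # too old: do not deduplicate against it
    block.refs.add(job)
    return True

def prunable(block: Block, aged_jobs: Set[str]) -> bool:
    """A block prunes only when every job referencing it has aged."""
    return block.refs <= aged_jobs

# Without a cutoff, a block that every daily backup re-references (the
# 'certain windows dll' above) always has a live reference and never
# prunes. With cutoff_days set, backups past the cutoff stop adding
# references, so the block drains and ages out with its last old job.
```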

 

Thank you,
Collin

