
How Extended Retention affects DDB Size

  • July 16, 2021
  • 5 replies
  • 439 views


Following best practices is important for every aspect of our product, but it is especially important with regard to the Deduplication Database.  A well-specced Media Agent that meets or exceeds our hardware recommendations, hosting a properly configured DDB, can achieve impressive space savings and backup performance.  Conversely, without proper planning and optimal configuration, both performance and storage usage can suffer.


Since the inception of deduplication in Commvault, one of these Best Practices has always been that Extended Retention should not be used on a DDB serving a Primary Copy.  To prevent a DDB from becoming too ‘bloated’, our recommendation has been to segregate long-term retention into its own DDB, leaving only Basic Retention on your Primary Copy.  Although it may seem obvious that longer retention means a larger DDB, the true reason for this may not be immediately apparent.


Prior to DDB V5 (or DDB V4 Gen 2, depending on who you ask), Secondary Table files contained 16 Archive Files (which we will call ‘afiles’ for short).  In many cases these afiles are affiliated with different Job IDs.  For any file in the Secondary Table to become obsolete and be deleted, ALL 16 of its afiles must be eligible for deletion.
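
To make that rule concrete, here is a minimal sketch in plain Python (the class and the retention numbers are hypothetical illustrations, not Commvault code): a V4-style secondary file holding a mix of 30-day and 365-day afiles stays on disk until the very last afile ages out.

```python
from dataclasses import dataclass

@dataclass
class SecondaryFile:
    """One V4-style secondary table file, grouping 16 afiles."""
    afile_age_out_day: list  # day on which each afile's retention expires

    def is_prunable(self, today: int) -> bool:
        # The whole file can only be deleted once EVERY afile inside it has aged out.
        return all(today >= d for d in self.afile_age_out_day)

# 15 afiles on 30-day basic retention plus 1 afile kept 365 days (extended)
mixed = SecondaryFile([30] * 15 + [365])
print(mixed.is_prunable(60))   # False: one long-lived afile pins all 16
print(mixed.is_prunable(400))  # True: everything has aged out
```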


A Primary Copy using only short-term retention creates and subsequently deletes Secondary Table files at regular intervals, which keeps the DDB lean and fast.  What happens if we mix short-term and long-term retention?  With Extended Retention in the picture, some of those afiles are kept for a much longer period of time, preventing their Secondary Table files from being deleted.  More and more secondary files are created as new Primary backups are written, yet those files are deleted very infrequently due to the long-term retention.  The result is a DDB that keeps growing in size, consuming expensive SSD disk space and eventually slowing down performance.
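
As a back-of-the-envelope illustration of that bloat (a sketch with made-up numbers, not the real pruning algorithm), compare how the on-disk secondary file count evolves when every file is pinned by one extended-retention afile versus basic retention only:

```python
# Hypothetical model: one secondary file is created per day. With basic
# retention only, each file fully ages out after BASIC_DAYS, so the count
# plateaus. If each file also holds one extended-retention afile, nothing
# can be deleted until that afile ages, so the count just keeps climbing.
BASIC_DAYS, HORIZON = 30, 365

basic_only, mixed = 0, 0
for day in range(HORIZON):
    basic_only += 1
    mixed += 1
    if day >= BASIC_DAYS:
        basic_only -= 1  # oldest basic-only file is fully aged: delete it
        # the mixed file from that day stays, pinned by its extended afile

print("basic-only secondary files:", basic_only)  # plateaus at 30
print("mixed-retention secondary files:", mixed)  # 365 and still growing
```

And because V4 groups 16 afiles per file, even a small fraction of extended-retention jobs can pin a large share of the Secondary Table.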


The above scenario no longer applies in an environment running DDB Version 5, because we’ve changed the way the Secondary Table is structured.  Whereas in older versions each file in the Secondary Table contained 16 afiles (which may correlate to multiple jobs), in DDB V5 one file = one afile.  This seemingly small change has a massive impact here, and it allows short-term and long-term retention to co-exist on a single DDB without resulting in unnecessary bloat.  As short-term retention jobs meet their retention and age off, the corresponding files in the DDB can be cleaned up immediately once they are no longer needed.
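
Revisiting the earlier sketch under the V5 one-to-one layout shows why the bloat disappears (same hypothetical numbers as before): the 30-day files age out on schedule, and only the extended-retention file remains.

```python
# In the V5 layout each secondary file maps to exactly one afile, so pruning
# decisions are made per afile rather than per group of 16.
files = [30] * 15 + [365]  # one age-out day per file, since 1 file = 1 afile
today = 60
remaining = [d for d in files if today < d]
print("secondary files left on day", today, ":", len(remaining))  # 1
```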


If you are interested in learning more about the advancements made in DDB V5, I urge you to watch this incredible video created by our Training Department, which goes into much greater detail about our latest DDB platform:


https://commvaultondemand.atlassian.net/wiki/spaces/ODLL/pages/501350567/Tech+Talks?preview=/501350567/501022936/DDB%20V4%20Gen%202.mp4#TechTalks-DeduplicationGeneration4Version2


Have an older Version 4 DDB and want to leverage these newer features without sealing and starting over?  No problem!  We’ve created a Workflow that can be used to upgrade your existing DDB:


https://documentation.commvault.com/11.24/expert/134345_performing_upgrade_of_deduplication_database_from_version_4_to_version_5.html


Have some additional questions?  Reach out to us here on the Community and we will be glad to help!

5 replies

Henke
Novice
  • September 1, 2021

@Brian Bruno Thanks for the useful information.


//Henke


  • Novice
  • August 24, 2022

I’ve perused our CommCell, Command Center, existing reporting, and the DDB-related reporting available via cloud.commvault.com download & import. I was not able to determine whether any of our DDBs were still V4 until downloading, importing, and executing the workflow “ConvertDDBToV5”, which converts a standard DDB (V4) to a DDB with garbage collection (V5). After executing this workflow you’ll be prompted to choose an option: Planning or Execution. To review your DDB versions and confirm whether any are eligible for an upgrade, select Planning. If you have no V4 DDBs, the workflow will return “There are no Deduplication Engines requiring upgrade”. If you do have DDB engines eligible for upgrade, they will be identified. You’ll then need to plan a maintenance window and rerun the workflow, choosing Execution.


Mike Struening
Vaulter

All good, I removed the shorter one.

Thanks for the contribution.  Valuable info!


  • Novice
  • August 24, 2022

Many thanks, Mike! Feel free to nuke my apology too 😁.

Best,

Kyle


Mike Struening
Vaulter

Sure thing 🤣