
I’ve been discussing an issue with a Commvault user. They do not use Commvault deduplication for their backups because they are writing to an array that does its own deduplication. They are now looking to create a Secondary Copy of their data to replicate to Scality Ring. However, my concern is that they are going to run into the same problem they did when they tested Air Gap Protect - the secondary copy inherits the primary copy’s compression settings (not enabled), so the deduplication performance is not where we would anticipate it to be.

If that’s the case, is there any other feature that would allow them to write both a non-deduplicated copy (primary) and a deduplicated copy (secondary) with different compression settings? The documentation makes it look like you can change the settings for a Storage Policy Copy, but I believe it is only referring to the Primary Storage Policy Copy, since the top-level page explicitly states that the compression settings for Aux Copies adhere to the Primary Copy’s settings.

Is it possible to script this somehow via a Workflow? Alternatively, creating a Secondary Copy with the Source Copy set to “Primary Snap” would also work, since the source data is being protected via IntelliSnap.

Any thoughts or insight are greatly appreciated.

Hello @BSircy 

Thanks for the great question. With Copy 1 being non-dedup and Copy 2 being dedup, the main issue you are going to have is that you cannot use the DASH Copy feature. Without it, all of the data will be read and copied over before being broken down into signatures and deduplicated.

Regarding that: if the data was compressed by Commvault before being written (as is the default) and the storage then deduplicated it, the process should work in reverse. Commvault should not attempt to deduplicate already-compressed data except in very specific cases, so apart from the high read and network usage you will have in this configuration, you should not see extremely low dedup.

Kind regards
Albert Williams


Hi @BSircy ,

To understand this better, what is the concern with compression for a non-dedup primary and a dedup secondary? Assuming both copies have compression enabled, is the concern that it could affect hardware dedup on the primary, or is the question whether it will affect the dedup ratio of the secondary?

Also, the compression settings can be modified on the copies.

The procedure to modify software compression on a storage policy copy depends on the deduplication settings of the copy.
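If it helps to check or drive this outside the GUI, here is a rough Python sketch using Commvault’s open-source cvpysdk. The hostname, credentials, policy name and copy name are placeholders, and I have not verified that the SDK exposes the copy’s compression flag as a simple attribute, so treat the lookup at the end as an assumption and double-check the method names against the SDK before relying on it.

# Rough sketch only - verify method/attribute names against cvpysdk
# (https://github.com/Commvault/cvpysdk) before using.
from cvpysdk.commcell import Commcell

# Placeholder connection details for this example
commcell = Commcell('webconsole.example.com', 'admin', 'password')

# Placeholder policy name - list the copies defined on it
policy = commcell.storage_policies.get('FS_StoragePolicy')
for copy_name in policy.copies:
    print(copy_name)

# Placeholder secondary copy name; the 'software_compression' attribute is an
# assumption on my part - if the SDK does not expose it, the same setting could
# be read or changed through a qoperation/XML or REST request instead.
secondary = policy.get_copy('Scality_Copy')
print(getattr(secondary, 'software_compression', 'attribute not exposed'))

Something along these lines could also be wrapped in a Workflow step that runs right after the secondary copy is created, which would cover the “scripted via Workflow” question above.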


@Albert Williams @Abhishek Narulkar I appreciate your replies, and apologies for the delay; I wanted to make sure I had all the updated information before moving forward.

The reason I mention the data not being compressed on the secondary copy (and why it is a concern) is that this was the finding from Commvault about a year ago (July 2023) when we were attempting to implement Air Gap Protect in the customer’s environment. The amount of data being replicated to the cloud was much larger than anticipated (2-3x the size, if memory serves), and this was because the secondary copy could not have compression enabled if the primary copy did not use compression, regardless of what the setting shows. If you are able to see case notes, you’ll find some of this discussed in Case 230622-355.

They use Catalyst with HPE StoreOnce to write their data on the primary side. What they would like to do is use Scality Ring as a replacement for their tape library, but it looks like we are running into a similar issue. They are currently testing it, and the size of the data on Scality is growing rapidly every day. For their testing they have the following:

  1. Dedicated pool on Scality Ring configured via S3 to Commvault.
  2. The source data is file system data that is protected via IntelliSnap, Backup Copied to HPE StoreOnce with Catalyst libraries, and then a secondary copy goes to the Scality Ring using Commvault deduplication.
  3. The daily change rate on this data is less than 200 GB.
  4. They have Object Lock configured on the Storage Pool with 14-day retention, a new DDB created every 14 days, and the WORM lock set at 28 days.

As you can see below, the data on Scality is essentially growing by close to the full source data size every day, even though the DCR is less than 200 GB. A new job was not run on 05/10, so ignore that day. Day 1 is around 17 TB, by Day 3 it has grown to 26 TB, Day 4 is 40 TB, etc.

[Screenshot: daily growth of data written to the Scality Ring storage pool]
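For reference, a quick back-of-the-envelope on those figures (a rough sketch only; the numbers are the ones quoted above, and the “expected” line simply assumes growth at the stated change rate against an existing baseline):

# Back-of-the-envelope: observed growth on Scality vs. what a <200 GB/day
# change rate would suggest. Figures (TB) are the ones quoted above.
observed_tb = {1: 17, 3: 26, 4: 40}   # day -> total data on Scality (TB)
daily_change_tb = 0.2                 # "DCR is less than 200 GB"

baseline_tb = observed_tb[1]
for day in (3, 4):
    expected = baseline_tb + daily_change_tb * (day - 1)
    print(f"Day {day}: expected ~{expected:.1f} TB, observed {observed_tb[day]} TB")

# Day 3: expected ~17.4 TB, observed 26 TB
# Day 4: expected ~17.6 TB, observed 40 TB
# i.e. each day adds something close to another full baseline, which is what
# you would expect to see if dedup/compression were not taking effect on the copy.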

I know there are a lot of settings to take into account here - compression, encryption, deduplication, on both the hardware and software side - but working with Commvault support and development last time, when using AGP, led to the conclusion that data would not be compressed on the secondary copy since the primary copy was not using software-based compression, leading to low deduplication rates and a high volume of data written daily.

Any insight you can provide would be great.

