
Hello, 

I am looking for the best solution for an immutable copy of data built on Commvault in Azure. We already have a configuration in place, but I would like to see what the Commvault Community can propose. I am considering two approaches: 

  1. The first copy of the data stays local. The Azure copy acts as the DR site, with dedicated immutable storage in Azure per (local) site. 
  2. The first copy of the data stays local. The Azure copy acts as the DR site, with shared immutable storage in Azure for all (local) sites.

For both approaches I would like to understand the cost, and how to estimate the reduction in storage usage between dedicated and shared storage. 

What would be the best storage configuration for this solution: deduplicated or non-deduplicated? Retention for the immutable copy is 14 days and 2 cycles. 

Which storage tier (Hot or Cool) would be best for the immutable solution?

And, probably most important: how can I calculate the cost of these solutions, given that today I only have a local backup solution? 

Regards, 

Michal 

Hello Michal,

 

Your question is quite advanced and touches areas of Commvault that are under constant improvement and tend to change somewhat every 6 months.

 

To answer your question regarding Hot or Cool: the tier does not have a huge impact on immutability itself; it is more a question of how much data you read back (restores, aux copies, data verification, space reclamation). Commvault reduces API requests if you select Cool, but if you then perform a lot of data read operations, the savings might not outweigh the Hot tier.
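
To make that trade-off concrete, here is a minimal sketch of the kind of comparison I mean. Every price and operation count below is a placeholder, not a real Azure rate; substitute the current per-GB and per-transaction prices for your region and redundancy level (and note that Cool also adds per-GB retrieval charges that this sketch ignores).

```python
# Rough monthly cost comparison of Hot vs Cool for a backup copy.
# All numbers are assumptions -- substitute actual Azure prices for
# your region/redundancy and your own workload figures.

def monthly_cost(stored_gb, read_ops, write_ops,
                 price_per_gb, price_per_10k_reads, price_per_10k_writes):
    """Capacity + transaction charges only; ignores egress and retrieval fees."""
    capacity = stored_gb * price_per_gb
    transactions = (read_ops / 10_000) * price_per_10k_reads \
                 + (write_ops / 10_000) * price_per_10k_writes
    return capacity + transactions

stored_gb = 50_000        # assumption: 50 TB in the Azure copy
read_ops  = 20_000_000    # assumption: DDB verification / aux copy reads per month
write_ops = 5_000_000     # assumption: backup writes per month

hot  = monthly_cost(stored_gb, read_ops, write_ops, 0.018, 0.004, 0.05)   # placeholder Hot rates
cool = monthly_cost(stored_gb, read_ops, write_ops, 0.010, 0.010, 0.10)   # placeholder Cool rates
print(f"Hot:  ~${hot:,.0f}/month")
print(f"Cool: ~${cool:,.0f}/month")
```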

 

If I read your question about dedicated versus shared storage correctly, you would like to know whether you can use the same storage container for immutable and non-immutable storage, and potentially for other storage targets as well?

In my book you create separate storage accounts per use case, and in this case per DDB. This is also in line with how Command Center expects you to use storage.

 

I am not going to talk about deduplication savings, as that is highly dependent on your environment, the change rate of your applications, whether there are encrypted databases, and so on. Please use your existing environment and the usual best practices to calculate storage usage and deduplication savings.

As a rule of thumb I calculate 30% object storage overhead, because space reclamation does not work on object storage. Commvault does, however, have a space reclamation schedule that can read a given piece of data, dehydrate it, and upload it again; keep in mind that this also incurs egress network bandwidth and API calls. Using that schedule, the 30% overhead can be reduced to roughly 10% depending on the aggressiveness value you choose, but I believe the cost of running the operation outweighs the storage savings. I would highly suggest using the “Do not deduplicate against objects older than” option here (set to 2x or 3x your retention).
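
As a rough illustration of those overhead figures, a back-of-the-envelope sketch (the backend size is a made-up number; use whatever your own sizing gives you):

```python
# Object storage footprint using the rules of thumb above.
backend_tb = 100                          # assumption: deduplicated backend size of the Azure copy

without_reclamation = backend_tb * 1.30   # ~30% overhead if space reclamation is never run
with_reclamation    = backend_tb * 1.10   # ~10% overhead with an aggressive reclamation schedule

print(f"Without reclamation: ~{without_reclamation:.0f} TB")
print(f"With reclamation:    ~{with_reclamation:.0f} TB")
# The reclamation runs themselves cost reads, egress and API calls,
# so weigh the ~20 TB saved here against those operation charges.
```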

NOTE: We are in discussion with product management on this topic and are hopeful for an improvement in a future version that might change the above drastically.

 

This is all for data storage and cost calculations without immutability enabled…

 

The biggest drawback when you enable immutability in Commvault today is that it forces the deduplication database to be sealed periodically. This burden is much bigger if you have a (very) short retention, like 14 days. With immutability enabled, a sealed DDB can only be removed (macro pruned) once no job references its primary blocks any more. So you will have 2 or 3 DDBs in play, each hosting 14 days of retention, at any given time; in effect you end up with around 42 days’ worth of data on storage, or roughly 21 days if you seal every 7 days.

NOTE: If there is any dangling job in the DDB, it will not prune and will continue to consume its complete storage footprint until that is resolved!

 

So whatever cost you calculate, multiply it by 3 to incorporate immutability, AND budget some man hours (FTE) for people to follow up on DDBs that are not pruning, as that will likely be needed.
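
Put together with the earlier overhead rule of thumb, a minimal sizing sketch could look like this (the backend size is again an assumption, and the factor of 3 is the rule of thumb above, not an exact figure):

```python
# Rough capacity planning for the immutable Azure copy.
backend_tb        = 100    # assumption: deduplicated backend size without immutability
object_overhead   = 1.30   # object storage overhead (see above)
immutability_mult = 3      # sealed DDBs co-existing with a 14-day retention

immutable_tb = backend_tb * object_overhead * immutability_mult
print(f"Plan for roughly {immutable_tb:.0f} TB of blob capacity for the immutable copy")
```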

 

I know the above paints a rather expensive and unflattering picture of immutability. Please know this has been brought to the attention of Commvault product management, and there are discussions and improvements planned, but they take time and will likely not appear anytime soon.


Hello, 

Thanks for the answer.

Regarding the storage tier in Azure: I think Hot is the better idea. I checked the cost of storing the data and compared both tiers; when the DDB verification process started, the cost of the Cool storage increased dramatically (read operations against the storage account). 

Regarding shared versus dedicated immutable storage: my question is really how much data could be saved if we use shared storage. We have 20 sites whose data will be copied to immutable storage in Azure. 

I know that I need to use a new storage account for the Commvault immutable solution, because some settings can only be configured when the storage account is created. 

So, as I understand it, you do not recommend deduplication for the immutable storage solution, correct? Based on this sentence: 

“I would highly suggest using the “Do not deduplicate against objects older than” option here (set to 2x or 3x your retention).”

Regards, 

Michal


Hello @Michal128,

 

You are very welcome.

 

Whether to use deduplication with immutability is a question I cannot answer for you. The impact of immutability, caused by sealing the DDB, increases dramatically with shorter retention configurations (7 days, 14 days, etc.). This is because a complete baseline has to be sent every time the DDB is sealed, which could lead to missed SLAs on keeping your cloud copy in sync within a reasonable time frame.

 

If you decide not to use deduplication, then whether you use 1 storage account or 20 (1 per remote location) does not matter and is a matter of preference. If you do choose deduplication, however, I’d suggest using 1 storage account and 1 DDB (consider a horizontally scaled DDB in this case), as it will also deduplicate data across your 20 remote locations and reduce the footprint of stored data.
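
As a purely illustrative sketch of that trade-off (all inputs are assumptions, and it ignores retention cycles and incremental growth), the comparison I have in mind looks roughly like this:

```python
# Illustrative footprint comparison: 20 dedicated non-deduplicated copies
# versus 1 shared deduplicated copy. All inputs are assumptions -- use the
# front-end sizes, change rates and dedupe ratios of your own sites.

sites        = 20
front_end_tb = 10       # assumption: average front-end data per site
dedupe_ratio = 0.4      # assumption: ~60% savings from a single shared DDB
cross_site   = 0.9      # assumption: extra savings from data common to several sites

dedicated = sites * front_end_tb                       # each site keeps its own full copy
shared    = sites * front_end_tb * dedupe_ratio * cross_site

print(f"20 dedicated, non-deduplicated copies: ~{dedicated:.0f} TB")
print(f"1 shared, deduplicated copy:           ~{shared:.0f} TB")
# Offset this saving against the full baseline that has to be re-sent
# every time the shared DDB is sealed for immutability.
```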

 

I’m sorry I cannot provide a better answer here, as the decisions you make impact your SLA and/or your cost picture.

 

Regards,
Mike


Hello @mikevg 

Thanks for your answer. Regarding DDB sealing: it will happen every 28 days. That configuration will be applied by the Commvault workflow, where the DDB seal interval is set to 2 times the retention, so in that configuration the total retained data should be 42 days at most. 

Another question I have: how much data (storage size) should be expected for the immutable storage solution? Is it simply the storage size without immutability multiplied by 3? 

Regards, 

Michal  


Hello @Michal128,

 

I do not know off-hand what the sealing interval will be when you configure it these days; in the past it was equal to your retention in days, which would be 14 days in your use case.

Commvault is constantly trying to improve in this regard, so it could well be that this has indeed changed.

 

Regarding your expected data usage, this is a hard question due to all the points I raised above and the fact that a sealed DDB can only be removed, and therefore pruned, once all jobs (references) are gone.

 

Commvault’s guidance is to plan for 3 times your calculated backend size when you enable immutability.

 

Mike

