We are just beginning to migrate some local workloads to Azure. As part of this, we are creating new storage policies that will back up directly to Commvault media agents in Azure. The Commvault design guide recommends a 512k dedupe block size, to reduce the number of GET and PUT operations and keep costs down.
I have a couple of concerns with this. First, using 512k instead of the old default of 128k will reduce my deduplication ratio and use more space, which is also a cost.
Second, I would like to AUX copy the Azure copy of the backup to my on-premises install. My on-premises DDBs are all 128k. If I use 512k in Azure and want to copy the data on-premises, I will have to create a new DDB using 512k. This will again consume more space and send more data across our ExpressRoute, which again means more cost.
Does anyone have any way/ideas on how to estimate the cost delta of using 512k vs 128k, as it relates to the initial storage, secondary storage, and egress cost for secondary copy?
I am thinking about using 128k in Azure to match our on-premises DDBs.
Azure 512k dedupe recommendation?
Best answer by Damian Andre
The larger block size isn't just about the GET/PUT operation cost, which is usually minimal, but also about performance. Retrieving larger blocks greatly improves read (i.e., restore) performance.
Per the documentation, here are the recommendations. You want to match the source copy's block size with the destination copy's, so in this case you are probably better off with 128k for copy performance and efficiency, with a slight ding on read performance from the secondary copy.
We recommend you to use default block size of 128 KB for disk storage and 512 KB for cloud storage. If cloud storage is used for secondary copies (that use disk copies as source), then we recommend you to use same block size as the source copy.
For a complete cloud environment where all copies use cloud storage, we recommend to use default block size of 512 KB.
For a mixed environment where some workloads use cloud storage for both primary and secondary copies and other workloads use primary and secondary disk storage to cloud, we recommend you to create separate storage pools with different block size, as follows:
512 KB (default) for complete cloud workloads
128 KB for secondary copies that use disk copies as source
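On the original question of estimating the cost delta, a rough back-of-the-envelope model may help. The sketch below is only an illustration: the function name, the prices, and the dedupe ratios are all hypothetical placeholders, not Commvault or Azure figures. It also assumes one storage request per dedupe block, which is a simplification and probably an upper bound on request counts, and that the first AUX copy sends all unique data across the link. Substitute ratios measured from your own DDBs and prices from your own Azure agreement.

```python
KB_PER_TB = 1024 ** 3  # binary units: 1 TB = 1024^3 KB


def estimate_costs(front_end_tb, block_kb, dedupe_ratio,
                   price_per_tb_month, price_per_10k_ops, egress_per_tb):
    """Rough cost estimate for one dedupe block size.

    All inputs are ASSUMPTIONS supplied by the caller:
      dedupe_ratio       front-end size / stored size (measure this yourself)
      price_per_tb_month storage price per TB per month
      price_per_10k_ops  price per 10,000 storage requests
      egress_per_tb      price per TB sent to on-premises
    """
    stored_tb = front_end_tb / dedupe_ratio
    # Simplification: treat every dedupe block as one storage request.
    requests = stored_tb * KB_PER_TB / block_kb
    return {
        "stored_tb": stored_tb,
        "storage": stored_tb * price_per_tb_month,
        "requests": requests / 10_000 * price_per_10k_ops,
        # Assumes the initial AUX copy transfers all unique data.
        "egress": stored_tb * egress_per_tb,
    }


def cost_delta(a, b):
    """Per-category difference b - a between two estimates."""
    return {key: b[key] - a[key] for key in a}


if __name__ == "__main__":
    # Placeholder numbers only; replace with your own measurements and prices.
    prices = dict(price_per_tb_month=20.0, price_per_10k_ops=0.05,
                  egress_per_tb=80.0)
    est_128 = estimate_costs(100, 128, dedupe_ratio=5.0, **prices)
    est_512 = estimate_costs(100, 512, dedupe_ratio=4.0, **prices)
    for name, value in cost_delta(est_128, est_512).items():
        print(f"{name}: {value:+.2f}")
```

Storage is a recurring monthly charge while the initial egress is one-time, so compare the categories over the retention period you care about rather than adding them up directly.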