I have a 6-node HyperScale cluster on-prem for the primary copy. For the secondary copy I need to move the data to cloud (archive storage).
My question is, when I create the Cloud Storage Pool:
1) Do I select the existing (on-prem) dedup path (/ws/ddb/P_1/Copy/_21/Files/31), or
2) Do I need to create a dedicated dedup path on the on-prem MA?
If I need option 2, what is the recommended dedup partition value, and what is the reason behind it?
Please also share your best practices for hybrid data protection, if any.
Thanks,
Manikandan
Just to prevent high load on your primary tier, I would personally use a different MA for the cloud dedup than your primary HyperScale, either on-prem or in the cloud, and host the DDB there.
What do you mean by the recommended dedup partition value? Do you mean the partition block size, or do you have something specific in mind?
With HyperScale, the DDB/Index drive configuration allows for hosting a cloud DDB locally on the HyperScale nodes without needing additional infrastructure. However, there are some limitations to be aware of:
The cloud copy should not exceed the size of the HyperScale Storage Pool.
This is not supported on HyperScale HS1300 appliances hosting a CommServe VM.
When creating the new DDB partition, create a new folder for this DDB on each node under the top-level folder (/ws/ddb):
/ws/ddb/<ddb_folder>/<partition_folder>
For example: /ws/ddb/cloud_ddb/P_1
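To make the layout above concrete, here is a minimal Python sketch that builds the /ws/ddb/<ddb_folder>/<partition_folder> path and prints the folder you would create on each node. The helper name cloud_ddb_path, the node names, and the "Cloud" folder name are hypothetical and only for illustration; the P_<n> partition naming is taken from the paths in this thread. It only prints the commands, it does not touch the nodes.

```python
from pathlib import PurePosixPath

# Top-level DDB folder on each HyperScale node, as named in this thread.
DDB_ROOT = PurePosixPath("/ws/ddb")


def cloud_ddb_path(ddb_folder: str, partition: int) -> str:
    """Return /ws/ddb/<ddb_folder>/P_<partition> as a string."""
    if not ddb_folder or "/" in ddb_folder:
        raise ValueError("ddb_folder must be a single folder name")
    if partition < 1:
        raise ValueError("partition numbers start at 1")
    return str(DDB_ROOT / ddb_folder / f"P_{partition}")


if __name__ == "__main__":
    # Hypothetical node names for a 6-node cluster.
    nodes = [f"node{i}" for i in range(1, 7)]
    for node in nodes:
        # Print only; run the mkdir yourself on each node after review.
        print(f"{node}: mkdir -p {cloud_ddb_path('Cloud', 1)}")
```

Keeping the cloud DDB in its own folder next to the local one (rather than inside /ws/ddb/P_1) is what keeps the two DDBs separate on disk.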
Manikandan,
We use the same Media Agent to house the on-prem and cloud copy DDBs all the time. You just need to review sizing to make sure you have the resources, but a 6-node HyperScale cluster should have plenty of headroom to support the deduplication of both copies. I believe the “partition value” is the block size for the cloud. Since some cloud providers charge more for “Transactions”, the larger page size can cut costs. The Puts aren’t so much the issue, but Read performance will improve with a larger page size, and Deletes can put a huge load and cost on the system if the page size is too small. I think CV is recommending a 512 KB size.
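As a rough illustration of why a larger block size means fewer transactions, here is a small Python sketch that counts how many blocks a given amount of data splits into. It is a simplified model that assumes one request per block, which is not necessarily how the data is actually laid out in the cloud library; the 128 KB figure is only a comparison point, and block_count is a hypothetical helper name.

```python
KIB = 1024
TIB = 1024 ** 4


def block_count(data_bytes: int, block_size_kib: int) -> int:
    """Number of blocks needed to hold data_bytes (integer ceiling division)."""
    block_bytes = block_size_kib * KIB
    return -(-data_bytes // block_bytes)


if __name__ == "__main__":
    for size_kib in (128, 512):
        print(f"{size_kib} KB blocks per TiB: {block_count(TIB, size_kib)}")
```

Under that assumption, going from 128 KB to 512 KB cuts the number of blocks, and so the number of per-block operations, by a factor of four for the same amount of data.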
Thanks for your valuable responses. I am really impressed by the answers from such great techies.
From the conversation, I understand that I need to create a subfolder for the cloud DDB under the current DDB directory on each node.
Current local DDB path: /ws/ddb/P_1
Proposed cloud DDB path: /ws/ddb/Cloud/P_1
Correct me if I am wrong.
Understood; my phrasing was a bit confusing. I am asking about the recommended DDB partition count.
On each MA, what is the recommended number of DDB partitions for cloud: one or two?
Thanks,
Mani
This is a sub-question that arose while going through the above documentation. As of now, we are moving the monthly/annual backups to cloud as a “tape replacement scenario” with deduplication.
But the statement below says that dedup won’t help much for long-term retention. Can anyone share more details about that scenario in terms of performance, cost, etc.?