Solved

Cloud Libraries and AWS Combined Storage Tiers

  • 25 February 2021
  • 2 replies
  • 1019 views

Badge +1

Hey guys,

I’m currently using S3 IA for my cloud libraries (with deduplication) and am looking to reduce costs. The combined storage tiers look promising, in particular Intelligent-Tiering/Glacier. Has anyone got any experience using this, and can you offer some insight into its suitability?

Cheers,

Steve


Best answer by Damian Andre 25 February 2021, 03:27


2 replies

Userlevel 7
Badge +23

Hey @steveg,

Great question, and I highly recommend you take a look at a few chapters of the AWS architecture guide located here.

https://documentation.commvault.com/commvault/v11_sp20/others/pdf/public-cloud-architecture-guide-for-amazon-web-services11-20.pdf

Check out page 16, “Storing Data Efficiently” - here is one extract relevant to your question.

 

A note on S3 Intelligent-Tiering: you will note that the S3 Intelligent-Tiering storage class is not represented. This is because Intelligent-Tiering makes data placement decisions based on access frequency. In Commvault, data is split into warm indexing data, which allows for locating data chunks distributed across large data vaults, and cool/cold stored data. Commvault does not recommend the use of S3 Intelligent-Tiering, but instead advocates the use of Commvault combined storage classes (more below) to ensure you can efficiently locate and surgically recall data in minimal timeframes. The benefit of using Commvault combined storage tiers is that small, warm indexes are kept in low-latency storage classes with millisecond first-byte latency, so a surgical restore of cool/cold data starts with minimal delay while still leveraging the low cost of the cooler storage classes.
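To make the “surgical recall” point a bit more concrete: objects in a Glacier-class tier can’t be read directly - they have to be restored back to a readable state first, which is where the delay comes from, and why keeping the small indexes in a warm class matters. Commvault handles all of this for you with combined storage tiers; the boto3 sketch below only illustrates the underlying S3 mechanics, and the bucket/key names are made up for illustration.

# Illustrative only: this is the raw S3 workflow that combined storage tiers automate.
# Bucket and key names are hypothetical.
import boto3

s3 = boto3.client("s3")

# Warm index data sits in a low-latency class (e.g. STANDARD / STANDARD_IA)
# and can be read immediately with a plain GetObject.
index_obj = s3.get_object(Bucket="cv-demo-bucket", Key="index/chunk-map.idx")

# Cold chunk data in a Glacier class must be restored before it is readable.
s3.restore_object(
    Bucket="cv-demo-bucket",
    Key="data/chunk-000123",
    RestoreRequest={
        "Days": 1,  # keep the temporary readable copy for 1 day
        "GlacierJobParameters": {"Tier": "Standard"},  # typically hours, not milliseconds
    },
)

# Poll until the restore completes; after that the chunk reads like any other object.
head = s3.head_object(Bucket="cv-demo-bucket", Key="data/chunk-000123")
print(head.get("Restore"))  # e.g. 'ongoing-request="true"' while the restore is still running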

There are always penalties with the cheaper storage tiers, and the table on page 17 tells part of the story. Additionally, data in Glacier cannot qualify for granular pruning, since we do not have that level of access to it. So generally, you have to scope out your retention and then seal the deduplication database periodically, matched to that retention, to allow the data to be deleted. Without doing that, new jobs rely on the old ones and the data can’t be aged.
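Here is a quick worked example of why the sealing cadence matters. The numbers are assumptions purely for illustration (not a recommendation), and it simplifies the actual aging behaviour, but it shows the shape of it: because a sealed store can only be pruned once every job in it has aged, data written right after a seal waits roughly one full sealing cycle plus the retention period before it actually disappears from Glacier.

# Illustrative arithmetic only; retention and sealing cadence are assumed values in days.
RETENTION_DAYS = 90        # how long each backup job must be retained
SEAL_INTERVAL_DAYS = 90    # how often the deduplication database is sealed

# A sealed store is pruned only once every job referencing it has aged.
# Data written just after a seal waits for the rest of that cycle's jobs to be
# written (SEAL_INTERVAL_DAYS), then for the last of them to age (RETENTION_DAYS),
# before the store - and the Glacier data behind it - can be deleted.
worst_case_days = SEAL_INTERVAL_DAYS + RETENTION_DAYS
print(f"Worst case before Glacier data is actually deleted: ~{worst_case_days} days")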

It may seem daunting at first, but if you get the parameters right you can save a significant chunk of change. You just need to consider the costs of recovery, time to recover, and retention of the data. The document I referenced above is the perfect guide to help you plan it out.
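And if it helps with the planning, here is a rough back-of-the-envelope cost sketch. All the prices and volumes are placeholder assumptions (check current AWS pricing for your region), but it shows the trade-off: the cooler classes win on storage cost and lose on retrieval cost and time to first byte.

# Back-of-the-envelope comparison; every figure is a placeholder assumption.
# Storage rates are USD per GB-month, retrieval rates are USD per GB.
TIERS = {
    "S3 Standard-IA":          (0.0125,  0.01, "milliseconds"),
    "S3 Glacier Flexible":     (0.0036,  0.01, "minutes to hours"),
    "S3 Glacier Deep Archive": (0.00099, 0.02, "hours"),
}

stored_gb = 50_000            # assumed back-end size after deduplication
restored_gb_per_month = 500   # assumed monthly restore volume

for tier, (storage_rate, retrieval_rate, latency) in TIERS.items():
    monthly = stored_gb * storage_rate + restored_gb_per_month * retrieval_rate
    print(f"{tier:26s} ~${monthly:>9,.2f}/month, first byte: {latency}")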

Badge +1

Hi Damian,

Great doc, thanks very much. I will be using that to plan this.

Thanks again

Steve
