Solved

Storage Sizing for VMware workload

  • 27 January 2022
  • 7 replies
  • 575 views

Badge +4

I've joined the community to seek your advice on a subject that I'm struggling to grasp.

Please accept my apologies for the lengthy explanation.


We are currently protecting our virtualized data with Snap backups in Commvault configured to retain 30 days of backups (7 incremental daily backups and 4 weekly full backups) and monthly and yearly backups sent off to Azure cloud for 7-year retention.
We are achieving quick restores with the Snap backup, and we are now planning to introduce a SAN storage system on-premises to manage backup copies of snapshot data.


As part of this, I provided detailed VM data sizing to estimate the storage capacity to purchase.
I provided the total number of VMs to be backed up along with the application size of the existing data sets, and the NetApp sales team used their sizing tool to estimate the capacity required for the next three years based on the total number of VMs, the backup frequency, the retention periods, and an assumed 50% compression/deduplication saving for our virtual workload.
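To make the sizing approach concrete, here is a minimal sketch of how a capacity estimate like this is typically built from front-end (application) size, retention, and an assumed data-reduction saving. All numbers and the change-rate model are illustrative assumptions, not the actual figures from the NetApp tool:

```python
# Hypothetical sizing sketch: back-end capacity from application size,
# retention counts, and an assumed dedup/compression saving.
# The change-rate model and all figures below are illustrative only.

def estimate_backend_tb(app_size_tb, daily_change_rate, fulls_retained,
                        incrementals_retained, reduction_saving):
    """Rough estimate: one deduplicated baseline, plus the changed blocks
    retained by each incremental and each weekly full, after reduction."""
    baseline = app_size_tb
    incremental_data = app_size_tb * daily_change_rate * incrementals_retained
    # assume each weekly full retains roughly a week's worth of changed blocks
    full_deltas = app_size_tb * daily_change_rate * 7 * fulls_retained
    raw = baseline + incremental_data + full_deltas
    return raw * (1 - reduction_saving)

# e.g. 100 TB of VM application data, 2% daily change rate,
# 4 weekly fulls + 7 daily incrementals retained, 50% savings
print(round(estimate_backend_tb(100, 0.02, 4, 7, 0.50), 1))
```

The key point for the later discussion: the estimate starts from application size (the front-end figure), and the reduction saving is applied as a planning assumption on top of it.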

This sizing was captured in September of last year.
Since then, the VM application size has grown by 10 TB due to the introduction of approximately 15 VMs between September and January, yet there has been no significant increase in back-end storage usage.
As a result of this growth, the total estimated storage requirement has changed.
Now my management wants to know why the SAN storage estimate has increased even though the addition of 15 new VMs caused no actual increase in back-end storage usage. I am struggling to explain that application size is the amount of application data before compression and deduplication, that this metric is what is generally used for estimation, and that the back-end data written (BET) cannot be used for it.


Could you please explain why the application size is used for estimations rather than the backend data written size (BET)?  

Appreciate your help here.


Best answer by Mike Struening RETIRED 8 February 2022, 20:57



7 replies

Userlevel 7
Badge +23

Great thread topic!

There’s definitely a lot to unpack here.  So basically what we are seeing is more data being protected, but nothing additional being utilized on the destination library.

My first question regarding the vms/backups is: how similar is this data compared to the existing backup data/vms?

If the answer is ‘very’ or even close, then you might be seeing an increase in the dedupe ratio.  I’ve seen a few environments where the dedupe ratio is extremely high (99% or so), so each subsequent backup added almost nothing to the library; however, that also meant pruning jobs reclaimed very little, because the baseline blocks were still needed by each of the retained jobs.
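As a quick back-of-the-envelope illustration of the effect Mike describes (figures are made up, not from this environment): at a very high dedupe ratio, a sizable batch of new, similar VMs lands on the library as almost nothing.

```python
# Illustrative arithmetic only: with a very high dedupe ratio, new VMs
# that closely resemble existing ones add almost nothing to the library,
# while the shared baseline blocks stay referenced (so pruning frees little).

app_size_tb = 10.0    # hypothetical application size of the new VMs
dedupe_ratio = 0.99   # 99% of their blocks already exist in the dedupe store

new_data_written_tb = app_size_tb * (1 - dedupe_ratio)
print(new_data_written_tb)  # only ~0.1 TB actually lands on the library
```

This is consistent with what the original poster observed: 10 TB of new application size with no visible growth in back-end usage.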

Can you take a look at the new vm backups and share the application size as well as the data written size?

There could certainly be more going on here (I’m assuming no pruned jobs or jobs that are not actually completing), though this is a good first place to start.

Userlevel 7
Badge +19

Mind you that the application size takes into account the provisioned size of the VMs, which says nothing about the actual data stored on the virtual disks, because it includes white space. 
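To put made-up numbers on that point: a per-VM application size built from provisioned capacity over-counts white space, so it can sit far above what the disks actually hold. The VMs and figures below are hypothetical:

```python
# Illustrative only: provisioned capacity vs. data actually in use.
# An application size derived from provisioned disks includes white
# space that holds no real data.

vms = [
    # (provisioned_gb, used_gb) -- hypothetical VMs
    (500, 120),
    (1000, 350),
    (250, 240),
]

provisioned_total = sum(p for p, _ in vms)
used_total = sum(u for _, u in vms)

print(provisioned_total, used_total)  # 1750 vs 710: the gap is white space
```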

@Nikramak I think this page delivers the information that should result in a clear answer: https://documentation.commvault.com/11.24/expert/30801_size_measures_for_virtual_machines.html

Badge +4

Thank you, @Mike Struening  and thank you, @Onno van den Berg , for your feedback. 

 

I'm noticing some odd behaviour.

Per your advice, while reviewing the backup history to see how many virtual machines were added over time and how much deduplication efficiency was achieved on them, I discovered a discrepancy between the "application size" and "backup size" on all of our monthly full snaps. 
This isn't the case with the weekly or daily snaps, as you can see in the screenshot. On our monthly fulls, the application size appears to be much smaller compared to the other full jobs.

 

I only looked at the monthly full jobs and compared them with the most recent full job, which led me to conclude there was a spike in usage. So technically, the application size hasn't changed much in the last five months (even though a few VMs were added, they weren't particularly large). It's just unfortunate that I had already provided NetApp with the monthly (low) application size stats for storage estimation, so the sizing must be redone and the budget recalculated because of this issue. 

 

VSA Subclient A

 

 

VSA Subclient B

Is this something you've seen before?  

 

I've also opened Commvault support ticket 220128-276 to investigate this. 

 

Also to understand better, will you please be able to explain why we size storage based on application size or provisioned capacity and not on data written size? 

Userlevel 7
Badge +19

The answer to your first question is that Commvault is not aware of the actual size of the data that resides in the snapshot, but during a backup copy it copies the unique blocks over to secondary storage. At that moment it knows how large the data actually is on disk. 

Calculating with data written is not smart, because that figure is the amount of unique data actually written to storage. The value is the result of many factors, such as deduplication and compression. If your application suddenly changes due to an update, it could produce what looks, from a backup perspective, like unexpected growth. Using this figure as the key measure would force very close capacity monitoring and a lot of guesstimates. 
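A small sketch of why data written makes a poor sizing input (all figures invented for illustration): it is the residue left after dedup and compression, so a swing in data-reduction efficiency, say an update that rewrites many blocks, moves it sharply even while the application size stays flat.

```python
# Illustrative only: data written is what remains after data reduction,
# so a change in reduction efficiency swings it sharply while the
# front-end (application) size stays the same.

app_size_tb = 50.0

def data_written_tb(app_size, reduction_saving):
    return app_size * (1 - reduction_saving)

steady = data_written_tb(app_size_tb, 0.90)        # normal day: 90% savings
after_update = data_written_tb(app_size_tb, 0.40)  # update writes many unique blocks

print(steady, after_update)  # same front-end size, 6x more data written
```

Sizing against the stable front-end figure (application size) and treating the reduction saving as a planning assumption avoids chasing this volatility.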

Badge +4

@Onno van den Berg, thanks for your response.

 

Could you please elaborate on your response to my first question regarding why we are seeing a change in the App size for our monthly full jobs that are picked up for backup copy to the Azure cloud library?
If Commvault is unaware of the actual size of the data contained in a snapshot but obtains this information during a backup copy, am I required to disregard the App size displayed in the Snap copies and instead use the App size from the Snap job that picked up the backup for my storage sizing?

Additionally, is it common for the App size to differ from the total VM backup size?
Because I can see that the Snap copy job that was selected for backup copy has a backup size of 3.08 TB but an application size of 1.05 TB.
This is true for all monthly jobs (for all the other snap copy jobs, the App size and backup size show the same value).  Sorry, please let me know if my question is unclear.

 

 

Userlevel 7
Badge +23

I think what @Onno van den Berg mentioned (and I didn’t think of) earlier might be a big factor.

There was an issue discovered within the last year or so where files in the virtual disk were not getting backed up because we were seeing the sector of the disk as ‘deleted’ (due to a mix of vm reporting and our interpretation) and never trying to back it up again.  As a result, we now back up the ENTIRE disk regardless of change, white space and all.

I believe what you are seeing here is a result of how much disk space is “used” vs. how much we are preparing to back up regardless (and then there’s the dedupe factor, etc.).

 

Badge +4

Thanks for the clarification @Onno van den Berg and @Mike Struening