Solved

Backup Plans vs Storage Policies

  • 27 January 2021
  • 29 replies
  • 1166 views

Userlevel 2
Badge +4

Newbie Alert!

 

Hello Everyone

 

I am new to Commvault and took over an environment that has more than 200 storage policies. These policies were created as a result of plans created in Command Center. We have more than 2000 VMs, 200+ file system clients, and databases. I was wondering how I should approach the cleanup. Is it a good idea to use plans for larger environments, given that a single plan automatically creates multiple configurations?

 

Thank You


Best answer by MFasulo 27 January 2021, 18:57

Each plan should have a corresponding storage policy in the Java UI, so there shouldn't be a 1-to-many ratio unless many plans have been deleted or you are using region-based plans.

Let's ignore what we see in the Java UI for a minute.

 

A server plan is an encapsulation of: name, storage target, retention, backup frequency, schedule, content to back up (for FS) and options, snapshot options, and database options, all rolled into one "container".

 

Let's break each aspect down. I'll call out some of the words to draw attention to them, as they are either labels in the UI or terms that need focus. We will start from top to bottom on the create server plan window.

Plan name is the friendly name of the plan. I always recommend using a name that describes what the plan is for, what storage it uses, and maybe its frequency. Something like Daily_FSVM_Netapp_MCSS_30_90, so I know that it's my daily FS/VM plan that writes to my NetApp, with a secondary copy to MCSS (Metallic Cloud Storage Service), primary retention of 30 days, and the MCSS copy kept for 90.
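As a rough sketch (my own convention, not a Commvault requirement), a naming scheme like that is easy to generate and parse programmatically, which helps when auditing a few hundred plans. The field order below is an assumption based on the example name:

```python
# Sketch: build/parse a descriptive plan name such as Daily_FSVM_Netapp_MCSS_30_90.
# The field order (frequency, workload, primary storage, secondary storage,
# primary/secondary retention in days) is my assumed convention, not a product rule.

def build_plan_name(frequency, workload, primary, secondary, ret_primary, ret_secondary):
    return f"{frequency}_{workload}_{primary}_{secondary}_{ret_primary}_{ret_secondary}"

def parse_plan_name(name):
    frequency, workload, primary, secondary, ret_p, ret_s = name.split("_")
    return {
        "frequency": frequency,
        "workload": workload,
        "primary_storage": primary,
        "secondary_storage": secondary,
        "primary_retention_days": int(ret_p),
        "secondary_retention_days": int(ret_s),
    }

name = build_plan_name("Daily", "FSVM", "Netapp", "MCSS", 30, 90)
print(name)  # Daily_FSVM_Netapp_MCSS_30_90
print(parse_plan_name(name)["primary_retention_days"])  # 30
```

A convention like this lets a simple script flag plans whose configured retention drifts from what the name claims.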

The storage target and retention are part of the "Backup Destination" section of the plan. This is where you tell the backups where to go and how long to keep them. You can also enable extended retention rules here if you need to keep certain aspects longer. The retention also has a role in the schedule; we will get to that in a minute.

The next part is RPO; this is your "schedule". This is the incremental backup frequency in minutes/hours/days/weeks/months/years, along with the start time, which lets you control how often and when backups can run. This can be further restricted to specific days and times via the "Backup window". A backup window controls when backups can run at the plan level, similar to a "Blackout window", which can be applied at the cell, company, or server group level. Under "RPO" you can also schedule traditional full backups, with corresponding options and windows. Remember when I mentioned that retention has a role in the schedule? This is where the automatic synthetic full gets its schedule from. In order for a cycle to close, and for pruning to remove older backups based on the retention, we must run a full or synthetic full (a cycle runs full to full). So when you create the plan and set the retention, the synthetic full backup schedule is aligned to that timeframe. This is how we close cycles in the event you never run traditional fulls.
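To make the cycle logic concrete, here is a toy model (my own simplification, not Commvault's actual aging logic) of why a full or synthetic full has to run before older backups can age off:

```python
# Sketch: a "cycle" runs full to full, and pruning only removes jobs whose
# cycle is closed (a later full exists) AND which are past retention.
# Simplified illustration of the behavior described above, not product code.

def prunable_jobs(jobs, retention_days, today):
    """jobs: list of (day, type), type in {'full', 'synth_full', 'incr'}."""
    full_days = [d for d, t in jobs if t in ("full", "synth_full")]
    prunable = []
    for day, jtype in jobs:
        cycle_closed = any(f > day for f in full_days)  # a later full exists
        past_retention = (today - day) > retention_days
        if cycle_closed and past_retention:
            prunable.append((day, jtype))
    return prunable

history = [(0, "full"), (1, "incr"), (2, "incr"), (30, "synth_full"), (31, "incr")]
# Day 40, 7-day retention: days 0-2 are past retention and their cycle was
# closed by the day-30 synthetic full, so they can go. The day-30 cycle has
# no later full, so it never closes and is kept even though it is past retention.
print(prunable_jobs(history, retention_days=7, today=40))
```

This is why the automatic synthetic full schedule matters: without it, the last cycle never closes and nothing ages off.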

The collapsed "Folders to backup" section is your content to back up for file systems. By default this is everything, but it also allows you to filter files, folders, and patterns, and to control system state options and other components. Say you wanted to back up only a single folder that exists across your entire file server estate: you can control this from a single plan using this method.

The "Snapshot options" section allows you to control recovery points and their retention, and to enable backup copy operations and their runtime. I would recommend you venture into BOL for more details on that, as I don't think it pertains to the summary of this post.

Finally, the database options, which allow you to control the log backup RPO and whether you want to use the disk caching feature. Transaction logs (translogs) usually have more aggressive protection needs than the regular RPO option provides, so this option controls when translogs get protected.

 

With these foundational aspects understood, when looking to consolidate, start with the least restrictive set of machines that need to go to a given storage target; that should be one plan. As requirements for run time, backup type, or storage target change, those would require additional plans. This should help narrow down the scope of what needs to stay and what may be overlap.
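One way to surface overlap candidates is to group existing plans by the requirements that actually force a separate plan. A minimal sketch, with hypothetical plan attributes (the grouping key is my assumption based on the criteria above):

```python
# Sketch: group plans by the attributes that justify a separate plan
# (storage target, retention, RPO, backup window); groups with more than
# one plan are consolidation candidates. Field names are illustrative only.
from collections import defaultdict

plans = [
    {"name": "PlanA", "storage": "Netapp", "retention_days": 30, "rpo_hours": 24, "window": "any"},
    {"name": "PlanB", "storage": "Netapp", "retention_days": 30, "rpo_hours": 24, "window": "any"},
    {"name": "PlanC", "storage": "MCSS",   "retention_days": 90, "rpo_hours": 24, "window": "night"},
]

groups = defaultdict(list)
for p in plans:
    key = (p["storage"], p["retention_days"], p["rpo_hours"], p["window"])
    groups[key].append(p["name"])

# PlanA and PlanB share every requirement, so they are merge candidates.
candidates = {k: v for k, v in groups.items() if len(v) > 1}
print(candidates)
```

With 200 plans, exporting the plan attributes and running a grouping like this should quickly show how many are genuinely distinct.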


29 replies

Userlevel 2
Badge +6

Hi Abdul,

 

May I ask how many plans you have configured in the environment? Generally it is now easier to manage environments with plans for common tasks; however, some more advanced requirements may still need manual manipulation of storage policies and copies.

 

If everything is running well right now, it may be worthwhile to leave it as is. If you are unsure whether things are running well, see if your environment reports to cloud.commvault.com at all to check some basic health statistics.

 

If you are still concerned about the health of the environment, it may be worthwhile to engage your Commvault account manager to get a quote for a professional health assessment.

Userlevel 2
Badge +4

Hi Jordan 

Thank you for the reply. 
Currently we have around 200 plans configured, and more than 50 are not associated with any clients. That's where we have to do the cleanup.
My question is: is it a good idea to start configuring plans instead of the traditional way of configuring backups with storage policies and manually created schedules when we have, in our case, more than 2000 VMs? We cannot configure a single plan to back up all the VMs, so we will create multiple plans, and each plan creates a separate storage policy. This will eventually create redundant storage policies with similar retention rules. I hope I was able to explain my point.
 

Thank you 

Userlevel 4
Badge +10

That is a lot of plans for a VM footprint in the 2000+ range; even if you culled the 50 unused, that is still a lot. Plans, like storage policies, dictate where data is written (to one or more places) and how long it is held, but that is where the similarities end. The most visible distinction is that plans combine subclient and schedule policies under the one umbrella. If you have active plans that are functionally identical except for their schedules, you may consider consolidating clients under the same plan and manually decoupling them to use classic schedule policies. That may be counter to the direction Commvault development is heading, but if your goal is to simplify management, you should weigh up your options. Put it this way: 10 storage policies x 15 schedule policies can become 150 plans, i.e. 125 more things to manage :)


Userlevel 5
Badge +9

In addition to the extensive and detailed explanation from @MFasulo regarding plans, I would definitely recommend adhering to plans instead of going back to the old school. Plans are the future!

 

You can also see this in the functionality that has been added to the solution over the last few months: it no longer supports regular old school storage policies, e.g. when configuring through Command Center.

Userlevel 5
Badge +9


Onno, I could hug you for this (when the world opens back up, come to NJ; all the scotch you can drink is on me). Plans will continue to evolve, and there will be features that you can only take advantage of when leveraging plans. As we continue to look into incorporating ML-based logic into plans for things like RPO adherence and prioritization, plans become an essential part of the future of Command Center.


Userlevel 2
Badge +4

Thank you guys for the comments; I will stick to plans. Another point worth mentioning here, raised by one of the Commvault folks, is that we can derive plans from base plans, which prevents redundant storage policies.

Userlevel 5
Badge +9


Deal ;-) I'm a big fan of the idea behind it, because the concept is easier to understand for a larger audience, and everyone in the field implements the same thinking, so it becomes a more commonly used "standard". I do however hope that at some point some of the recent changes are reworked, because to me the idea behind plans is to pick the data and let algorithms and compute power calculate the best run time for the jobs, taking into account:

  • Run time
  • Last job result
  • RPO
  • Performance impact on the client computer when using agent-level backup, automatically throttling resource usage to keep application performance acceptable. If the RPO is impacted, send an alert telling the admin to increase system resources.
  • Run backup-related processes with low system process priority, but run restore-related processes with normal priority
  • RTO (stream calculation and/or possibly chopping the data into multiple smaller sets)
  • Adhere to the blackout window, but show an alert when the blackout window is blocking the RPO from being met.

To summarize…… AUTOMAGIC PLAN :-)

 

PS: I still need to write you an email regarding another topic. I hope I can make some time to outline my ideas.

Userlevel 2
Badge +4

Hi Guys.

Another question related to plans popped up in my mind: can we have a single plan that backs up, let's say, 200 VMs scheduled to trigger at 5 PM, or should we split this into 4 plans with 50 VMs each, triggering backups at different times, let's say 5 PM, 9 PM, 1 AM, and 5 AM? How should we approach this?

 

Thanks

Userlevel 4
Badge +10

@Abdul Wajid Having 200 or many more VMs within a single plan is perfectly fine. The number of workers that read the backups is set in the VM group, so if you have 200 VMs, you may want to make sure you have around 20 data readers. The more you can pump under a plan the better, but know your data and the customers you are protecting. The smarts in a backup product don't run the business; you align the backup product to the business and then provide guidance. :)
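Treating the ratio above (roughly one data reader per 10 VMs) as a rule of thumb rather than a product mandate, a quick sizing sanity check can be sketched as:

```python
# Sketch: size data readers for a VM group from a VMs-per-reader ratio.
# The 10:1 ratio is the rule of thumb mentioned above; tune it for your
# storage and access nodes. Illustrative only, not a Commvault formula.

def suggested_readers(vm_count, vms_per_reader=10, minimum=1):
    return max(minimum, round(vm_count / vms_per_reader))

print(suggested_readers(200))  # 20 readers for a 200-VM group
print(suggested_readers(50))   # 5
```

The `minimum` floor just ensures a tiny VM group still gets at least one reader.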

So yes, make sure you ask the VM owners whether they need backups to run at a specific time. For instance, I recently proposed a single server plan for a customer to protect their VMs, but when we dug deeper into what was being protected, it was not possible: many of the underlying servers were generating database dumps for warehousing, and the schedules had to be staggered because they wanted to be able to recover those overnight dumps that were cleared daily.

Userlevel 2
Badge +4

@Anthony.Hodges Thank you, that clears my doubt. Currently we have it this way: multiple plans for multiple groups triggering at different times. The reason was that triggering so many VMs at the same time could cause server overload and possibly break things. But with plans and VM groups that should not be the case; I believe we can control the number of streams that run at a given time and group together as many VMs as we can.

Userlevel 4
Badge +10

@Abdul Wajid Classic Commvault hypervisor subclients also let you specify the number of readers, so Command Center VM groups configured with plans are just a continuation of that logic. The rule of thumb I typically go for is about one reader for every 10 VMs in a group, so tune as necessary. So if your overnight backups start at 9 PM, you may have some incrementals that start a few hours later, and for most situations that is perfectly fine.

Badge

Hi There.

 

My understanding is that:

  • Using a base plan will avoid having multiple SPs. A plan derived from a base plan uses the same SP.
  • Commvault uses some kind of AI to make sure backups are taken within the backup window and does not run all jobs simultaneously.
  • We can still override the start time/number of readers at the subclient/VM group level if needed.

Is that true?

Thanks

 

Userlevel 5
Badge +9


We're on the same page. We have the data, runtimes, and rates of change; it's not too far-fetched for us to do this. My team recently expanded to include cloud/virtualization (and service providers), so if your ideas are in that realm, I can scale a little better than before!

Userlevel 5
Badge +9

 The smarts in a backup product doesn’t run the business, you align the backup product to the business and then provide guidance. :)

You got it! This is where the flexibility and intelligence of the product/platform come in.

 

 

So yes make sure you ask the VM Owners when they need backups to run at specific time.  For instance I only recently proposed a single Server Plan for a customer to protect their VM’s but when we dug deeper into what was being protected that was not possible because many of underlying servers were generating database dumps for warehousing and the schedules had to be staggered because they wanted to be able to recover those overnight dumps that were cleared daily.

 

There are other ways to accomplish this besides multiple plans:

You can leverage server groups and modify the "Job Start Time" (depending on your version). For example, I can add all my database servers that dump the DB locally after 8 PM, and since I know the dumps are completed by 11 PM, I can set the start time for all the machines in that server group to 11 PM.
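The same staggering idea can be expressed as data: give each server group a job start time that clears its dump window. Group names and times below are hypothetical:

```python
# Sketch: derive a per-server-group "Job Start Time" from when its local DB
# dumps finish, plus an optional safety margin. Illustrative only; in the
# product this is just a setting on the server group, not a script.
from datetime import datetime, timedelta

dump_finish = {            # hypothetical server groups -> dump completion time
    "db-dump-servers": "23:00",
    "app-servers": "20:00",
}

def job_start(finish_hhmm, margin_minutes=0):
    t = datetime.strptime(finish_hhmm, "%H:%M") + timedelta(minutes=margin_minutes)
    return t.strftime("%H:%M")

for group, finish in dump_finish.items():
    # Backups for each group start only after its dumps are done.
    print(group, "->", job_start(finish, margin_minutes=0))
```

The point is that one plan plus per-group start times replaces several near-identical plans.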


You could also apply a blackout window to the server group to accomplish the same thing (this can be used in older versions).

These settings also exist on VM groups, so these controls carry over to your virtualization protection.

IMO, I'd rather have fewer plans and use other methods to manipulate runtimes, simply because it's easier. Creating server groups and adding start times is a simple, low-impact process, while adding new plans and re-associating all the clients is a bit more involved.


Userlevel 4
Badge +10

Thanks @MFasulo, great work once again. Sometimes when you are in front of a grumpy customer who is watching and questioning your every click, it's easy to miss. Cheers

Userlevel 5
Badge +9


Great questions.   

#1: It depends. Remember what a plan encapsulates; what gets manipulated on the derived plan determines what you get as an end result. If you just derive a plan and change nothing, no net new objects are created; everything points to the original configuration, and all you have is a derived plan with a new name.

If you override the storage components, you will get a net new SP.  

If you override the RPO, you will get a new set of schedule policies in the backend.  

If you override "Folders to backup", you will get new subclient policies.

 

#2: Truth! We leverage a combination of strike count, priority, and time-series-based ML to determine completion times and decide who goes first.

#3: Also true, and that's why bigger buckets of objects help manage an environment. But just like #2, we have a dynamic tiering approach for VM backup distribution, access node assignment, and stream allocation. Dynamic VM backup distribution helps avoid hotspots, especially on storage. That's why, even with a big-bucket VM group, we continually (not just at the beginning of the job) load balance and dispatch across the infrastructure. There is also a multi-faceted assessment for choosing access node assignment that uses proximity to host, storage, VM networks, and even subnet to find optimal paths. We also dynamically assign/reassign streams to the VM backup, so regardless of whether a VM is single-disk or multi-disk, freed-up streams can be reallocated to speed up the protection operations.
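As a toy model of that stream reallocation idea (purely illustrative; the real dispatcher weighs far more factors):

```python
# Sketch: streams freed by finished VMs get redistributed to VMs still
# backing up, so remaining VMs speed up. Toy model of the reallocation
# concept described above, not the actual scheduler.

def allocate(streams_total, active_vms):
    """Evenly spread the available streams over VMs still running."""
    if not active_vms:
        return {}
    base, extra = divmod(streams_total, len(active_vms))
    return {vm: base + (1 if i < extra else 0) for i, vm in enumerate(active_vms)}

print(allocate(8, ["vm1", "vm2", "vm3", "vm4"]))  # 2 streams each
print(allocate(8, ["vm3", "vm4"]))                # vm1/vm2 finished -> 4 each
```

The takeaway is that total throughput stays pinned to the stream budget even as individual VMs complete.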

And for the folks who want the ultimate control over what is happening, you could always hardcode access nodes, create segmented VM groups, and control readers and streams as necessary. 


Userlevel 5
Badge +11

I read "plans are the future". OK, fine, I agree, but what about differential backups? For RTO reasons we take differential backups for some database types, but with plans this isn't possible as far as I know.

 

Userlevel 5
Badge +9


The plan's RPO will run diffs for SQL. Which database type do you need to run diffs for?

 

Badge +1

Adding to this thread….

 

Let's say a customer has been using old school policies since the beginning, has 5000+ VMs, and has at least 70-80 storage policies in place. Such a customer cannot take advantage of features available only from Command Center, e.g. VM snapshot indexing for Azure.

 

Does Commvault plan to introduce some kind of workflow/upgrade process that would automatically convert old school storage policies, schedule policies, etc. into plans? I think that would be very useful for customers like me who find it difficult to redesign everything from scratch (e.g. a new policy means a new full baseline backup, which in the cloud for 5 PB of data costs A LOT) :)

Userlevel 2
Badge +4

I believe there is an app in the Commvault Store that does the job: it converts the old school policies to plans.
 

https://documentation.commvault.com/commvault/v11/article?p=130123.htm

Userlevel 5
Badge +9

As Abdul mentioned, there is a tool that handles the conversion, so "legendary" customers can take advantage of the new constructs and methods. The app does take some time to run, but outside of a handful of edge cases the process should be pretty streamlined. Remember, plans are a logical container, so there is no need to worry about rebaselines and other cost-prohibitive activities.


Badge +5

Hi Onno,

 

This was a great catch for me about plans. Until now I thought that plans just combined storage policy, schedule policy, and subclient policy.

 

But today I learned there is some great magic behind them. Can you share any resources with more details about the full advantages of plans?

 

Thanks,

Mani

Userlevel 5
Badge +9

@Manikandn → https://documentation.commvault.com/11.22/essential/96314_configurations_for_rpo.html

Userlevel 3
Badge +7

Just marking this for a later read, as I probably need to get up to speed on this.

There is no way of marking a topic for later reading in here, is there?

//Henke
