
Synthetic Full vs Full Backup



Hi,

I was wondering how people schedule their backups (for example, file system backups).

Do you only run incrementals and synthetic fulls, or do you sometimes run a “real” full?

For example: one “real” full a month, daily incrementals, and a weekly synthetic full.

I’ve always heard never to use “only” incrementals and synthetic fulls, and to slip in a “real” full backup once in a while. What would be the reasoning behind that?

Looking forward to hearing your feedback.



@Jeremy 

The first backup will always be a full backup, since it scans all the files and backs them up.

 

A full backup contains all the data in the subclient content. If a client computer has multiple agents installed, then the subclients of each agent require a full backup in order to secure all of the data on that client. Backups can also be performed at the backup set or instance level, in which case they apply to all of the subclients within the selected backup set or instance.

An incremental backup contains only data that is new or has changed since the last backup, regardless of the type. On average, incremental backups consume far less media and place less of a burden on resources than full backups.

Synthetic full backups consolidate the data from the latest full backup or synthetic full backup together with any subsequent incremental backups, instead of reading and backing up data directly from the client computer. Since synthetic full backups do not back up data from the client computer, this operation imposes no load on the client computer.

A synthetic full will also save you disk space.

Synthetic full backups do not back up data from the client computer directly; instead, they use the list of the latest objects from the previous backups to build a new backup image. In other words, a synthetic full incorporates the latest full or synthetic full together with any incremental backups that followed. Because it does not read data directly from the client computers, it decreases the load on your production environment.

 

A full backup is the baseline, or starting point, for a client: to run an incremental backup, a full must have been performed first. This backup type contains all the data in the subclient and is read directly from the client computer.
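
To make that concrete, here is a minimal, hypothetical Python sketch of the consolidation idea (the data structures are made up; Commvault’s real implementation is index- and DDB-based):

```python
# Hypothetical sketch of synthetic full consolidation. Each backup
# image maps file path -> object reference (e.g. a dedup block list);
# no client data is read, we only combine references.

def synthesize_full(last_full: dict, incrementals: list[dict]) -> dict:
    """Merge the latest full with the incrementals that followed it."""
    synthetic = dict(last_full)              # start from the baseline full
    for inc in incrementals:                 # apply incrementals in order
        for path, ref in inc.items():
            if ref is None:                  # file deleted in this cycle
                synthetic.pop(path, None)
            else:                            # new or changed file wins
                synthetic[path] = ref
    return synthetic                         # the new "full" image

full = {"/etc/hosts": "blk1", "/var/app.log": "blk2"}
inc1 = {"/var/app.log": "blk3"}                       # changed file
inc2 = {"/home/new.txt": "blk4", "/etc/hosts": None}  # added + deleted
print(synthesize_full(full, [inc1, inc2]))
# {'/var/app.log': 'blk3', '/home/new.txt': 'blk4'}
```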

https://documentation.commvault.com/11.24/expert/11694_synthetic_full_backups.html#advantages-of-synthetic-full-backups-over-full-backups

 

Advantages of Synthetic Full Backups Over Full Backups

Synthetic full backups have the following advantages over full backups:

  • They impose a lighter load on the production environment because they are created on the backup repository.

  • They have the ability to carry forward older or deleted versions of the objects backed up during the previous backup cycles.



Hello @Navneet Singh 

Thanks for this info! I already knew most of it 🙂. What I actually wanted to know is whether it’s best practice to first run a “full” backup (to create a baseline) and afterwards run only incremental and synthetic full backups. Will this pose issues down the line? I’m talking years into the future.

The reason I’m asking is that I’m having an issue with one of my file system backups where incremental backups take ages to complete, yet are only a couple of GB in size and contain only about 90,000 files, which isn’t that much. We ran a “real” full backup to create the baseline months ago, but since then have run only incremental and synthetic full backups. The scan phase of the incremental takes about 95% of the total duration of the backup, and 90% of the load is DDB lookups. I’m wondering if the incremental/synthetic full schedule is the reason behind it. We might need to trigger a new real full backup, but I wanted to hear from the community first.



@Jeremy 

The first backup should be a FULL backup; after that you can run only synthetic fulls and incrementals.

It’s not mandatory to run a FULL backup monthly.



Hello @Jeremy,

 

If you create a plan, there is no notion of running a “regular” full anymore; you only run incrementals and synthetic fulls (DASH fulls, when deduplication is enabled).

 

This pattern has been the main configuration in Commvault for years now, and it should indeed impose the least amount of stress on the source system.

 

If your incrementals are running long, or unexpectedly long, I’d suggest investigating whether the file system journal is working as expected. Commvault uses the journal to find the files changed since the last protection job; if that fails, it falls back to a CRC or recursive scan across all files to determine which ones changed.

As you can imagine, the latter is a long and annoying process.
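
To give a feel for why that fallback hurts, here is a rough sketch (plain Python, not Commvault’s actual scanner) of a recursive scan; every file has to be stat’ed, so the cost scales with the total file count rather than with the number of changes:

```python
import os

def recursive_scan(root: str, last_backup_time: float) -> list[str]:
    """Fallback change detection: walk the whole tree and stat every
    file, keeping those modified since the last backup. This is O(total
    files) even if almost nothing changed, unlike a journal, which
    hands you only the changed entries."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_mtime > last_backup_time:
                    changed.append(path)
            except OSError:
                pass  # file vanished mid-scan; skip it
    return changed
```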

 

If you suspect there is an issue or cannot find a reason, then the best course is to ask Commvault support to look into it. They have seen lots of similar cases and have a huge collection of previous cases they can search through for possible causes.

 

Coming back to regular fulls versus synthetic fulls and removing the burden pushed onto client systems: there is another reason not to want a regular full, but only if you use advanced features like “keep last X versions of files”.

Those advanced features only work correctly with synthetic fulls (which carry versions forward) and are reset by a regular full, which might not be what you want.
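
As a hypothetical illustration of that difference (made-up structures, not Commvault internals), compare how version history survives under each backup type:

```python
# Toy model of "keep last X versions": a synthetic full carries old
# versions forward, a regular full only sees what is on disk right now.
KEEP_LAST = 3

def synthetic_versions(previous: dict, incremental: dict) -> dict:
    """Synthetic full: carry history forward, append new versions,
    and trim each file to its last KEEP_LAST versions."""
    merged = {path: list(versions) for path, versions in previous.items()}
    for path, version in incremental.items():
        merged.setdefault(path, []).append(version)
        merged[path] = merged[path][-KEEP_LAST:]
    return merged

def regular_full_versions(client_state: dict) -> dict:
    """Regular full: only the current on-disk version survives, so
    history carried by earlier cycles is effectively reset."""
    return {path: [version] for path, version in client_state.items()}

prev = {"report.doc": ["v1", "v2", "v3"]}
print(synthetic_versions(prev, {"report.doc": "v4"}))
# {'report.doc': ['v2', 'v3', 'v4']}  -- history carried forward
print(regular_full_versions({"report.doc": "v4"}))
# {'report.doc': ['v4']}              -- history reset
```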

 

Hope this helps.

 

Regards,

Mike


Jeremy wrote:

Thanks for this info! I already knew most of it 🙂. [...] We might need to trigger a new real full backup, but I wanted to hear from the community first.

Jeremy,

I am not sure about “Commvault best practices”, but running only incremental and synthetic full backups is “my best practice” in parts of my environment.

All of my laptop clients run incremental and synthetic full backups, and have for over 5 years. I haven’t run into any issues backing up or restoring data on a laptop client unless there was something wrong with the client’s file system or network connection. Each laptop gets only one initial full; after that it is a daily incremental and a weekly synthetic full.

Not only does this option take hardware resource load off the client, it also works best in situations where the client is roaming, such as a laptop that may end up at a location with a slow internet connection. Trying to run a full backup of a client somewhere with poor internet is not ideal for speed, and sometimes not for data-cap reasons either.

I have never tested server clients with synthetic full backups, so I can’t speak to the performance and reliability of a single-file or bare-metal restore there. I wouldn’t expect a problem as long as the whole backup chain is free from any kind of corruption.



Reviving this topic, I would like to challenge it a bit more when it comes to cloud storage tiers and backup operations like synthetic fulls, data verification, and so on.
I still do not agree with using a hot tier for the primary copy: it is very expensive, and if your organization does not get many restore requests and the DR/recovery plan is not yet well structured, you have simply paid the most to store your backups.
However, if you move it to a cooler tier, either Cool or Cold (I believe this is Azure terminology), synthetic full backups will greatly increase the data retrieval costs, which can end up as high as simply storing the data in the hot tier.
My question is: is the synthetic full really so much better than a normal full backup in terms of space savings, deduplication, and so on? Or, if I have a good network infrastructure that can ensure high throughput, can I simply use the traditional full and avoid the data retrieval costs?
Here, I’m talking mostly about VSA backups, via the virtualization client.

Thanks in advance, looking forward to hearing from you.



Hello @Thiago Marcolino,

 

I’m not sure where the “a lot of data retrieval costs” comes from, as a synthetic full with deduplication (a DASH full) only reads metadata from storage to create a synthesized full referencing the same unique blocks (it increases the secondary count in the deduplication tables). The other part of the operation is index-based and should run on your MediaAgents.
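
To picture why that is cheap, here is a toy model (made-up structures, not the actual DDB format) of what “increasing the secondary count” means; the synthetic full only adds references to blocks that already exist:

```python
# Toy deduplicated store: blocks are written once, and a backup image
# is just a list of block references. A DASH full bumps reference
# counts; no block data is read back or rewritten.
from collections import Counter

block_store = {"h1": b"...", "h2": b"..."}   # unique blocks, stored once
ref_count = Counter({"h1": 1, "h2": 1})      # one earlier backup references them

def dash_full(referenced_hashes: list[str]) -> list[str]:
    """Create a new 'full' image by referencing existing blocks."""
    for h in referenced_hashes:
        assert h in block_store               # block must already exist
        ref_count[h] += 1                     # metadata-only update
    return list(referenced_hashes)            # the new image

new_image = dash_full(["h1", "h2"])
print(ref_count)  # Counter({'h1': 2, 'h2': 2}) -- no data movement
```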

 

I do see your remark about costs in a certain way, and it is why Commvault has reduced how often a synthetic full is required to run. It now depends on a few factors it calculates, like retention, consolidation need, etc. If you have a weekly selective copy of your data, that will force a synthetic full every 7 or so days to comply with it; but if you do not have such requirements, and IF you follow the default plan layout, you will see that the synthetic full does not run that often.

 

Secondly, I think Commvault has stopped recommending hot tiers for this specific reason. The only reason they would still recommend one is if you have daily auxiliary copies (data reads) OR frequent restore requirements. As you said that is not the case, so I believe the recommendation these days, mentioned as best practice, is to use a cool tier.

 

You can read up on it here https://documentation.commvault.com/2024e/expert/cloud_storage_building_block_guide.html.

 

Keep in mind they use 30 days of retention as the balance point between hot and cool, but this is just a recommendation. That said, it is based on the fact that, with a retention of 30 days or less, the interactions with the cool storage tier add up to roughly the same cost as having a hot tier.
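
As a back-of-the-envelope illustration of that balance point (the prices and volumes below are made-up placeholders, not real Azure rates), the comparison looks something like this:

```python
# Rough hot-vs-cool cost comparison. All numbers are illustrative
# placeholders; plug in your region's actual rates and read volumes.
stored_gb         = 10_000   # size of the primary copy
read_gb_monthly   = 2_000    # data read back (aux copies, restores, ...)

hot_storage_gb    = 0.020    # $/GB/month (assumed)
cool_storage_gb   = 0.010    # $/GB/month (assumed)
cool_retrieval_gb = 0.010    # $/GB read from the cool tier (assumed)

hot_cost  = stored_gb * hot_storage_gb
cool_cost = stored_gb * cool_storage_gb + read_gb_monthly * cool_retrieval_gb

print(f"hot:  ${hot_cost:,.2f}/month")   # hot:  $200.00/month
print(f"cool: ${cool_cost:,.2f}/month")  # cool: $120.00/month
# With heavy reads (daily aux copies, frequent restores) read_gb_monthly
# grows, and retrieval charges can erase the cool tier's storage savings.
```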

 

Good luck!



@mikevg thanks for the quick reply.

I'm really surprised, not to say disappointed, that the reason for such high data retrieval costs is something very “stupid”: the primary copy in the storage policy that backs up the Azure workloads has deduplication DISABLED! Therefore, instead of just reading the metadata as you explained, the synthetic full seems to read the whole backup set created by the previous incrementals. That means every backup is read twice: first when running the synthetic full on the primary copy, and again when running the selective copy to the secondary storage account (and on that copy dedup IS enabled).
I may have to review the details one more time, but this seems to be the root cause.

Thanks in advance.



Hello @Thiago Marcolino,

 

That sounds like a very nice find, and it explains your cost situation. It would be interesting to know how you got to this point.

 

You mention creating a selective copy; may I ask where you are sending it? If you use, for example, ZRS, the data is redundant within your region, whereas with GRS it is redundant in your region and in another region. In the past Microsoft dictated when a failover would happen, but you can now use customer-initiated failover when and if necessary. This might greatly reduce your cost footprint as well.

 

Curious to hear more about your setup, and to see if we can improve it further!


Regards,
Mike



