Solved

Backing up data again on Linux file server after temp move

  • 16 February 2024
  • 4 replies
  • 50 views


The goal on our Linux file server is to replace the current /project1/ directory with a new /project1/ directory containing all the same data, but with the data stored on different back-end storage. All of this data is backed up by Commvault with the Linux File Server Agent; we are not doing block-level backups.

However, when running a test, we discovered that Commvault will back up the data again after it's moved back to the new /project1/test/ directory, even though the data is exactly the same (same file sizes, modified dates, etc.).

I populated a folder on the file server, /project1/test, and created a Commvault sub-client for only that directory. I ran a manual full backup, which backed up 344 MB of data.

Then I ran the following commands (as root on the Linux file server):

rsync -avHxAPS /project1/test/ ~user/temp/

Removed /project1/test/ using:

rm -fr /project1/test/

Synced the data back from ~user/temp/:

rsync -avHxAPS ~user/temp/ /project1/test/

Reinitialized the XFS project quota:

xfs_quota -x -c 'project -s test' /project1

All the mtimes, file sizes, etc. on the replacement data are the same as the original. I then ran an incremental job on the same sub-client; that job also backed up 344 MB of data. I then ran another incremental job, and it backed up 0 bytes.

We had hoped the first incremental would also back up 0 bytes, because even though the data was temporarily housed in a temp location, it was rsynced from and back to the same directory and nothing had changed.

Is this expected behavior? Any suggestions on how to test this differently before we do this with 200TB of data? Or is there a way we can finesse this so Commvault doesn’t think it needs to back this data up again?


Best answer by SparshGupta 19 February 2024, 08:59


4 replies


Hi @rebajs 

 

When rsync is used to copy folder contents, the ctime (change time) of all the files in the destination folder is updated to the current timestamp, reflecting the time of the copy operation.

 

By default, our system qualifies files for backup based on changes to either their mtime (modification time) or their ctime (change time). If you prefer not to consider ctime changes for qualification, you can deselect that option in the subclient properties:

 

https://documentation.commvault.com/2023e/expert/subclient_properties_general_08.html

It's important to note that disabling the ctime rule for qualification has several disadvantages:

1. By excluding ctime from the qualification criteria, you lose granularity in tracking file changes. Certain changes to files, such as metadata-only modifications (ownership, permission, or ACL changes), will no longer qualify for incremental backups.

 

2. Disabling ctime for backups can make backup behavior inconsistent with that of the underlying file system. This could confuse users and administrators who expect backup operations to mirror file system changes accurately.
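
The first point can be illustrated directly: a chmod updates only ctime, so with the ctime rule disabled the change would be invisible to an mtime-only check (a minimal sketch using a throwaway /tmp file; Commvault itself is not involved):

```shell
# A metadata-only change (here chmod) updates ctime but not mtime.
f=/tmp/ctime-rule-demo
echo data > "$f"
sleep 1   # ensure the timestamps can actually differ (1 s resolution)

mtime_before=$(stat -c %Y "$f")
ctime_before=$(stat -c %Z "$f")

chmod 600 "$f"   # permission change: ctime moves, mtime does not

mtime_after=$(stat -c %Y "$f")
ctime_after=$(stat -c %Z "$f")

[ "$mtime_before" -eq "$mtime_after" ] && echo "mtime unchanged"
[ "$ctime_after" -gt "$ctime_before" ] && echo "ctime updated"
```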

 

Thanks,

Sparsh


Thank you, this is very helpful! We’ll discuss and determine if we want to move forward, or change how we’re doing this. Cheers!


Another quick question:

If we uncheck the “UNIX ctime” box now, move data around using rsync as needed, and re-check the ctime box at a later date, will Commvault back up the data then (assuming no changes to the data except the ctime)? I would assume so, given that the ctime will reflect the date of the rsync instead of when it was originally backed up. If it does back it up with the new ctime, can we expect it to mostly/fully deduplicate? We’re moving around 170TB of data, so it’s not a trivial amount of storage if Commvault backs up the data because of the new ctime and doesn’t fully deduplicate it. 


Hi @rebajs 

 

Enabling the ctime property later will not trigger a backup of the entire dataset. Instead, it will only include files whose mtime or ctime has changed since the completion of the last incremental backup.

To confirm this, you can quickly run a job against a few files and verify the results.

 
