Disk library and physical Media Agent change as disaster recovery.

  • 24 May 2022
  • 1 reply


Hi all!

My company uses six MAs to create and store backups. One MA has separate storage for off-site long-term retention, and another creates local backups at a branch office site. On the main site there are four MAs in two two-node grids: MA1 and MA2 form one grid, MA3 and MA4 form the other. They share their libraries and DDBs.
Local backups from the branch office are copied to the main site, and main-site backups are copied to the long-term site as DR backups. The MAs are physical on the main site and virtual on the others, and disk storage is used on all sites.

We are currently planning to replace the disk storage and the physical MAs on the main site. Of course, this is also a good chance to upgrade the OS on the MAs from Win2012R2 to Win2019. During the process, the library content has to be moved from the old disk storage to the new one, and the DDBs from the old MAs to the new ones. Each MA stores 40 - 60 TB of backup data, and I would like to do this with minimum downtime.
I have found descriptions of the library move and DDB move processes between MAs, but after those steps the Subclients and Storage Policy copies have to be reconfigured for the new MA. Therefore, I'm planning to replace the MAs as if it were an MA disaster recovery. That means the following steps:

- Install the OS on the new physical server with a new name and IP.
- Configure the disks (library and DDB) as they are on the old MA (library disks from the new storage, with the shares also created).
- Copy the content of the libraries, DDB and Index Cache from the old MA to the new disks on the new MA. Most of this can be done with RoboCopy while the MA is in operation; RoboCopy can be run several times to pick up changes while the CommServe keeps working.
- Disable All Job Activity and the Scheduler on the CommServe. Let all running jobs finish.
- Stop the Commvault services on the old MA.
- Run RoboCopy once more to transfer the last changes to the new MA. Check the RoboCopy logs to make sure every modification was transferred.
- Shut down the old MA.
- Rename the new MA to the old MA's name and change its IP.
- After the new MA has restarted, install the MA components remotely from the CommCell Console.
- After installation, check readiness and DDB status.
- Enable All Job Activity and the Scheduler on the CommServe.
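
To make the repeated RoboCopy passes concrete, here is a minimal sketch of how one pass could be scripted. It is only an illustration, not a Commvault procedure: the function names, the thread count and the retry values are my own assumptions. The RoboCopy switches themselves (/MIR, /COPYALL, /DCOPY:T, /R, /W, /MT, /NP, /LOG+) are standard, and the exit code is a set of bit flags where 8 means some copies failed and 16 means a fatal error.

```python
import subprocess

# Robocopy exit codes are bit flags. These two bits mean the pass is incomplete.
FAILED_COPIES = 8
FATAL_ERROR = 16


def build_robocopy_cmd(src, dst, log_path, threads=16):
    """Build the argument list for one mirror pass (hypothetical helper)."""
    return [
        "robocopy", src, dst,
        "/MIR",            # mirror the tree, including deletions on the target
        "/COPYALL",        # copy data, attributes, timestamps, ACLs, owner, auditing
        "/DCOPY:T",        # keep directory timestamps
        "/R:2", "/W:5",    # assumed values: 2 retries, 5 seconds between retries
        f"/MT:{threads}",  # multithreaded copy
        "/NP",             # no per-file progress in the log
        f"/LOG+:{log_path}",  # append to the log so every pass is kept
    ]


def pass_succeeded(exit_code):
    """True when neither the 'failed copies' nor the 'fatal error' bit is set."""
    return (exit_code & (FAILED_COPIES | FATAL_ERROR)) == 0


def run_pass(src, dst, log_path):
    """Run one pass on the new MA and report whether it completed cleanly."""
    result = subprocess.run(build_robocopy_cmd(src, dst, log_path))
    return pass_succeeded(result.returncode)
```

A call such as `run_pass(r"\\OLDMA\Lib01", r"E:\Lib01", r"C:\Logs\lib01.log")` (share and paths are placeholders) would be repeated while the old MA is in operation and then once more after its services are stopped. Because /MIR deletes files on the target that no longer exist on the source, the target must be a dedicated, otherwise empty path.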

I have built a test environment with two MAs in a grid and a third MA for aux copies. Backups and copies were run for a week, and then the steps above were carried out. The new MA operates without any issue. Only the Index Cache had to be reconfigured, because it was reset to the default location by the MA installation. So I think the process described above can work, but I would like to avoid hidden traps.
If something goes wrong before activity is enabled on the CommServe, the old MA can hopefully be restarted, but I would like to be as prepared as possible.
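
As one extra check before enabling activity, the old and new folders could be compared independently of the RoboCopy log. The following is a rough sketch under my own assumptions: the function names are hypothetical, and it compares only relative paths and file sizes, so it will not detect files that differ in content but have the same size.

```python
import os


def snapshot(root):
    """Map each file's path relative to root to its size in bytes."""
    files = {}
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            files[os.path.relpath(full, root)] = os.path.getsize(full)
    return files


def compare_trees(old_root, new_root):
    """Return (missing, extra, size_mismatch) lists of relative paths."""
    old, new = snapshot(old_root), snapshot(new_root)
    missing = sorted(set(old) - set(new))   # on the old MA only
    extra = sorted(set(new) - set(old))     # on the new MA only
    size_mismatch = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return missing, extra, size_mismatch
```

Three empty lists after the final pass would suggest that nothing was missed; anything else is a reason to run another pass or to fall back to the old MA.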

What do you think about this idea?

1 reply

Looks solid to me. Since you are copying rather than moving the data (and keeping the same names and IP addresses), you can always fall back to bringing the current MA back online if anything fails. Your process looks sound to me!