Solved

Performance question - Best processor option for MediaAgents

  • 28 June 2022
  • 7 replies
  • 470 views

Badge +5

Hello!

I would like to ask about the best processor options for a MediaAgent.

Right now we have followed the MediaAgent hardware requirements from the documentation, so I have an XL MediaAgent with 16 CPU cores.

Backups run just fine except for one big database (more than 100 TB) that runs a synthetic full every two days; it takes about 9 hours to finish. The customer is happy, since the old backup solution would take days for the same result, but I would like to know whether that time could be improved.
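For scale, here is a quick back-of-the-envelope sketch of what those numbers work out to (purely illustrative arithmetic, assuming the full 100 TB is processed within the 9-hour window):

```python
# Rough effective processing rate of the synthetic full described above:
# ~100 TB consolidated in ~9 hours, spread over 16 cores.
data_gb = 100 * 1000          # 100 TB expressed in GB
window_s = 9 * 3600           # 9-hour window in seconds

gb_per_s = data_gb / window_s
print(f"~{gb_per_s:.1f} GB/s effective")              # ~3.1 GB/s
print(f"~{gb_per_s * 1000 / 16:.0f} MB/s per core")   # ~193 MB/s per core
```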

Performance monitoring tells me that the bottleneck is the MediaAgent’s CPU. All 16 cores run at 100% while the synthetic full backup is running. So… I would like to know the following (for this current setup and also for the new MediaAgents that will be added to this CommCell):

  • Will adding more cores help? For example, adding a second processor with 16 cores to the current MediaAgent (although that would double the recommended hardware requirements from the documentation).
  • Or would it be better to get a CPU with a higher frequency (same core count)?
  • Does it make a difference to the Commvault software whether MediaAgents are single-processor or dual-processor? For example: one 24-core CPU vs. two 12-core CPUs (same frequencies).
  • If we add more processing power, would more RAM (or anything else) be needed, even though right now it barely hits 35% usage at peak?

Thanks in advance for the help!

 

Sergio

 

 


Best answer by Onno van den Berg 1 July 2022, 00:17


7 replies

Userlevel 3
Badge +10

In my experience, MediaAgents typically are not processor-bound. Memory size and disk speed are much more important.

 

There is a sizing calculator out there on Maintenance Advantage that is actually pretty good.

Userlevel 7
Badge +19

Your high CPU usage can also be the result of disk I/O latency showing up as I/O wait. Have you checked the latency on the disk carrying the DDB? Also take note of the following KB article: https://kb.commvault.com/article/SYN0003
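If you want to separate real CPU work from I/O wait, a minimal Python sketch along these lines can help (uses the psutil library; the iowait field is Linux-only, as Windows does not expose that counter):

```python
import psutil

# Sample system-wide CPU time percentages over a 5-second window while
# the synthetic full is running.
cpu = psutil.cpu_times_percent(interval=5)
print(f"user={cpu.user}%  system={cpu.system}%  idle={cpu.idle}%")

# On Linux the result also carries 'iowait': time the CPUs sat waiting
# on disk I/O. A high value here points at the DDB disks, not the CPUs.
if hasattr(cpu, "iowait"):
    print(f"iowait={cpu.iowait}%")
```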

All in all, an interesting question: curious what the outcome is going to be.

Userlevel 1
Badge +5

Hi Sergio,

An XL MediaAgent with 16 CPU cores and 64 GB of RAM is good enough to handle 100 TB of DB data.

 

My suggestion is to check the DDB disks. If it is a physical MA, it is highly recommended that the DDB disks be internal SSDs. SSDs from shared storage mapped to the MA will also cause issues at later stages.

Thanks
 

Userlevel 7
Badge +23

@Sergio V - curious about the frequency of the synthetic full backups - why every two days for that big database?

Badge +5

Hi All,

Thanks for the answers. Some more info:

  • Sizing was calculated using Commvault’s calculator (actually, Commvault engineers helped with it).
  • DDB disks are internal SSD drives (RAID 1). Throughput is about 100 MB/s while synthetic fulls are running (a lot less than what the SSD drives support). I will check latency tomorrow when the next synthetic full runs (see the sketch after this list).
  • About the backup frequency: I just rechecked and it’s only Thursdays and Sundays for the 100 TB DB. However, the customer eventually wants to run them every other day (or every day if possible), because they want full backups ready for restore as soon as possible, without having to wait for Commvault to combine the last full and incrementals during the restore window itself.
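For the latency check, this is the kind of minimal sketch I have in mind (Python with the psutil library; "sda" is a hypothetical device name, substitute whatever disk backs the DDB volume):

```python
import time
import psutil

DEV = "sda"  # hypothetical device name -- use the disk backing the DDB

# Two snapshots of the per-disk I/O counters, 10 seconds apart, taken
# while the synthetic full is running.
a = psutil.disk_io_counters(perdisk=True)[DEV]
time.sleep(10)
b = psutil.disk_io_counters(perdisk=True)[DEV]

ops = (b.read_count - a.read_count) + (b.write_count - a.write_count)
busy_ms = (b.read_time - a.read_time) + (b.write_time - a.write_time)
mbytes = ((b.read_bytes - a.read_bytes) + (b.write_bytes - a.write_bytes)) / 1e6

if ops:
    # busy_ms / ops approximates the average service time per request,
    # similar to the 'await' column from iostat.
    print(f"{mbytes / 10:.1f} MB/s, {ops / 10:.0f} IOPS, ~{busy_ms / ops:.2f} ms/op")
```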

I’ll make sure to use better SSDs for our new MediaAgents. I still need to figure out what the processor configuration should be (stick with 16 cores per MA? go with more cores? single-processor or dual-processor?).

Regards,

Sergio

Userlevel 5
Badge +16

Have you considered differential backups in addition to your schedule rather than adding more fulls on a DB that large?

You have to realize that there are physical limits imposed by every link in the chain from network to storage.

In other words, do you even have enough bandwidth to move 100 TB of data in a reasonable time?
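As a rough illustration (simple arithmetic only, ignoring dedup savings and protocol overhead), here is what moving 100 TB end to end looks like at common link speeds:

```python
# Hours needed to move 100 TB at a given sustained line rate.
data_gb = 100 * 1000  # 100 TB in GB

for label, gbit in [("1 GbE", 1), ("10 GbE", 10), ("25 GbE", 25)]:
    gb_per_s = gbit / 8                # line rate in GB/s
    hours = data_gb / gb_per_s / 3600
    print(f"{label}: ~{hours:.0f} h")  # 1 GbE ~222 h, 10 GbE ~22 h, 25 GbE ~9 h
```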

Keep in mind that many of the speed measurements are synthetic, in the sense that they are based on the amount of data processed rather than the data actually moved.

That said, you should start with a deep analysis of what your underlying storage is doing; once again, the pressure is almost always on storage for things like backups and aux copies.

The operations I have seen bring a MediaAgent’s processor to its knees involve encryption.

 

You might also want to post what processes are taking up the most CPU under load.
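A minimal sketch of how you could capture that (Python with the psutil library; run it on the MediaAgent while the synthetic full is running):

```python
import time
import psutil

# Prime the per-process CPU counters; psutil's first reading is always 0.0.
for p in psutil.process_iter():
    try:
        p.cpu_percent(interval=None)
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        pass

time.sleep(5)  # measure over a 5-second window under load

snapshot = []
for p in psutil.process_iter(["name"]):
    try:
        snapshot.append((p.cpu_percent(interval=None), p.info["name"], p.pid))
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        pass

# Top 10 consumers; values can exceed 100% on multi-core machines.
for cpu, name, pid in sorted(snapshot, reverse=True)[:10]:
    print(f"{cpu:6.1f}%  {name} (pid {pid})")
```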

 

I’ll be honest: I barely read the initial post, as CPU-bound MediaAgent operations are just not something I have seen.

Userlevel 7
Badge +19

@christopherlecky he is creating synthetic full backups, so there is not a lot of network traffic involved.

@Sergio V even SSDs have their limits; most of the time the interface is the bottleneck. Moving to an NVMe PCIe card usually does the trick and reduces latency even further.

Reply