Question

Single Item Restore Performance

  • 7 February 2024
  • 3 replies
  • 73 views


Hi guys :)

Why is restore performance so bad for big single-item files like VMDKs (100-400 GB/hr)?

We got the information that restores (for a single item) only read from four disks simultaneously, and the disks are HDDs, so random-read throughput is slow by design. Support can't help because it's not considered an issue, and we have to live with it (if we want to stick with HSX). But that's not acceptable to our customers. Is there anything we can do?
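Quick back-of-envelope of why four HDDs would cap a single stream in that range. The per-disk MB/s figures below are assumptions for illustration, not measurements from our cluster:

```python
# Back-of-envelope: aggregate throughput of one restore stream that
# reads from only four HDDs at a time. Per-disk rates are assumed;
# heavily random reads on 7.2k LFF HDDs often land in the tens of MB/s.
DISKS_PER_STREAM = 4

def restore_rate_gb_per_hr(mb_per_s_per_disk: float) -> float:
    """Aggregate single-stream throughput in GB/hour."""
    total_mb_s = DISKS_PER_STREAM * mb_per_s_per_disk
    return total_mb_s * 3600 / 1024  # MB/s -> GB/hr

for per_disk in (10, 30, 50):  # assumed random-read MB/s per HDD
    print(f"{per_disk} MB/s/disk -> {restore_rate_gb_per_hr(per_disk):.0f} GB/hr")
```

Under this model our observed 100-400 GB/hr corresponds to roughly 7-28 MB/s per disk, which is plausible for heavily random HDD reads.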

Thx

Kr


3 replies


 @Fusi 

The performance will be bottlenecked by your hardware. Your hardware is HDDs, so, as you mention, reads/throughput are slow by design.

You can use this tool to validate your hardware's/mount path's performance: https://documentation.commvault.com/2023e/expert/validating_mount_path.html
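If you want a rough sanity check outside that tool, here's a minimal sketch in Python that times a large sequential read from a file on the mount path (TEST_FILE is a placeholder; the documented validation tool above is the supported method):

```python
import time

# Time a large sequential read from a file on the mount path.
# Use a test file larger than RAM (or drop the page cache first)
# so the OS cache doesn't inflate the result.
TEST_FILE = "/mnt/mountpath/testfile.bin"  # hypothetical path
CHUNK = 8 * 1024 * 1024  # 8 MiB per read

total = 0
start = time.monotonic()
with open(TEST_FILE, "rb", buffering=0) as f:
    while chunk := f.read(CHUNK):
        total += len(chunk)
elapsed = time.monotonic() - start
print(f"{total / 1024**2 / elapsed:.1f} MB/s sequential read")
```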


However, it would appear that support has already analysed the performance and validated it is the hardware? What is the case number?

If your customers aren’t happy with the outcome, have you established what is acceptable to them? How fast should it be? What are their RPOs? 

If your hardware can't meet these requirements, either the expectations need to be reset based on your current capability, or you need to investigate faster hardware to meet their requirements.

HTH

Regards.

Chris
 


 Hi @Fusi,

Missing some context here to provide a solid answer, but in a lot of cases, individual items or granular recoveries are single-stream restores. Quite often better performance is gained with multiple streams, not only when reading data but also when writing data to the application.
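As a generic illustration of the principle (a sketch, not Commvault's actual restore path): splitting a large read across concurrent streams lets each stream work its own region of the file, which keeps more spindles busy at once. PATH and the stream count are placeholders:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Each worker reads its own contiguous slice of the file concurrently,
# illustrating why multiple streams can beat one stream on the same data.
PATH = "restore_source.bin"  # hypothetical source file
STREAMS = 4
CHUNK = 4 * 1024 * 1024  # 4 MiB per read

def read_slice(start: int, length: int) -> int:
    """Read one slice of the file; return the number of bytes read."""
    done = 0
    with open(PATH, "rb") as f:
        f.seek(start)
        while done < length:
            data = f.read(min(CHUNK, length - done))
            if not data:
                break
            done += len(data)
    return done

size = os.path.getsize(PATH)
slice_len = -(-size // STREAMS)  # ceiling division
slices = [(i * slice_len, min(slice_len, size - i * slice_len))
          for i in range(STREAMS)]
with ThreadPoolExecutor(max_workers=STREAMS) as pool:
    total = sum(pool.map(lambda s: read_slice(*s), slices))
print(f"read {total} bytes across {STREAMS} streams")
```

On a striped store, several outstanding streams generally keep more disks busy than a single sequential reader can.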


Hi!

During the design phase, we got the following performance sheet from Commvault:


The two models tested are the following:

 

• N12 - cluster of three 2U, 12x14 TB LFF HDD servers

• N24 - cluster of three 2U, 24x8 TB LFF HDD servers

 

Commvault HyperScale X RA Performance

• Baseline backup: N12 11.7 TB/hour, N24 12.5 TB/hour. Assumes there is no deduplication in the data stream; the only storage optimization is 50% data compression. First full backups are considered baseline.

• Subsequent backup: N12 64 TB/hour, N24 100 TB/hour. Assumes a 2% daily change rate, daily incremental backups and a weekly full. Most of the performance gain is due to deduplication, eliminating more than 90% of the writes.

• Restore: N12 8.2 TB/hour, N24 11 TB/hour. Full restore of the system with the overwrite option turned on, ensuring all data gets rewritten back to the source system.

• Aux Copy: N12 12 TB/hour, N24 12 TB/hour. Full copy of data from the appliance to an external target.

• DASH Copy: N12 26.7 TB/hour, N24 26.7 TB/hour. Full copy of data from the appliance using deduplication.

• Synthetic Full: N12 26.1 TB/hour, N24 44 TB/hour. Creates a full backup from the latest full backup and subsequent incremental backups.

• VSA baseline backup: N12 7.8 TB/hour, N24 7.8 TB/hour. HyperScale nodes act as VSA proxies to perform the baseline backup of VMware VMs.

• VSA subsequent full: N12 18 TB/hour, N24 18 TB/hour. Assumes a 2% daily change rate, daily incremental backups and a weekly full. Most of the performance gain is due to deduplication, eliminating more than 90% of the writes.

• VSA-based VM restores: N12 5.2 TB/hour, N24 5.2 TB/hour. Full restore of VMware VMs.

 

> @Fusi
>
> The performance will be bottlenecked by your hardware. Your hardware is HDDs, so, as you mention, reads/throughput are slow by design.

According to the design sheet, the bottleneck for VM restores should be above 5.2 TB/hr. During the incident we got the information that a single stream writes/reads to/from only four HDDs, not all 24 in the system. That information was never communicated to us.
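To put numbers on the gap (simple arithmetic on the sheet values; the 2 TB VM size is just an example, not a customer figure):

```python
# Restore time for an example VM: sheet rate vs. what we measure.
vm_tb = 2.0                   # example VM size, not a customer figure
sheet_tb_hr = 5.2             # "VSA based VM restores" from the sheet
observed_tb_hr = (0.1, 0.4)   # the 100-400 GB/hr we actually see

print(f"sheet:    {vm_tb / sheet_tb_hr:.1f} h")                 # ~0.4 h
low, high = (vm_tb / r for r in (observed_tb_hr[1], observed_tb_hr[0]))
print(f"observed: {low:.0f}-{high:.0f} h")                      # 5-20 h
```

That is a gap of more than an order of magnitude against the sheet.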

> You can use this tool to validate your hardware's/mount path's performance: https://documentation.commvault.com/2023e/expert/validating_mount_path.html

Thanks, we already did enough testing.


> However, it would appear that support has already analysed the performance and validated it is the hardware? What is the case number?

Yes, “Commvault HyperScale™ X Validated Reference Design Performance”. Can you look into that case? I don't know if I can hand that out here because of detailed customer information (security).

> If your customers aren’t happy with the outcome, have you established what is acceptable to them? How fast should it be? What are their RPOs?

RPO is a scheduled backup every 15 minutes; that's not the problem. RTO is (luckily) not yet defined. If we can achieve >=1 TB/hr, I guess the customer would agree to that.
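For reference, here's what >=1 TB/hr would demand from a four-disk single stream (quick unit conversion; the four-disk figure is from the support statement quoted above):

```python
# Convert a 1 TB/hr target to MB/s, then split across the four HDDs
# that support says a single stream reads from.
target_tb_hr = 1.0
mb_per_s = target_tb_hr * 1024**2 / 3600  # TB/hr -> MB/s (binary units)
print(f"{mb_per_s:.0f} MB/s total, {mb_per_s / 4:.0f} MB/s per disk")
# ~291 MB/s total, ~73 MB/s per disk: feasible for sequential reads,
# optimistic for mostly random reads from a deduplicated store.
```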

> If your hardware can't meet these requirements, either the expectations need to be reset based on your current capability, or you need to investigate faster hardware to meet their requirements.

We can't change the hardware. It's already fully implemented, and the customer paid for it and expects the given values (or at least the 1 TB/hr).


 

> Hi @Fusi,
>
> Missing some context here to provide a solid answer, but in a lot of cases, individual items or granular recoveries are single-stream restores. Quite often better performance is gained with multiple streams, not only when reading data but also when writing data to the application.

What additional context would you need? Yes, but in our case we need fast single-stream restores, as VMDKs can't be backed up or restored with multiple streams.

 

 
