Solved

AWS EC2 snapshot backup copy performance

  • 16 June 2022
  • 8 replies
  • 315 views

Badge +8

We use IntelliSnap to create AWS EC2 snapshots and run a backup copy to move them to AWS S3. Backup copy throughput is only 50 GB per hour. I see that the backup job transport type is EBS-Direct-API.

 

We have not deployed EBS-Direct-API before and would need to spend time reviewing it internally. Is there any way to change the transport type to get the best performance?

Best answer by Mathew Ericson 21 June 2022, 19:19

Userlevel 7
Badge +19

@xiwen have you checked the resource utilization on your MAs? You should definitely stick with EBS-Direct-API, as this is the most efficient way of performing backup copies in AWS. It does not require access nodes to mount the VM disks to perform backup copies; instead, it requests the unique changed blocks directly from the EBS API and then copies those blocks directly to the backend storage.

It could be that the figures as presented are not accurate, so I would first check whether you are running into a system-wide bottleneck such as CPU or network interface congestion.

Which version are you currently running?

Badge +8

Currently I'm running 11.24.34. CPU usage is about 10%, memory usage is about 58%, and Ethernet usage is about 200 Kbps (send) and 20 Mbps (receive).
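To put the figures in this thread on one scale, here is a minimal sketch that converts between GB/hour and Mbps. It assumes decimal units (1 GB = 10^9 bytes); the function names are made up for illustration. Note that the job throughput and the NIC counters may be measured differently (for example, before versus after deduplication or compression), so a gap between them is not conclusive by itself.

```python
def gb_per_hour_to_mbps(gb_per_hour: float) -> float:
    """Convert GB/hour to megabits per second (decimal units assumed)."""
    bytes_per_second = gb_per_hour * 1e9 / 3600
    return bytes_per_second * 8 / 1e6


def mbps_to_gb_per_hour(mbps: float) -> float:
    """Convert megabits per second to GB/hour (decimal units assumed)."""
    bytes_per_second = mbps * 1e6 / 8
    return bytes_per_second * 3600 / 1e9


# Reported job throughput of 50 GB/hour, expressed as a line rate.
job_rate_mbps = gb_per_hour_to_mbps(50)

# Observed receive rate of 20 Mbps, expressed as GB/hour.
nic_rate_gb_per_hour = mbps_to_gb_per_hour(20)

print(f"50 GB/hour is about {job_rate_mbps:.1f} Mbps")
print(f"20 Mbps is about {nic_rate_gb_per_hour:.1f} GB/hour")
```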

Userlevel 7
Badge +19

Is it a single stream or is it the entire job that is showing a throughput of 50GB/hour? I assume you also checked your DDB latency?  

Have you checked the documentation and do you meet the requirements → https://documentation.commvault.com/2022e/expert/130078_ebs_direct_api_backups_for_amazon_web_services.html

EBS-Direct traffic is an HTTP stream, so also make sure you are not passing the traffic through a proxy.

Otherwise please open a ticket and let them check it out.

Userlevel 4
Badge +6

@xiwen  - what is the size of the Access Node performing the EBS direct API read or write?

Amazon publishes performance optimization notes on the EBS direct APIs and claims that 500 MB per second is achievable:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebsapi-performance.html

 

There are two factors

Badge +8

Currently I run EBS Direct API for one test server only. The EBS volume size is 120 GB; the actual data size is about 20 GB.

 

Q1: I don't see the PutSnapshotBlock permission in the Commvault documentation. Should we add it to the IAM role?

You can run API requests concurrently. Assuming PutSnapshotBlock latency is 100ms, then a thread can process 10 requests in one second. Furthermore, assuming your client application creates multiple threads and connections (for example, 100 connections), it can make 1000 (10 * 100) requests per second in total. This will correspond to a throughput of around 500 MB per second.

 https://documentation.commvault.com/11.26/expert/130078_ebs_direct_api_backups_for_amazon_web_services.html
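The arithmetic in the AWS passage quoted above can be checked with a short sketch. It assumes every request moves one full 512 KiB block (the block size used by the EBS direct APIs); the function names are hypothetical. The last call runs the calculation in reverse for the 50 GB/hour reported in this thread.

```python
BLOCK_SIZE_BYTES = 512 * 1024  # one EBS direct API block (512 KiB)


def estimated_throughput(latency_s: float, connections: int,
                         block_size: int = BLOCK_SIZE_BYTES):
    """Return (requests per second, bytes per second) for N parallel connections."""
    requests_per_connection = 1 / latency_s       # 100 ms latency gives 10 req/s
    total_requests = requests_per_connection * connections
    return total_requests, total_requests * block_size


def requests_needed(gb_per_hour: float,
                    block_size: int = BLOCK_SIZE_BYTES) -> float:
    """Requests per second needed to sustain a GB/hour rate (1 GB = 10**9 bytes)."""
    return gb_per_hour * 1e9 / 3600 / block_size


reqs_per_s, bytes_per_s = estimated_throughput(latency_s=0.1, connections=100)
print(f"{reqs_per_s:.0f} requests/s, {bytes_per_s / 2**20:.0f} MiB/s")

# Request rate implied by a 50 GB/hour job, under the same assumptions.
print(f"{requests_needed(50):.1f} requests/s")
```

Under these assumptions, 50 GB/hour works out to a few dozen requests per second, well under the default 1,000 per second quota, so a quota increase on its own may not be the first thing to try.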

Q2: I found the statement below at the link below. Which quota should be changed, GetSnapshotBlock or PutSnapshotBlock? What is the recommended value, and how do I calculate it?

The service quota for GetSnapshotBlock requests per account per Region is 1,000 per second by default. To increase the service quota limit, you must open a ticket with AWS.

 https://documentation.commvault.com/11.26/expert/130078_ebs_direct_api_backups_for_amazon_web_services.html

 

Userlevel 4
Badge +6

PutSnapshotBlock is required. This is a documentation bug, and I have logged an urgent fix request for it.

PutSnapshotBlock is the service quota to increase, but I would suggest reviewing your EC2 instance size first, as the instance size often artificially limits throughput because of the network credits allotted to it.

 

Please contact Commvault support, who can assist in providing the correct IAM policy.
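For orientation only, here is a rough sketch of what an IAM statement covering the EBS direct API actions might look like, built in Python and printed as JSON. This is not the Commvault-published policy, which covers many more permissions; the action names are the standard `ebs:` IAM actions, while the grouping, the function name, and the wildcard resource are assumptions to be tightened for your account.

```python
import json

# Standard IAM action names for the EBS direct APIs.
EBS_DIRECT_READ = [
    "ebs:ListSnapshotBlocks",
    "ebs:ListChangedBlocks",
    "ebs:GetSnapshotBlock",
]
EBS_DIRECT_WRITE = [
    "ebs:StartSnapshot",
    "ebs:PutSnapshotBlock",
    "ebs:CompleteSnapshot",
]


def ebs_direct_policy(resource: str = "arn:aws:ec2:*::snapshot/*") -> dict:
    """Build an illustrative policy document (hypothetical helper, not Commvault's)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": EBS_DIRECT_READ + EBS_DIRECT_WRITE,
                "Resource": resource,
            }
        ],
    }


policy = ebs_direct_policy()
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```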

Userlevel 7
Badge +19

@Mathew Ericson, looking at it from an MSP perspective: have you added a clear event notification during job processing to indicate that the software is hitting "walls" that require the end customer to open a ticket with AWS to have their limits increased?

Userlevel 4
Badge +6

Yes @Onno van den Berg, the vsbkp.log or vsrst.log will clearly indicate when a threshold is being encountered on the Amazon side. We don't modify the response from the AWS API endpoint; we log it in the job log as received.
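As a rough illustration of checking a job log for such responses, here is a minimal sketch that scans log lines for AWS throttling error names. The marker strings are assumptions about what could appear (`RequestThrottledException` is the throttling error documented for the EBS direct APIs; the other two are general AWS throttling codes), and the sample lines are invented, not real vsbkp.log output.

```python
# Error names that may indicate throttling (assumed markers, extend as needed).
THROTTLE_MARKERS = (
    "RequestThrottledException",
    "ThrottlingException",
    "RequestLimitExceeded",
)


def find_throttle_lines(lines):
    """Return (line number, text) pairs for lines mentioning a throttle marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(marker.lower() in lowered for marker in THROTTLE_MARKERS):
            hits.append((number, line.rstrip()))
    return hits


# Invented sample lines, only to show the function in use.
sample_log = [
    "job 101 reading snapshot blocks\n",
    "job 101 error: RequestThrottledException from endpoint\n",
    "job 101 retrying request\n",
]

throttle_hits = find_throttle_lines(sample_log)
for number, text in throttle_hits:
    print(f"line {number}: {text}")
```

To run it against a real log, pass an open file object instead of the sample list; the location of vsbkp.log depends on your installation.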
