Question

AWS S3: Please reduce your request rate

  • 15 August 2023
  • 2 replies
  • 1746 views

Userlevel 1
Badge +4

We see errors like the one below. The current workaround from Commvault support is to limit the number of parallel tasks.

30858 5816 08/09 07:07:48 24447881 [cvd] CloseFile() -xxxxxx-eu-central-1/GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706210/CHUNK_339211994, mode = WRITE, error = Please reduce your request rate.

 

I also contacted AWS support; their answer was very detailed and explained how to improve the situation on the software side:
 


As you may already know, S3 has the capability to automatically scale its capacity as request rates increase for each prefix. In most scenarios, your application can potentially achieve up to 3,500 PUT/COPY/POST/DELETE requests per second per prefix. Due to this, there are no adjustments that can be made to your requests. However, to fully utilize S3's automatic capacity expansion, your workload should consistently process 3,500 or more requests per second, rather than occasional bursts of 3,500 requests per second. Furthermore, since S3's request limits are linked to prefixes, it is worth considering a restructuring of your key naming scheme. This is important because S3 follows a flat structure without hierarchical organization; folders are used primarily for organizational purposes. As an example, we can see that the throttling occurred on the file names provided. Based on the file names, it is likely that the request limit is shared up to the CV_MAGNETIC prefix.

GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706210/CHUNK_339211994

GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706146/CHUNK_339232742

GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706211/CHUNK_339232756

Instead, it might be beneficial to avoid sequential prefixes to enable additional layers of S3 performance. This is done by adding a three- or four-character hashed prefix at the beginning. This will ensure unshared S3 performance for each file.

ba65-GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706210/CHUNK_339211994

9810-GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706146/CHUNK_339232742

7b54-GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706211/CHUNK_339232756

The unique hashes at the beginning segregate the files into distinct partitions within S3, resulting in separate request limits for each file. Keep in mind that this approach might complicate file sorting and organization. To summarize, consider revising your key naming scheme to accommodate unshared S3 request limits. Note also that including more unique hashes in the naming scheme will result in a greater number of partition levels; for additional partition levels, the prefix does not have to start or end with a /.
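
To make the hashed-prefix idea concrete, here is a rough Python sketch of how such a key prefix could be derived. The choice of MD5 and a four-character prefix is only an assumption for illustration; it is not something AWS or Commvault prescribes, and the resulting hashes will not match the example values above.

import hashlib

def hashed_key(original_key: str, prefix_len: int = 4) -> str:
    # Prepend a short hash of the key so objects spread across distinct S3 prefixes.
    digest = hashlib.md5(original_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{original_key}"

for key in (
    "GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706210/CHUNK_339211994",
    "GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706146/CHUNK_339232742",
    "GX4JNW_06.08.2020_05.24/CV_MAGNETIC/V_4706211/CHUNK_339232756",
):
    print(hashed_key(key))  # prints the key with a 4-character hash prepended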

 

Is this something Commvault is aware of? Are any enhancements planned? I'm not sure whether AWS support's suggestion makes sense here, but we see this throttling a lot.


2 replies

Userlevel 7
Badge +23

Hi Pirx,

Many years ago we had guidance around randomizing folder names to achieve better performance. Enhancements in AWS and Commvault later made that unnecessary, so we removed the guidance, explicitly so in the architecture guide.

I have not come across customers hitting an API rate limit like this before. That said, you can add multiple mount paths to a cloud library, which gives you exactly what AWS is suggesting: each mount path can have a different base folder name, so requests are split across them.

It might not help immediately, since all your existing data is under that one path, but it should help going forward. As mentioned, I have not seen an S3 request rate limit hit like this before, so I suspect something else may be going on that is causing it.
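
Purely to illustrate the prefix side of the multiple-mount-path suggestion, here is a small Python sketch; the folder names are made up and are not how Commvault actually names anything. Each mount path writes under its own base folder, so each base folder becomes its own rate-limited S3 prefix instead of everything sharing one.

from collections import Counter

# Hypothetical object keys written under two different mount-path base folders.
keys = [
    "MOUNTPATH_A/CV_MAGNETIC/V_0001/CHUNK_0001",
    "MOUNTPATH_A/CV_MAGNETIC/V_0002/CHUNK_0002",
    "MOUNTPATH_B/CV_MAGNETIC/V_0003/CHUNK_0003",
    "MOUNTPATH_B/CV_MAGNETIC/V_0004/CHUNK_0004",
]

# Group requests by their top-level folder: with one mount path everything
# shares a single rate-limited prefix, with two the load is split in half.
requests_per_prefix = Counter(key.split("/", 1)[0] for key in keys)
print(requests_per_prefix)  # Counter({'MOUNTPATH_A': 2, 'MOUNTPATH_B': 2})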

Userlevel 7
Badge +19

I assume you could clearly see from the job controller that jobs are impacted?

In the past it was indeed a best practice to create multiple buckets to increase performance and mitigate the effects of throttling by AWS. However, AWS addressed the underlying bottlenecks and removed that best practice from the records. Basically, we are now hitting a side effect within Commvault, because it approaches a cloud library in a similar way to a disk library as far as the object structure is concerned. The problem is that Commvault creates only one root folder and builds up its structure from there. As long as Commvault does not change this, you might want to implement a cloud library with multiple mount paths right from the start.
