I am troubleshooting a low read throughput issue during recovery. Need some clarification:
When a recovery job is initiated in Commvault for VMs (VSA backups) or databases (streaming backups), does it perform sequential reads or random reads?
For a disk library, which disk policy settings are suitable for better read and write throughput?
Disk Caching: Enabled
Read Policy: Read Ahead
Write Policy: Write Back
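As a reference point, one rough way to compare raw sequential vs. random read speed of the library mount itself is a small scratch-file test such as the Python sketch below. This is only a sketch; the mount path (/mnt/disklib), file size and block size are assumed placeholders, and the page cache should be dropped between runs (echo 3 > /proc/sys/vm/drop_caches) for honest numbers.

    # Rough sequential vs. random read comparison against a scratch file on
    # the disk library mount. Paths and sizes are placeholders; adjust them.
    import os, random, time

    PATH = "/mnt/disklib/readtest.bin"   # assumed disk library mount point
    FILE_SIZE = 1 * 1024**3              # 1 GiB scratch file
    BLOCK = 1 * 1024**2                  # 1 MiB per read

    # Create the scratch file once with real (non-sparse) data.
    if not os.path.exists(PATH):
        with open(PATH, "wb") as f:
            chunk = os.urandom(BLOCK)
            for _ in range(FILE_SIZE // BLOCK):
                f.write(chunk)

    def read_throughput(offsets):
        # Drop the page cache before each run, otherwise the second pass is
        # served from memory instead of the disk library.
        fd = os.open(PATH, os.O_RDONLY)
        start, total = time.time(), 0
        try:
            for off in offsets:
                os.lseek(fd, off, os.SEEK_SET)
                total += len(os.read(fd, BLOCK))
        finally:
            os.close(fd)
        return total / 1024**2 / (time.time() - start)   # MiB/s

    seq = [i * BLOCK for i in range(FILE_SIZE // BLOCK)]
    rnd = seq[:]
    random.shuffle(rnd)

    print("sequential read: %.1f MiB/s" % read_throughput(seq))
    print("random read    : %.1f MiB/s" % read_throughput(rnd))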
Best answer by Onno van den Berg
I personally would set it to Write Through, because once the first full backup has landed you will benefit from deduplication, resulting in far fewer writes. Doing so makes sure the data actually lands on the disks, which is somewhat safer. Yes, it is somewhat slower, but on backup I do not care so much because I prefer data integrity over write performance. In the end it all comes back to reads, because once you have to recover data you want it back as fast as possible!
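As a rough analogy only (this is not Commvault code): buffered writes behave like write-back and O_SYNC writes like write-through. A small Python sketch on a Linux machine shows the throughput versus durability trade-off, with the /tmp file names, block count and size being arbitrary placeholders.

    # Toy illustration of the write-back vs. write-through trade-off at the
    # OS level. Buffered writes return once the data is in the page cache
    # (write-back-like); O_SYNC writes return only after the data has reached
    # the device (write-through-like), which is slower but safer on power loss.
    import os, time

    def timed_writes(path, flags, count=200, size=1 * 1024**2):
        # Write `count` blocks of `size` bytes and return MiB/s.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | flags, 0o644)
        buf = os.urandom(size)
        start = time.time()
        try:
            for _ in range(count):
                os.write(fd, buf)
        finally:
            os.close(fd)
        return count * size / 1024**2 / (time.time() - start)

    print("buffered (write-back-like) : %.1f MiB/s" % timed_writes("/tmp/wb_test.bin", 0))
    print("O_SYNC (write-through-like): %.1f MiB/s" % timed_writes("/tmp/wt_test.bin", os.O_SYNC))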
We’re happy to provide additional assistance with troubleshooting the performance issues, but we will need some log information to do so. Please let us know if we can be of assistance.
Commvault’s IO profile is roughly 40% read and 60% random during backup/restore operations.
As for the recommended settings for the disk library, we would need to know the brand/model you’re using to make recommendations, along with things like the RAID configuration (if used), the Commvault version, etc.
It will perform sequential read operations during recovery tasks. As @NVFD411 already suggested, if you need more detailed guidance you will have to provide more information. One thing to check on the storage side is read latency as well as disk utilization, in case it is using spindles. As for the recommended configuration, it all depends on the kind of disk controller/storage type. Disk caching, however, will not have much effect.
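For the read latency and utilization check, one option on a Linux MediaAgent is to sample /proc/diskstats twice and derive the averages yourself, roughly what iostat -x reports. In the minimal sketch below the device name (sdb) and the sampling interval are assumptions that need to match whatever device actually backs the library.

    # Minimal Linux sketch: sample /proc/diskstats twice and derive average
    # read latency and utilization for one block device.
    import time

    DEV = "sdb"        # placeholder: device backing the disk library
    INTERVAL = 5.0     # seconds between samples

    def snapshot(dev):
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                if fields[2] == dev:
                    return {
                        "reads": int(fields[3]),    # reads completed
                        "read_ms": int(fields[6]),  # time spent reading (ms)
                        "io_ms": int(fields[12]),   # time spent doing I/O (ms)
                    }
        raise SystemExit("device %s not found in /proc/diskstats" % dev)

    a = snapshot(DEV)
    time.sleep(INTERVAL)
    b = snapshot(DEV)

    reads = b["reads"] - a["reads"]
    lat_ms = (b["read_ms"] - a["read_ms"]) / reads if reads else 0.0
    util = (b["io_ms"] - a["io_ms"]) / (INTERVAL * 1000) * 100

    print("avg read latency: %.2f ms, utilization: %.1f%%" % (lat_ms, util))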
Yes, it is sequential, but read-ahead without any context about the data will not deliver much benefit. That is because, when deduplication is enabled, Commvault writes the unique data into a chunk file, but there is a big chance that duplicate data is located in a different chunk file. So yes, the behavior is sequential, but that doesn't mean Commvault has to read all the data segments from the same file.
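To make that concrete, here is a purely illustrative toy model (not Commvault's actual index or chunk format): logically sequential blocks of the restored VM or database can resolve to many different chunk files, so read-ahead inside any single chunk file buys little.

    # Illustrative toy only: map logically sequential restore blocks to random
    # (chunk_file, offset) locations, as a dedup store effectively does, and
    # count how often a sequential restore has to jump to another chunk file.
    import random

    NUM_CHUNK_FILES = 50
    BLOCKS_TO_RESTORE = 20

    random.seed(1)
    index = {blk: (random.randrange(NUM_CHUNK_FILES), random.randrange(4096))
             for blk in range(BLOCKS_TO_RESTORE)}

    prev_chunk, jumps = None, 0
    for blk in range(BLOCKS_TO_RESTORE):      # logically sequential restore
        chunk, _ = index[blk]
        if prev_chunk is not None and chunk != prev_chunk:
            jumps += 1                        # read moves to another chunk file
        prev_chunk = chunk

    print("restored %d sequential blocks, switched chunk files %d times"
          % (BLOCKS_TO_RESTORE, jumps))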
Example: Commvault also offers read-ahead tweaks that provide similar functionality for a cloud library. We noticed during recovery operations that if you enable and tune it aggressively, you will pull more data from the cloud than required. We sometimes pulled more than 150% of the amount of data needed for a particular restore.
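A back-of-the-envelope sketch of that over-read effect, with the read-ahead window and extent sizes invented purely for illustration (they are not Commvault's actual settings):

    # Worst-case over-read estimate: every needed extent triggers a fetch of a
    # whole read-ahead window, so sparse extents pull far more than required.
    READ_AHEAD = 8 * 1024**2      # assumed read-ahead window per fetch (8 MiB)
    EXTENT = 1 * 1024**2          # data actually needed per extent (1 MiB)
    extents = 100                 # scattered extents the restore needs

    needed = extents * EXTENT
    fetched = extents * READ_AHEAD    # worst case: no two extents share a window

    print("needed : %4d MiB" % (needed // 1024**2))
    print("fetched: %4d MiB (%.0f%% of what was needed)"
          % (fetched // 1024**2, 100 * fetched / needed))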
It would be awesome if someone from engineering could chime in here to verify whether my explanation is correct and share additional information with us.