Solved

Pruning DDB performance issues

  • 7 April 2022

Badge +4

DDB lookup times have spiked. Our SSDs are old SATA drives and seem to be very slow under heavy load. Does pruning apply only to deduplicated data? If Oracle or SQL DB transaction logs are written to a non-deduplicated storage policy, is pruning relevant to that data at all? How heavily is the DDB used during pruning? Is it a very DDB-intensive process?

Best answer by Matt Medvedeff 7 April 2022, 22:24

5 replies

Userlevel 4
Badge +10

Hi @Vitas 

Pruning that leverages the DDB applies only to deduplicated data. Non-dedup data is pruned directly from storage by the MediaAgent and doesn’t touch the DDB. DDB pruning can be resource intensive, depending on the hardware and the overall load on the MA. However, if you are running a V5 DDB, you will be using our latest dedup engine, which can deliver much better pruning performance and lower Q&I (Query and Insert) times.

To confirm whether you are running a V5 DDB, open the Console → Storage Resources → Deduplication Engines → find the DDB in question and highlight it → check the fields in the lower right of the right-hand pane and see whether Garbage Collection is enabled.

If this shows Disabled, then you need to upgrade your DDB via the Commvault Workflow:

https://documentation.commvault.com/11.24/expert/108822_optimizing_deduplicated_database_ddb_pruning_garbage_collection.html

If it shows Enabled, your DDB is already at V5, and I would advise checking the environment for anything that could be impacting DDB performance. Ensure that the Commvault install and DDB directories are excluded from AV scanning, and check that the system uptime is not excessive. If the MediaAgent needs to be rebooted, follow the steps here: https://documentation.commvault.com/11.24/expert/12599_restarting_windows_mediaagent.html

Badge +4

Yes, it is V5, and thanks a lot for the answer; very helpful. We do have older SATA SSDs that become very slow under heavy load, which seems to be a result of pruning. Interestingly, it is pretty hard to see what actual impact pruning has on the system.

Userlevel 4
Badge +10

You can throttle the number of pruning threads with this additional setting:

https://documentation.commvault.com/additionalsetting/details?name=%22DedupPrunerThreadPoolSizeDisk%22&id=7120

By default we run with 4 pruner threads. If you suspect this may be too much load for the hardware, try lowering it to 2 or 1 and observe the performance over a few days.

If you suspect the hardware performance has degraded due to age, you can run IOmeter to check the IOPS:

https://documentation.commvault.com/11.24/expert/8825_testing_iops_of_deduplication_database_disk_on_windows.html
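
For a quick first look before setting up IOmeter, a rough random-read timing can be sketched in Python. This is only an illustrative sketch under stated assumptions: it is not a Commvault tool and not a substitute for the documented IOmeter test. The function names, the 4 KB block size, and the test-file size are all hypothetical choices, and because the operating system may serve reads from its cache, the numbers are likely to be optimistic.

```python
import os
import random
import tempfile
import time


def make_test_file(directory, size_bytes=4 * 1024 * 1024):
    """Create a scratch file of random bytes in `directory` (hypothetical helper)."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def measure_random_reads(path, block_size=4096, reads=200):
    """Time `reads` random block-aligned reads; return (iops, avg_latency_ms).

    Assumption: small random reads are a rough proxy for lookup-style disk
    access. OS caching is not bypassed, so results are an upper bound.
    """
    blocks = max(1, os.path.getsize(path) // block_size)
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for _ in range(reads):
            f.seek(random.randrange(blocks) * block_size)
            f.read(block_size)
    # Guard against a zero interval on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return reads / elapsed, (elapsed / reads) * 1000.0


if __name__ == "__main__":
    # Point this at a scratch directory on the disk under test, never at the
    # live DDB directory itself.
    scratch_dir = tempfile.gettempdir()
    test_path = make_test_file(scratch_dir)
    try:
        iops, latency_ms = measure_random_reads(test_path)
        print(f"approx. IOPS: {iops:.0f}, avg latency: {latency_ms:.3f} ms")
    finally:
        os.remove(test_path)
```

Running it during a quiet period and again during pruning gives a crude before/after comparison; use the IOmeter procedure linked above for figures you intend to act on.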

Badge +4

@Matt Medvedeff Interestingly enough, I have heard two conflicting theories on whether SQL and Oracle transaction logs are deduplicated and pruned.

One comes from a CV support engineer:

“We discussed high Q&I times and any way to reduce the load on the DDB.

We both acknowledged that SQL Transaction Logs are not deduplicated, however you were curious if the T-Log backups communicated in any way with the DDB, which may impose any, albeit minimal, load or processing on the DDB.

I was able to confirm that SQL Transaction Log backups do not communicate with the Deduplication process and therefore will not have any effect on Q&I times:

https://documentation.commvault.com/11.24/expert/12434_deduplication_support.html”

However, another CV engineer says that if the storage policy is deduplicated, CV is not smart enough to skip these jobs and will try to deduplicate them anyway, which also means pruning at the end of the cycle. When I look at the jobs at the deduplication engine level, I see my transaction log job there. If it is there, surely it would require pruning? Another interesting indicator: while the transaction log job is running, the average throughput line shows DDB lookup taking up more than 95% of the time. Given these observations, I think the second engineer is more likely to be right.

Which engineer is right? I am running V11.20.

I have terrible DDB performance due to slow SSD drives, and I am hunting for any avenues for improving DDB performance, even minimal ones, without sacrificing too much storage.

Do we have someone very familiar with DDB details?

Userlevel 7
Badge +23

Looks like you split the last post into another thread (thanks!) so sharing it here: