Solved

Low deduplication ratio on Oracle RAC DBs: why?


Badge +1

Hello everyone and thanks in advance for any hint.

I’m running a Commvault v11 SP20 installation and recently had a 6 TB Oracle database migrated from a stand-alone Oracle 12 DB to a three-node Oracle RAC 19 installation.

 

Backups run properly, and the three nodes are involved in backup jobs more or less at random.

Problem is: the dedupe ratio is low.

While the stand-alone backups reached a 95-97% dedupe ratio, the backups of the Oracle RAC instances reach only 40-65%. That is, a single Full backup takes something like 3 TB of disk library (the very same disk library hosting the GDP previously used for storing the stand-alone Oracle DB backups).
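Just for scale, here is my own back-of-the-envelope arithmetic (assuming the dedupe ratio means the fraction of application data saved on disk), which matches the ~3 TB per Full I'm seeing:

```python
def space_written(db_size_tb: float, dedupe_ratio: float) -> float:
    """Disk-library space one Full consumes, given a dedupe savings ratio."""
    return db_size_tb * (1.0 - dedupe_ratio)

db_tb = 6.0
standalone = space_written(db_tb, 0.96)  # ~95-97% on the old stand-alone DB
rac_best   = space_written(db_tb, 0.65)  # best ratio seen on the RAC backups
rac_worst  = space_written(db_tb, 0.40)  # worst ratio seen on the RAC backups

print(f"stand-alone: {standalone:.2f} TB per Full")   # → 0.24 TB per Full
print(f"RAC:         {rac_best:.2f}-{rac_worst:.2f} TB per Full")  # → 2.10-3.60
```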

 

Any hints about why the dedupe ratio is so low?

 

Thanks in advance for your kind opinion.

 

Regards


Best answer by Mike Struening 2 December 2021, 23:00


10 replies

Userlevel 7
Badge +23

Hi @Stefano Castelli , and welcome to the community!

My expectation here is that if the migration created new unique blocks, our dedupe ratio would be poor.  Now, subsequent Fulls should be better, though much of that depends on how the data is stored, changed, etc.
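As a toy illustration of why re-laid-out data dedupes so badly (this is naive fixed-size chunking, not Commvault's actual engine): if the migration shifted or reorganized the byte stream, nearly every chunk signature changes even though the payload is the same.

```python
import hashlib
import random

def chunk_hashes(data: bytes, chunk_size: int = 8) -> set:
    """Hash fixed-size chunks of a byte stream, as a naive dedupe store would."""
    return {hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)}

random.seed(42)
original = bytes(random.randrange(256) for _ in range(4096))  # "old" datafile stream
store = chunk_hashes(original)  # signatures already in the dedupe store

# Same payload, shifted by one byte (think: new headers or a different
# block layout after the migration) - almost no chunk signature matches.
shifted = b"\x00" + original
overlap = len(store & chunk_hashes(shifted)) / len(chunk_hashes(shifted))
print(f"chunk overlap after a 1-byte shift: {overlap:.0%}")
```

An unchanged stream would dedupe fully against the store; the shifted one matches almost nothing, which is why the first post-migration Fulls look so poor.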

How many Fulls have run post migration?  Do you have ratios per each of those?

Thanks!

Badge +1

Hello Mike and thanks a lot for the answer and for checking the thread.

Yep, that was what I was expecting, yet the amount of data written by the following Full backups is still quite high, even though it is slowly improving.

What “scares” me is that the reports show about 20 TB of data written to disk for the Oracle RAC backups.

And yet, according to the storage report, these 20 TB account for 89% of the disk library.

Now, if the disk library is about 38 TB, the math doesn’t add up.
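To make the weird math concrete, here is my own arithmetic on the figures above:

```python
library_tb   = 38.0   # approximate disk library capacity
written_tb   = 20.0   # data written by the Oracle RAC backups, per the reports
reported_pct = 0.89   # share of the library the storage report attributes to them

# If 20 TB really were 89% of what's on disk, total used space would be:
implied_used = written_tb / reported_pct
print(f"implied used capacity: {implied_used:.1f} TB")   # → 22.5 TB

# But 89% of a 38 TB library is:
used = library_tb * reported_pct
print(f"89% of the library:    {used:.1f} TB")           # → 33.8 TB

# The gap is space on disk not attributed to the reported jobs - a hint
# that stale or orphaned data may be sitting in the library.
print(f"unattributed space:    {used - written_tb:.1f} TB")  # → 13.8 TB
```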

Is there a way I can check whether “orphaned” data is clogging the library?

I ran the Retention Forecast Report and it is “clear” of unprunable jobs.

Any idea?

Thanks in advance.

Regards

Userlevel 7
Badge +23

Hmmmm….you COULD have some stale blocks, though this is quite a rabbit hole :sunglasses:

  • Do you have more than one DDB store on the library?  You could have a corrupt store as well.
  • Does the library support sparse files/drilling of holes?  That could be an issue if we can’t free up space within the chunks.
  • What is the actual library itself?  I’m assuming it’s not cloud because you said disk library, but it’s important to cover.
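If you want a quick sanity check of sparse-file support on the mount point backing the library, here is a rough Linux-only probe (the directory is an assumption, and Commvault's own "drill holes" support detection may well differ):

```python
import os
import tempfile

def supports_sparse(directory: str) -> bool:
    """Crude probe: write one byte after a 1 MiB hole, then check whether
    the filesystem allocated far less space than the apparent file size."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.lseek(fd, 1024 * 1024, os.SEEK_SET)  # seek past a 1 MiB hole
        os.write(fd, b"x")
        st = os.fstat(fd)
        allocated = st.st_blocks * 512          # bytes actually on disk
        return allocated < st.st_size
    finally:
        os.close(fd)
        os.unlink(path)

# Example: probe the temp directory (substitute the library's mount point).
print(supports_sparse(tempfile.gettempdir()))
```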

Now, assuming you don’t have sparse file support, you can/should run a space reclamation job (which I believe is what you were asking about):

https://documentation.commvault.com/11.25/expert/127689_performing_space_reclamation_operation_on_deduplicated_data.html

This could be the answer to that woe, though the dedupe ratio on the Oracle backups is another matter.  If that data is moving around somehow and the blocks change, that would do it.  For a really full, detailed investigation, though, I’d get a support case created (share the incident number with me to follow up).  There are so many factors to even consider :nerd:

Badge +1

Hello again and thanks a lot for the reply.

The library in use is a local array of disks in a physical Media Agent.

It contains just a single DDB. DDB Verification jobs run regularly without reporting issues.

I’ll check about the sparse files configuration, thanks a lot.

 

Actually, I already ran the space reclamation job, but it did not - ehm - reclaim that much compared to the total (about 550 GB using the highest setting).

As you say, a support case would be the best thing now; I’ll ask the customer to open one while I investigate.

Thanks a lot

 

 

Userlevel 7
Badge +23

Yeah, definitely the best action now.

Once you have an incident created, let me know the case number so I can follow up and monitor.

Thanks, @Stefano Castelli !!

Userlevel 7
Badge +23

@Stefano Castelli , following up, did you ever get this addressed (or an incident created to track it down)?

Thanks!

Userlevel 7
Badge +23

@Stefano Castelli , following up before marking this solved.  Were you able to get this answered/fixed on your end?

Badge

Hello,

We have the same situation here.

With Oracle cloud storage, the dedupe ratio is only 20% (instead of more than 90% before the migration to the cloud).

Did you find any solution?

Many thanks for sharing :)

Userlevel 7
Badge +23

@Frederico , what kind of data are you sending to the library?  Also, Primary copy or secondary?

We can dig into it though we’ll need to get some information first.

Badge

Hello @MikeRoSoft, thanks for your answer !

We are sending primary copies to an OCI (Oracle Cloud Infrastructure) cloud library.

All data are Oracle DB and archive logs.

Don’t hesitate if you need more information.

 

Thanks

Reply