
I’m a little confused about the limits on multiple partitions and/or DDBs. Limits are mentioned in several places, like the one and only Hardware Specifications for Deduplication Mode

1) https://documentation.commvault.com/11.24/expert/111985_hardware_specifications_for_deduplication_mode.html

which mentions “2 DDB Disks” per MA

 

and

Configuring Additional Partitions for a Deduplication Database

2) https://documentation.commvault.com/11.24/expert/12455_configuring_additional_partitions_for_deduplication_database.html

which mentions “30 DDB partitions” per MA.

Also, in my recent discussion with a PS member I was told that a partition should be treated as a DDB itself.

All of this creates a lot of confusion:
a) is “DDB Disk” the same as DDB?
b) is “DDB partition” the same as DDB?

If you look at 1) under Scaling and Resiliency, there is this information:

The back-end size of the data. For example: Each 2 TiB DDB disk holds up to 250 TiB for disk and 500 TiB for cloud extra large MediaAgent.

 

That means that a DDB with only 1 partition needs a 2 TiB DDB disk for 250 TiB of backups (backend), and we know from link 1) that we can have a maximum of 2 DDB disks per MA. So the confusion continues:

c) does that mean that if I want to store 500TB BET on 1 MediaAgent I need to create 2 separate DDBs aka Storage Pools, or can I create 1 DDB aka Storage Pool with 2 partitions hosted on the same MA?
d) taking this even further, is it a supported configuration to have a 4-partition DDB aka Storage Pool on 1 MA?
 

If you add up that 1 MA can have a maximum of 2 DDB disks, and (assuming that a DDB disk is the same as a DDB) 1 DDB aka Storage Pool can have a maximum of 4 partitions, that gives a theoretical maximum of 8 partitions per MA:
e) is having 8 partitions on 1 MA a supported configuration?

And finally, on a modern server you can have 128 physical cores, TBs of RAM, and as many as 24 NVMe drives:

f) so why is there a 2 DDB disk limitation per MA?

I am really curious about all this and would really like to understand it better.

 

Cheers,

Hey @Robert Horowski 

It’s a little vague and we need to tighten this up. There are two modes, as described further down the page:

You can use these scaling and resiliency factors to set up partitioned deduplication databases in any of following configurations:

  • Partition mode:

    In this mode only one storage pool is configured using all the MediaAgents in a grid with one, two or four partitions.

  • Partition extended mode:

    In this mode the MediaAgents host partitions from multiple storage pools (up to 20 storage pools per grid). Each storage pool can be configured with one, two or four partitions.

    You can use the partition extended mode in the following scenarios:

    • When you want the primary copy of the data on the disk and the secondary copy on the cloud. In this case, create one disk storage pool and one cloud storage pool using the same MediaAgents.

    • In case of multi-tenancy, where the total back-end size of multiple tenants together is within the limit of the grid. In this case, to segregate data for each tenant you can configure the partition in extended mode by creating separate storage pool for each tenant using the same MediaAgents.

 

Partition extended mode is that ‘2 disk’ configuration you are seeing above. In the first scenario, with an additional copy, you could host one partition (up to 250 TB backend); in the second scenario, multiple DDBs which together cannot exceed 250 TB backend - all assuming an extra large MediaAgent. This is probably why the term disks is used rather than partitions.
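If it helps, here is a toy Python sketch of how the two modes differ on a two-MA grid (pool and MA names are made up, purely illustrative):

    # Partition mode: the grid hosts exactly one storage pool,
    # built from all the MediaAgents with one, two or four partitions.
    partition_mode = {
        "grid": ["MA1", "MA2"],
        "storage_pools": {
            "Pool1": {"partitions": 4},
        },
    }

    # Partition extended mode: the same MediaAgents host partitions from
    # multiple storage pools (up to 20 per grid), e.g. a disk pool for the
    # primary copy and a cloud pool for the secondary copy.
    partition_extended_mode = {
        "grid": ["MA1", "MA2"],
        "storage_pools": {
            "DiskPool": {"partitions": 2},
            "CloudPool": {"partitions": 2},
        },
    }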

 

On the ‘limits’ - these are not hard limits but a guide to help prevent folks from running into trouble by overcommitting MediaAgents and storage.

  • The suggested workloads are not software limitations, but rather design guidelines for sizing under specific conditions.

 

I’m sure this does not answer all your questions but hopefully it clears some of it up. I’m going to socialize this post internally a bit to see how we can provide clearer guidance.


Hey @Damian Andre 

Thank you for your input. I’m aware of partition extended mode, but what really bugs me is that even after some years of experience with Commvault, I don’t “feel” the difference between having

  • 2 DDB disks configured with 2 DDB aka Storage Pools and
  • 1 DDB aka Storage Pool configured with 2 partitions on 2 separate DDB disks.

It does make a difference for me, because that would allow me to configure 1 MA for 500TB BET on the same DDB, but it is not clear whether that is a supported configuration.

I think there is room for documentation improvement, at least for the reasons mentioned.

 

Also, these guides have been with us for a long time, and over the last 4 years (compared to the docs for V11 SP6) the only thing that has changed for an XL MA is a 25% increase in backend size.

Another thing is that, given my experience with Commvault deployments, following the guidelines in terms of CPU and especially RAM looks like overkill. I have never seen an XL MediaAgent’s RAM utilized at more than 20-40%. I’m aware that use cases vary with different features, but maybe real-life MA CPU/RAM utilization should be examined using customer metrics uploaded to the cloud, and the docs updated accordingly. Or even better, let all of that unused RAM be a DDB cache :-)

To sum it up, I feel like with features like partitioned DDB, horizontal DDB scaling, and modern hardware able to handle a lot of NVMe disks and high-throughput network cards, a single MA should easily scale up way higher than what is advised in the docs.

Am I wrong?

 

Cheers.


 


Quite possibly (correct, that is) @Robert Horowski - horizontal scale especially provides huge benefits, and I know they wanted to observe some more real-world behavior before making sweeping changes to the recommended scale limits.

I did pass this thread to the head of engineering for this part of the product to review - he is going to take your feedback on board and look at optimizations.


Thanks! Please update this thread after the head of engineering review. It would be great to know whether there will be some scale-up recommendation changes or whether it’s going to stay as it is. Either way, some information is better than none.

 

Also, I do value your reply, but it does not really answer my original question, and somehow this topic is marked as solved. Can we un-solve it and work on it a little more? 


Any thoughts, anyone? :blush:


I’ll find out :sunglasses:


@Robert Horowski  - Please see my answers to your first set of questions.  

a) is “DDB Disk” the same as DDB?

[Manoj] A given DDB Disk can house multiple DDB Partitions.


b) is “DDB partition” the same as DDB?

[Manoj] Each DDB consists of multiple DDB Partitions.

 

If you look at 1) under Scaling and Resiliency, there is this information:

The back-end size of the data. For example: Each 2 TiB DDB disk holds up to 250 TiB for disk and 500 TiB for cloud extra large MediaAgent.

 

That means that a DDB with only 1 partition needs a 2 TiB DDB disk for 250 TiB of backups (backend), and we know from link 1) that we can have a maximum of 2 DDB disks per MA. So the confusion continues.

[Manoj] Ideally, different partitions of the DDB should be configured on different disks. Since a DDB can operate even with one or more partitions down (with resiliency options), this configuration allows DDB operations to continue even through a disk failure. Similarly, if the different partitions are configured on different MediaAgents, that allows for higher resiliency when there is a MediaAgent failure.

 

c) does that mean that if I want to store 500TB BET on 1 MediaAgent I need to create 2 separate DDBs aka Storage Pools, or can I create 1 DDB aka Storage Pool with 2 partitions hosted on the same MA?

[Manoj] Each DDB Partition typically handles 250 TB of backend data, so you will need at least a 2-partition DDB for managing a backend of 500 TB on disk. Can it be hosted on the same MA? Yes, it can be. But as discussed above, if the partitions are on different MediaAgents, there will be higher resiliency against server failures.


d) taking this even further is it supported configuration to have 4-partitioned DDB aka Storage Pool on 1 MA?

[Manoj] For XL MAs, we recommend a maximum of 2 DDB partitions for a given DDB on a single MA. So if you are looking to manage 1 PB of backend data, use 4 partitions and at least 2 MAs. Some customers use 4 MAs with 2 separate DDBs of 4 partitions each spanning the same 4 MAs to manage 2 PB of total backend data: DDB1 uses Disk1 from each MA, and DDB2 uses Disk2 from each MA.
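To visualize that last layout (MA and disk names are illustrative only), a small Python sketch:

    # Two 4-partition DDBs spanning the same four MAs,
    # with one DDB disk per DDB on each MA.
    layout = {
        "MA1": {"Disk1": "DDB1 partition 1", "Disk2": "DDB2 partition 1"},
        "MA2": {"Disk1": "DDB1 partition 2", "Disk2": "DDB2 partition 2"},
        "MA3": {"Disk1": "DDB1 partition 3", "Disk2": "DDB2 partition 3"},
        "MA4": {"Disk1": "DDB1 partition 4", "Disk2": "DDB2 partition 4"},
    }
    # Each DDB manages up to 4 x 250 TB = 1 PB of backend data,
    # so the two DDBs together cover 2 PB.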

 

If you add up that 1 MA can have a maximum of 2 DDB disks, and (assuming that a DDB disk is the same as a DDB) 1 DDB aka Storage Pool can have a maximum of 4 partitions, that gives a theoretical maximum of 8 partitions per MA.

[Manoj] No, that’s not the way it is intended, as explained above.


e) is having 8 partitions on 1 MA a supported configuration?

And finally, on a modern server you can have 128 physical cores, TBs of RAM, and as many as 24 NVMe drives:

f) so why is there a 2 DDB disk limitation per MA?

[Manoj] In the documentation we have specified only up to 16 cores and 128 GB of RAM (XL MA). This configuration is rated for 2 DDB disks. Larger servers can host more DDB disks, but the exact specs and DDB ratings will have to be added to BOL. This is coming in the near future.


Hi @ManojKVijayan !

Thank you, that is a perfect answer!

Still, since I am a little confused about all this, I just want to get it right, and if you don't mind I would love to take it a little further.

 

d) taking this even further, is it a supported configuration to have a 4-partition DDB aka Storage Pool on 1 MA?

[Manoj] For XL MAs, we recommend a maximum of 2 DDB partitions for a given DDB on a single MA. So if you are looking to manage 1 PB of backend data, use 4 partitions and at least 2 MAs. Some customers use 4 MAs with 2 separate DDBs of 4 partitions each spanning the same 4 MAs to manage 2 PB of total backend data: DDB1 uses Disk1 from each MA, and DDB2 uses Disk2 from each MA.

 

That is how I am deploying it too in case of large deployments.

The reason why I am chasing these ‘limits’/guides/supported configurations is that, from a sales perspective, sometimes the customer budget is tight, and offering 4 MAs instead of 2 can rule us out of the equation. I would never sell an unsupported configuration, so that is why I want a good understanding of what is and what isn’t considered a supported configuration, while still being aware of the solution’s level of resiliency.

 

So, to make sure I understand you correctly: if we separate Resiliency from Scaling and focus only on the latter, then it is a supported configuration to have 1 DDB aka Storage Pool consisting of 2 partitions hosted on 1 MA. Is that correct? I understand that resiliency would suffer in that case, but we are focusing on scaling only.


Also, should I understand the following guide

Configuring Additional Partitions for a Deduplication Database

2) https://documentation.commvault.com/11.24/expert/12455_configuring_additional_partitions_for_deduplication_database.html

which mentions “30 DDB partitions” per MA.

 

to mean that as long as I fit into the recommended backend size, I can have as many as 30 DDB partitions on a single MA?

 

Cheers!


Hi @Robert Horowski 

 

How much capacity an MA can handle is based purely on the number of DDB disks and their size.

The number of partitions has nothing to do with capacity planning. How many partitions to use is based on how many DDB disks and MAs there are; it’s a configuration item.

 

One XL MA with 2 DDB disks gives you 500TB capacity regardless of how many partitions you create.

 

Simple steps to follow:

 

Step 1: Determine how much storage capacity is needed.

Step 2: Use the table below to determine how many MediaAgents are needed for your required storage capacity.

There could be multiple combinations giving the same capacity; you can pick any of them by considering other factors like cost, resiliency, and future expansion.

 

Example: For 300TB capacity, you can pick either of the configurations below.

  1. One Large MediaAgent with 2 DDB disks.
  2. Two Medium MediaAgents with 2 DDB disks each. This config gives you resiliency, and backups keep running even if one MediaAgent goes down temporarily.

Back-end Size for Disk Storage (all values are “up to” figures):

    MediaAgent / DDB disks       1 Node      2 Node      3 Node      4 Node
    Extra Large / 1 DDB disk     250 TiB     500 TiB     750 TiB     1000 TiB
    Extra Large / 2 DDB disks    500 TiB     1000 TiB    1500 TiB    2000 TiB
    Large / 1 DDB disk           150 TiB     300 TiB     450 TiB     600 TiB
    Large / 2 DDB disks          300 TiB     600 TiB     900 TiB     1200 TiB
    Medium / 1 DDB disk          75 TiB      150 TiB     225 TiB     300 TiB
    Medium / 2 DDB disks         150 TiB     300 TiB     450 TiB     600 TiB
    Small / 1 DDB disk           50 TiB      100 TiB     -           -
    Extra Small / 1 DDB disk     25 TiB      50 TiB      -           -

Step 3: After determining the number of MediaAgents, use the guidelines below to decide how many DDB partitions to configure for the storage pool (a short sketch pulling these steps together follows the examples).

  1. Count the total DDB disks across all MediaAgents. Use this count as the number of partitions.
  2. If the disk count is higher than 4, use 4 partitions.
  3. If the disk count is one, still use 2 partitions when creating the storage pool on a Medium or larger MediaAgent. This makes future expansion easy.

 

Example:

 

Case 1: 300TB capacity, two Medium MediaAgents with 2 DDB disks each.

  1. Total DDB disks: 2 x 2 = 4.
  2. Create a 4-partition storage pool.

 

Case 2: 300TB capacity, one Large MediaAgent with 2 DDB disks.

  1. Total DDB disks: 1 x 2 = 2.
  2. Create a 2-partition storage pool.
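
To pull steps 1-3 together, here is a rough Python sketch (purely illustrative, not a Commvault tool; capacities are the “up to” figures from the table above):

    # Back-end TiB per DDB disk on a single node, from the 1-node column;
    # grid capacity scales linearly with the node and disk counts.
    BET_PER_DISK_TIB = {
        "extra_large": 250,
        "large": 150,
        "medium": 75,
        "small": 50,
        "extra_small": 25,
    }

    def grid_capacity_tib(ma_size, nodes, ddb_disks_per_ma):
        # Step 2: "up to" back-end capacity of the grid.
        return BET_PER_DISK_TIB[ma_size] * nodes * ddb_disks_per_ma

    def partition_count(total_ddb_disks):
        # Step 3: use the total DDB disk count as the partition count,
        # capped at 4; a single disk still gets 2 partitions so that
        # future expansion stays easy.
        if total_ddb_disks == 1:
            return 2
        return min(total_ddb_disks, 4)

    # Case 1: two Medium MediaAgents with 2 DDB disks each.
    assert grid_capacity_tib("medium", nodes=2, ddb_disks_per_ma=2) == 300
    assert partition_count(2 * 2) == 4

    # Case 2: one Large MediaAgent with 2 DDB disks.
    assert grid_capacity_tib("large", nodes=1, ddb_disks_per_ma=2) == 300
    assert partition_count(1 * 2) == 2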

Hi @Prasad Nara 

Thank you for taking the time to describe the capacity planning flow. It is very straightforward, and I agree that this is the way to go.

 

One XL MA with 2 DDB disks gives you 500TB capacity regardless of how many partitions you create.


That is true, however in reality you don't know what BET you will end up with. You can and should estimate it, of course, but the estimate may be more or less accurate. And I would say that throwing all of your FET into
A) 1 Dedup Store configured as a 3-partition DDB will scale similarly to
B) 3 Dedup Stores configured as 1-partition DDBs,
but will give you less BET. And less BET means different hardware sizing.
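
To put rough numbers on what I mean (the dedup ratios below are invented purely for illustration; the point is that one store deduplicates across all of the data, while separate stores cannot deduplicate against each other):

    # Hypothetical arithmetic only - ratios are made up.
    fet_tib = 500                  # front-end data to protect

    # A) one dedup store sees all the data, so cross-client
    #    duplicates collapse into one copy.
    ratio_single = 4.0             # assumed combined dedup ratio
    bet_single = fet_tib / ratio_single            # 125 TiB backend

    # B) three stores each see only a third of the data,
    #    so each deduplicates less effectively.
    ratio_split = 3.0              # assumed (lower) per-store ratio
    bet_split = 3 * (fet_tib / 3) / ratio_split    # ~167 TiB backend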

Isn't that right?


Hi @Robert Horowski 

A) 1 Dedup Store configured as a 3-partition DDB will scale similarly to
B) 3 Dedup Stores configured as 1-partition DDBs, but will give you less BET. And less BET means different hardware sizing.

 

That may not be correct; both will give the same BET if the number of DDB disks and their size is the same for both configurations.

BET sizing depends on the MA and DDB disk size, not on the number of dedupe stores or partitions.

 

A 2TB DDB disk gives up to 250TB of BET regardless of how many dedupe stores or partitions are configured.


Hi @Prasad Nara 

 

A 2TB DDB disk gives up to 250TB of BET regardless of how many dedupe stores or partitions are configured.

 

You are right, of course, but I guess what I am trying to say is that if you have 500TiB of FET and you throw it into 1 dedup store, it will end up as less BET than the same amount of FET thrown into 3 dedup stores, because the single store deduplicates across all of the data. So that is one of the reasons you would want one multi-partitioned dedup store instead of several 1-partition dedup stores.

So if you look at BOL, it’s not really clear whether a multi-partitioned dedup store is a supported/recommended configuration on 1 MA, but fortunately @ManojKVijayan already explained that it is.

I hope BOL will be updated soon, not only to reflect that but also to reflect gains from features like horizontal DDB scaling, among others.

 


Hi @Robert Horowski 

In general, always create one storage pool per site (storage cluster/grid). As Manoj said, a multi-partitioned DDB is supported on a single MA. We will update this in BOL.

Thanks for your feedback on this. 

 

 

