Question

Unexpected Communication between MA servers

  • 17 December 2023
  • 4 replies
  • 53 views

Userlevel 2
Badge +9

I created a new storage policy and reassigned one of our Postgres clients from its old storage policy to the new one.

 

Storage policy used for all archive log backups, changed from the old one to the new one: SP_PJe-2G.

 

Data storage policy, changed from the old one to the new one: the same SP_PJe-2G.

 

I ran some incremental jobs (log backups only), and today I noticed that during the full backup job there was communication between the new MA server, cvault-ma-20, and the old one, cvault-ma-02.

 

This new storage policy has the following Data Path Properties. 

 

The MA server cvault-ma-02 is not a member of the Storage Pool STGPOOL_SC05.

Why is the new MA server, cvault-ma-20, talking to the old MA server, cvault-ma-02?

 

ANOTHER QUESTION: 10.1.66.47 is the other MA server in Storage Pool STGPOOL_SC05. I suppose the client communicates with both servers because of the deduplication process. In a few words, can someone tell me what actually happens during a deduplicated backup job in terms of querying and inserting hashes?


4 replies

Userlevel 7
Badge +23

Media agents also store the index for clients, and they will try to access the index cache data from each other. Clients have a ‘home’ Media Agent for their index that gets established on the first backup. You can manually move the index location for clients using this workflow:

https://documentation.commvault.com/2023e/expert/changing_indexing_mediaagent_using_change_index_server_workflow.html

 

Depending on how you configured the storage policy, you were asked to create a new deduplication database or leverage global deduplication (point the storage policy to an existing database). If using global deduplication, then it's likely the client will be communicating with the prior Media Agent that hosts the deduplication database. Data segments are read in blocks (typically 128K) and signatured - that signature is then checked against the deduplication database to see if it's unique or something we’ve seen before.
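To make the query/insert flow concrete, here is a minimal Python sketch of a block-level signature lookup. It is a toy model, not Commvault's implementation: the use of SHA-256, the in-memory set standing in for the deduplication database, and the names `split_blocks`, `signature` and `backup` are all assumptions made for illustration. Only the 128K block size comes from the description above.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # 128 KiB, the typical block size mentioned above


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Yield fixed-size blocks; the last block may be shorter."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


def signature(block: bytes) -> str:
    # SHA-256 is an assumption for illustration; the thread does not
    # say which hash the product uses.
    return hashlib.sha256(block).hexdigest()


def backup(data: bytes, ddb: set) -> dict:
    """Query the toy DDB for every block signature; insert unseen ones."""
    stats = {"queries": 0, "inserts": 0, "bytes_sent": 0}
    for block in split_blocks(data):
        sig = signature(block)
        stats["queries"] += 1            # one lookup per block
        if sig not in ddb:               # unique block: record it and send data
            ddb.add(sig)
            stats["inserts"] += 1
            stats["bytes_sent"] += len(block)
        # else: duplicate block, only a reference would be written
    return stats
```

In this model every block costs one query, and only blocks whose signature is not yet known cost an insert plus a data transfer. Running `backup` twice over the same data with the same `ddb` produces queries but no inserts on the second pass.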

Userlevel 2
Badge +9

Thanks. The first part of your response was awesome! I still need some clarification on the backup process with deduplication support.

 

In our case, we’re using global deduplication with Source-Side (Client-Side) Deduplication.

“Data segments are read in blocks (typically 128K) and signatured - that signature is then checked with deduplication database to see if its unique or something we’ve seen before.” 

This “checking” is what Commvault calls a lookup, right? The official documentation calls this phase signature comparison.

 



What happens when the database is partitioned? 

 

Does the Commvault software on the client need to check the signature against both partitions on different MAs? If so, that explains the communication between the client and both MAs at the same time.

Or does one MediaAgent consult the other? In that case, I should notice some traffic between these two MediaAgent servers.

Userlevel 7
Badge +23

Ah, I wasn't sure if you had partitions or not. Correct on comparing signatures being ‘checking’. In the case of partitioned dedupe, the client establishes a connection to each partition - so per your screenshot you’ll see connectivity from your client to both ma-20 and ma-21. There is an algorithm that splits queries across both databases - so for two partitions, each contains 50% of the signatures.
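The thread does not spell out the splitting algorithm, so the following is only a sketch of one common way such routing can work, not a description of Commvault's internals. The assumption is that the partition is derived from the signature itself, which would mean inserts and queries for a given signature always go to the same partition and the client never has to ask both. `partition_for`, `PartitionedDDB` and the SHA-256 hash are hypothetical names and choices; "ma-20" and "ma-21" are just labels taken from the reply above.

```python
import hashlib


def signature(block: bytes) -> bytes:
    # SHA-256 is an illustrative assumption, not the product's hash.
    return hashlib.sha256(block).digest()


def partition_for(sig: bytes, partitions: int) -> int:
    """Map a signature to a partition index deterministically."""
    return int.from_bytes(sig[:8], "big") % partitions


class PartitionedDDB:
    """Toy deduplication database split across several hosts."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.tables = [set() for _ in self.hosts]

    def lookup_or_insert(self, sig: bytes):
        """Return (host, seen_before); insert the signature if it is new."""
        idx = partition_for(sig, len(self.hosts))
        seen = sig in self.tables[idx]
        if not seen:
            self.tables[idx].add(sig)
        return self.hosts[idx], seen


ddb = PartitionedDDB(["ma-20", "ma-21"])
host, seen = ddb.lookup_or_insert(signature(b"example block"))
```

Under this assumption the same function decides where a signature is stored and where it is looked up, so each partition ends up holding roughly half of the signatures and a client with many blocks talks to both hosts over the course of a job.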

Media Agents communicate with each other during the data aging phase. One MA is designated the ‘master’ and is responsible for calculating and initiating data aging requests. Data aging is asynchronous, so it is not strictly tied to the data aging job you may see running in the job controller. So when data aging is going on, you should see established network connections between the Media Agents sharing a DDB via partitions.

Userlevel 2
Badge +9

Damian, I still have some dumb questions.

“There is an algorithm that splits queries against both databases - so for two partitions, they both contain 50% of the signatures.”
 

I thought there was an algorithm that splits the inserts (the hashes), not the queries. How does the client know which partition to query during a backup with Source-Side (Client-Side) Deduplication? Does the client query both partitions, or does one MA (with the Deduplication Database role) consult the other?

 

 
