Solved

VSA Client not load balancing a group of proxies?

  • 17 November 2023
  • 11 replies
  • 162 views

Userlevel 2
Badge +10

I have a VSA subclient that has a Group defined for the proxies but when I watch the jobs it appears that only the first proxy in the group is being used.

 

Several other subclients with different proxy groups appear to be exhibiting the same behaviour.

This wouldn’t be so bad, except that I ran into a problem where the first proxy in the group could not communicate, yet the backup kept trying to use that proxy instead of attempting the job with another proxy in the group.


Best answer by clecky 21 November 2023, 20:31



Userlevel 6
Badge +14

Hi @clecky 

Can you check this:

Best Regards,

Sebastien

Userlevel 2
Badge +10

@Sebastien Merluzzi 

 

Thanks, I am familiar with the proxy dispatch logic; I’m just unsure why the proxies aren’t obeying it.

 

In other words, all the proxies in the group were selected because they have identical datastore access, CPU and storage resources, and network access, so ideally the backup load should be spread evenly across them, but that does not seem to be the case.

 

I will dig around in the logs to find more information.

 

Thanks.

Chris.

Userlevel 6
Badge +14

Hi Chris,

Interesting. Do you get the same behaviour if you add the VSA proxies individually instead of as a group?

Best Regards,

Sebastien

Userlevel 2
Badge +10

I haven’t tried.

 

I will attempt that and see. 

Ideally we want to use groups so that proxies can be added and removed without modifying the subclient, and so that subclients can be created before any proxies are available.

 

I’ll let you know what I find.

 

Userlevel 6
Badge +14

Sure, let me raise this with Development, I will get back to you.

Userlevel 6
Badge +14

Development is asking if you can log a case so we can review and escalate it to them.

Userlevel 7
Badge +23

The stream distribution (different from VM distribution) is described as follows:

 

  • The coordinator uses the memory and CPU information provided by proxies to calculate the number of streams each proxy can support.
  • By default, each CPU in a proxy can support 10 streams, and each stream requires 100 MB of memory.
  • Based on these proxy resources, the coordinator sets the stream limit for each proxy and allocates streams.
  • When distributing streams, the coordinator distributes streams to a proxy until the proxy reaches its limit (considering all processes and not just the current job). When a proxy reaches its limit, the coordinator does not assign any more streams to the proxy until all proxies have reached their resource limit.
  • When all proxies are at their resource limits, the coordinator assigns the remaining streams using round-robin distribution.

 

The fourth bullet indicates that this is a fill-and-spill style configuration, i.e. each proxy (access node) is loaded up before the coordinator moves on to the next one. This seems consistent with what you are seeing. Usually, though, you’d see VMs get assigned based on the distribution logic, but since your proxies have equal weighting for all VMs, it looks like only the stream distribution logic is being applied.
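The fill-and-spill behaviour described in the bullets can be sketched roughly as follows. This is only an illustration of the quoted rules, not Commvault's actual implementation: the `Proxy` class and `distribute` function are hypothetical names, and taking the smaller of the CPU-based and memory-based bounds as the stream limit is an assumption.

```python
from dataclasses import dataclass

STREAMS_PER_CPU = 10   # default quoted above
MB_PER_STREAM = 100    # default quoted above


@dataclass
class Proxy:
    name: str
    cpus: int
    memory_mb: int
    active_streams: int = 0  # streams from all processes, not just this job

    @property
    def limit(self) -> int:
        # Assumption: the limit is the smaller of the CPU and memory bounds.
        return min(self.cpus * STREAMS_PER_CPU, self.memory_mb // MB_PER_STREAM)


def distribute(proxies, streams):
    """Return {proxy name: streams assigned} for one job."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    assigned = {p.name: 0 for p in proxies}
    remaining = streams
    # Phase 1: fill each proxy up to its limit before moving to the next.
    for p in proxies:
        room = max(p.limit - p.active_streams, 0)
        take = min(room, remaining)
        assigned[p.name] += take
        remaining -= take
    # Phase 2: every proxy is at its limit, so round-robin what is left.
    i = 0
    while remaining > 0:
        assigned[proxies[i % len(proxies)].name] += 1
        remaining -= 1
        i += 1
    return assigned
```

With two idle proxies of 1 CPU and 4096 MB each (a limit of 10 streams apiece under these assumptions), a job needing 8 streams lands entirely on the first proxy, which matches the "only the first proxy is used" symptom; the second proxy only receives streams once the first is full.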

Userlevel 2
Badge +10

 @Damian Andre 

I followed the suggestion of @Sebastien Merluzzi and created a ticket: 231120-718.
 

It turns out that the culprit is new, fancy firewalling that prevents the proxies from talking to each other even though they are in the same cluster.

Initially I just had a big pool of proxies defined at the instance level, and I found that the only proxies that would back up were those in the same cluster as the proxy coordinator.

So I split them up per cluster and it seemed to be working, but it turns out all that did was make it so that only the proxy coordinators work.

Awesome.
 

That said, thanks for all the help.

I have some firewall forms to submit.

 

 

Userlevel 2
Badge +10

I do have one additional question: do I only need to open up 8400 between proxies?

Userlevel 6
Badge +14

And the firewall port as well; it looks like you are using 8403.
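Once the firewall change is in place, a quick way to confirm it is a plain TCP connect test run from each proxy towards the others. The sketch below uses only the Python standard library; `port_open` and `check_proxies` are hypothetical helper names, and the port numbers are simply the two mentioned in this thread (8403 is specific to this environment). A successful connect only shows that the port is reachable from the machine running the check, so it needs to be run from every proxy in turn.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_proxies(hosts, ports=(8400, 8403)):
    """Return {(host, port): reachable} for every host/port combination."""
    return {(h, p): port_open(h, p) for h in hosts for p in ports}
```

Calling `check_proxies` with the hostnames of the other proxies in the group returns a dictionary in which any `False` entry points at a host and port pair that is still blocked.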

Userlevel 2
Badge +10

Awesome. Thanks again.
