Hi All - we are having an issue where user load is not distributed equally across all Pega server nodes by the LTM. We are using F5 with PRPC v7.1.9. I checked with the LTM team, and they say that on their side the virtual server statistics show all pool members as well balanced. Does anyone have any idea where the pain point could be?
How are you determining that load is not evenly distributed? Are you comparing number of requestors in the SMA? What is the load distribution logic used by F5? Round-robin, weighted round robin, current load?
If you compare number of requestors in the SMA and the actual distribution statistics from F5, there's always a perceived imbalance.
1. Due to server affinity the same user who is logged into one server will have all requests sent back to the same server.
2. If the distribution logic depends on current load (remember that HTTP requests are single request/response connections, not open connections that last for the duration of the user session), then it's quite possible for F5 to determine that one server (the one with more logged-in users) is less heavily loaded, simply because those users aren't hitting that server as actively as the users on the server with fewer logged-in users.
3. Requestors in the SMA are not passivated for 1 or 2 hours. This means that the number of requestors reported in the SMA could include requestors that haven't seen any user activity for a while (check the last access time) but haven't yet reached the passivation threshold.
I would monitor this (gather statistics) over a period of time before concluding if there's a problem.
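To make point 3 concrete, here is a minimal sketch of how a raw requestor count can differ from the number of recently active requestors on each node. The node names, timestamps, and the 15-minute idle cutoff are made-up sample values, and the function is hypothetical; it does not call any SMA or Pega API. It only assumes you have exported (node, last access time) pairs by some means.

```python
from datetime import datetime, timedelta

def summarize_requestors(requestors, now, idle_cutoff):
    """Return {node: (total, recently_active)} for (node, last_access) pairs."""
    summary = {}
    for node, last_access in requestors:
        total, active = summary.get(node, (0, 0))
        # A requestor counts as active if it was accessed within the cutoff.
        is_active = (now - last_access) <= idle_cutoff
        summary[node] = (total + 1, active + (1 if is_active else 0))
    return summary

# Hypothetical sample: node1 holds more requestors, but most are idle.
now = datetime(2024, 1, 1, 12, 0)
sample = [
    ("node1", now - timedelta(minutes=2)),
    ("node1", now - timedelta(minutes=50)),
    ("node1", now - timedelta(minutes=70)),
    ("node2", now - timedelta(minutes=1)),
    ("node2", now - timedelta(minutes=3)),
]
summary = summarize_requestors(sample, now, timedelta(minutes=15))
for node in sorted(summary):
    total, active = summary[node]
    print(f"{node}: {total} requestors, {active} recently active")
```

In this sample, node1 looks busier by requestor count, yet node2 has more requestors that are actually generating traffic, which is the kind of gap a load-based balancing method would react to.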
Hi Patrick - Thank you for your detailed response!
Please find the answers to your questions as follows:
How are you determining that load is not evenly distributed? Are you comparing number of requestors in the SMA?==>>Using SMA, by looking at the requestor count on each node, and also by looking at the hourly log-usage report.
What is the load distribution logic used by F5? Round-robin, weighted round robin, current load?==>>Load balancing method is set as "Observed Member"
We have a 15-minute timeout, after which requestors that have been inactive get logged off. Sometimes we notice that one node has 20+ users while another node doesn't even have 1 user. We are going to look at log-usage statistics over a few days to find the root cause.
The Least Sessions method passes a new connection to the node that currently has the least number of persistent sessions. Use of this load balancing method requires that the virtual server reference a type of persistence profile that tracks persistence connections. An example of this type of persistence profile is the Source Address Affinity or the Universal profile type. The Least Sessions method works best in environments where the servers or other equipment that you are load balancing have similar capabilities. This is a dynamic load balancing method.
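The selection rule described above can be sketched roughly as follows. This is only an illustration of the idea, not F5's implementation: the function names, the dictionary-based persistence table keyed by client address, and the alphabetical tie-break are all assumptions made for the example.

```python
def pick_least_sessions(session_counts):
    """Pick the node with the fewest persistent sessions (ties: lowest name)."""
    return min(sorted(session_counts), key=lambda node: session_counts[node])

def route(client_ip, persistence, session_counts):
    """Route a client, honoring an existing persistence entry if present."""
    node = persistence.get(client_ip)
    if node is None:
        # New client: choose the least-loaded node and record the session.
        node = pick_least_sessions(session_counts)
        persistence[client_ip] = node
        session_counts[node] += 1
    return node

# Hypothetical starting state matching the imbalance described earlier.
session_counts = {"node1": 20, "node2": 0}
persistence = {}
first = route("10.0.0.5", persistence, session_counts)
again = route("10.0.0.5", persistence, session_counts)
print(first, again, session_counts)
```

With a rule like this, new users keep landing on the emptier node until the persistent session counts even out, while returning users stay on the node they were first assigned to.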