Question
Proximus
BE
Last activity: 13 Apr 2022 6:06 EDT
PEGA 8 & Hazelcast V4: issue with Cluster Start
Hello,
Since my applications moved to PEGA 8.6.1 with Hazelcast V4, I have been facing issues with cluster restarts (IBM WebSphere). This was very stable with PEGA 7.3.1 and the previous Hazelcast version.
In Prod I have a minimum of 4 nodes.
In non-Prod, with 2 nodes, there are no issues.
In an 8-node cluster, most of the time 3 nodes do not start properly.
Restarting the 3 failing nodes afterwards works fine.
Error:
Caused by: com.pega.hazelcast.spi.exception.CallerNotMemberException: Not Member
My prconfig is the same as for PEGA 7.3.1, with which I had no issues with Hazelcast:
New for PEGA 8: <env name="cluster/hazelcast/v4/enabled" value="true" />
<env name="cluster/hazelcast/ports" value="5702-5720" />
<env name="cluster/Hazelcast/members" value="my list of servers, comma separated" />
When the cluster start is launched, not all nodes start at exactly the same speed, and for a few of them I'm getting:
ERROR - PegaRULES initialization failed. Server: myServer3
com.pega.pegarules.pub.context.InitializationFailedError: PRNodeImpl init failed
...
Caused by: com.pega.pegarules.pub.PRRuntimeException: Method Invocation exception
...
Caused by: java.lang.reflect.InvocationTargetException
...
Caused by: com.pega.pegarules.pub.PRRuntimeException: PRRuntimeException
    at com.pega.platform.hazelcastv4.util.HazelcastOperation.perform(HazelcastOperation.java:69) ~[hazelcast.jar:?]
    at com.pega.platform.messaging.internal.hazelcastv4.HazelcastTopic.registerMessageHandler(HazelcastTopic.java:57) ~[messaging-hz.jar:?]
...
Caused by: com.pega.hazelcast.spi.exception.CallerNotMemberException: Not Member! this: [myServer4]:5702, caller: [serverIP]:5702, partitionId: -1, operation: com.pega.hazelcast.spi.impl.eventservice.impl.operations.RegistrationOperation, service: null
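To see which node rejected which caller across several failed starts, the final "Not Member!" message can be pulled apart with a small script. This is only a sketch that assumes the message always has the shape shown in the trace above; the function name and the dictionary keys are my own, not anything from Pega or Hazelcast.

```python
import re

# Pattern for the "Not Member!" detail as it appears in the log excerpt above.
# Assumption: other occurrences of the message use the same layout.
NOT_MEMBER = re.compile(
    r"CallerNotMemberException: Not Member! "
    r"this: \[(?P<this_host>[^\]]+)\]:(?P<this_port>\d+), "
    r"caller: \[(?P<caller_host>[^\]]+)\]:(?P<caller_port>\d+)"
)


def parse_not_member(line):
    """Return the rejecting node and the rejected caller, or None if no match."""
    m = NOT_MEMBER.search(line)
    if m is None:
        return None
    return {
        "rejecting_node": (m["this_host"], int(m["this_port"])),
        "rejected_caller": (m["caller_host"], int(m["caller_port"])),
    }


# Sample taken from the trace above (shortened after partitionId).
sample = (
    "Caused by: com.pega.hazelcast.spi.exception.CallerNotMemberException: "
    "Not Member! this: [myServer4]:5702, caller: [serverIP]:5702, partitionId: -1"
)

if __name__ == "__main__":
    print(parse_not_member(sample))
```

Running it over the logs of every failing node would show whether the rejections always come from the same members or from whichever node happened to start first.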
My understanding is that some nodes try to connect to each other while they are not yet up, or not yet considered part of the same cluster.
---> Could this be linked to the Hazelcast discovery mode?
I saw the parameter below in some posts, but I'm not sure what it actually does:
<env name="cluster/hazelcast/multicast/enabled" value="false"/>
---> Is there anything I can tune around node auto-discovery?
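To test the timing theory, one thing I could do is check, at the moment a node fails, which of the configured members already accept connections on the Hazelcast port range. The sketch below is purely diagnostic and rests on assumptions: the host name is a placeholder, the range is the one from my prconfig, and an open TCP port only shows that something is listening, not that the node has joined the cluster.

```python
import socket


def expand_ports(spec):
    """Turn a 'low-high' range such as '5702-5720' into a list of ints."""
    low, high = (int(part) for part in spec.split("-"))
    return list(range(low, high + 1))


def open_ports(host, ports, timeout=1.0):
    """Return the ports on host that currently accept a TCP connection."""
    found = []
    for port in ports:
        try:
            # A successful connect only proves a listener exists on that port.
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            # Refused, timed out, or host not resolvable: treat as not open.
            pass
    return found


if __name__ == "__main__":
    # Placeholder host; replace with the entries of cluster/Hazelcast/members.
    members = ["localhost"]
    for host in members:
        print(host, open_ports(host, expand_ports("5702-5720")))
```

If the failing nodes consistently start while several peers show no open port yet, that would support the idea that the start order or start delay matters.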
In pr_sys_statusnodes there are only 8 rows, one for each of the 8 servers in my cluster.
All my nodes are identified with the same system identifier.
---> Could you please help or guide me on how I can stabilise a basic cluster restart?
Thank you
Regards
Anthony
***Edited by Moderator Marije to add Capability tags***