Issue with Hazelcast during node initialization
We are unable to start two of the seven nodes in our cluster.
I found the article below about the same issue, but it only covers WebSphere. Are there similar settings for the Tomcat application server that would resolve this issue?
The article says: "To resolve this problem, update the ulimit values in your WebSphere Application Server environment."
Log information:
com.pega.pegarules.pub.context.InitializationFailedError: PRNodeImpl init failed
Caused by: com.pega.hazelcast.core.OperationTimeoutException: RegistrationOperation invocation failed to complete due to operation-heartbeat-timeout.
2023-11-05 00:05:39,708 [ sesttst553] [ STANDARD] [ ] [ ] ( DistributedDiagnostics) ERROR - com.pega.hazelcast.core.OperationTimeoutException: RegistrationOperation invocation failed to complete due to operation-heartbeat-timeout. Current time: 2023-11-05 00:05:39.704. Start time: 2023-11-05 00:03:38.927. Total elapsed time: 120777 ms. Last operation heartbeat: never. Last operation heartbeat from member: never. Invocation{op=com.pega.hazelcast.spi.impl.eventservice.impl.operations.RegistrationOperation{serviceName='null', identityHash=941701678, partitionId=-1, replicaIndex=0, callId=6, invocationTime=1699139018927 (2023-11-05 00:03:38.927), waitTimeout=-1, callTimeout=60000, tenantControl=com.pega.hazelcast.spi.impl.tenantcontrol.NoopTenantControl@0}, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=60000, firstInvocationTimeMs=1699139018927, firstInvocationTime='2023-11-05 00:03:38.927', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 01:00:00.000', target=[XX.XX.XX.XX]:5726, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=null}
2023-11-05 00:09:41,712 [ sesttst553] [ STANDARD] [ ] [ ] ( etier.impl.EngineStartup) ERROR - PegaRULES initialization failed. Server: sesttst553
com.pega.pegarules.pub.context.InitializationFailedError: PRNodeImpl init failed
at com.pega.pegarules.session.internal.mgmt.PREnvironment.getThreadAndInitialize(PREnvironment.java:436) ~[prprivate-session.jar:?]
at com.pega.pegarules.session.internal.PRSessionProviderImpl.getThreadAndInitialize(PRSessionProviderImpl.java:2169) ~[prprivate-session.jar:?]
at com.pega.pegarules.session.internal.engineinterface.etier.impl.EngineStartup.initEngine(EngineStartup.java:726) ~[prprivate-session.jar:?]
at com.pega.pegarules.session.internal.engineinterface.etier.impl.EngineImpl._initEngine_privact(EngineImpl.java:180) ~[prprivate-session.jar:?]
at com.pega.pegarules.session.internal.engineinterface.etier.impl.EngineImpl.doStartup(EngineImpl.java:152) ~[prprivate-session.jar:?]
at com.pega.pegarules.web.servlet.WebAppLifeCycleListener._contextInitialized_privact(WebAppLifeCycleListener.java:214) ~[prwebj2ee.jar:?]
at com.pega.pegarules.web.servlet.AbstractLifeCycleListener._contextInitialized_privact(AbstractLifeCycleListener.java:145) ~[prwebj2ee.jar:?]
at com.pega.pegarules.web.servlet.AbstractLifeCycleListener.contextInitialized(AbstractLifeCycleListener.java:76) ~[prwebj2ee.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_202]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_202]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_202]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_202]
at com.pega.pegarules.internal.bootstrap.PRBootstrap.invokeMethod(PRBootstrap.java:388) ~[prbootstrap-8.8.1-161.jar:8.8.1-161]
at com.pega.pegarules.internal.bootstrap.PRBootstrap.invokeMethodPropagatingThrowable(PRBootstrap.java:430) ~[prbootstrap-8.8.1-161.jar:8.8.1-161]
at com.pega.pegarules.boot.internal.extbridge.AppServerBridgeToPega.invokeMethodPropagatingThrowable(AppServerBridgeToPega.java:225) ~[prbootstrap-api-8.8.1-161.jar:8.8.1-161]
at com.pega.pegarules.boot.internal.extbridge.AppServerBridgeToPega.invokeMethod(AppServerBridgeToPega.java:274) ~[prbootstrap-api-8.8.1-161.jar:8.8.1-161]
at com.pega.pegarules.internal.web.servlet.WebAppLifeCycleListenerBoot.contextInitialized(WebAppLifeCycleListenerBoot.java:92) ~[prbootstrap-api-8.8.1-161.jar:8.8.1-161]
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4768) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5230) ~[catalina.jar:9.0.65]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:726) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:698) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:696) ~[catalina.jar:9.0.65]
at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:690) ~[catalina.jar:9.0.65]
at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1889) ~[catalina.jar:9.0.65]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_202]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) ~[tomcat-util.jar:9.0.65]
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112) ~[?:1.8.0_202]
at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:583) ~[catalina.jar:9.0.65]
at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:473) ~[catalina.jar:9.0.65]
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1618) ~[catalina.jar:9.0.65]
at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:319) ~[catalina.jar:9.0.65]
at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:123) ~[catalina.jar:9.0.65]
at org.apache.catalina.util.LifecycleBase.setStateInternal(LifecycleBase.java:423) ~[catalina.jar:9.0.65]
at org.apache.catalina.util.LifecycleBase.setState(LifecycleBase.java:366) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:946) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:835) ~[catalina.jar:9.0.65]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1396) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1386) ~[catalina.jar:9.0.65]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) ~[tomcat-util.jar:9.0.65]
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134) ~[?:1.8.0_202]
at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:919) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:265) ~[catalina.jar:9.0.65]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.StandardService.startInternal(StandardService.java:432) ~[catalina.jar:9.0.65]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.65]
at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:930) ~[catalina.jar:9.0.65]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[catalina.jar:9.0.65]
at org.apache.catalina.startup.Catalina.start(Catalina.java:772) ~[catalina.jar:9.0.65]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_202]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_202]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_202]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_202]
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:345) ~[bootstrap.jar:9.0.65]
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:476) ~[bootstrap.jar:9.0.65]
Caused by: com.pega.hazelcast.core.OperationTimeoutException: RegistrationOperation invocation failed to complete due to operation-heartbeat-timeout. Current time: 2023-11-05 00:09:41.710. Start time: 2023-11-05 00:07:40.741. Total elapsed time: 120969 ms. Last operation heartbeat: never. Last operation heartbeat from member: never. Invocation{op=com.pega.hazelcast.spi.impl.eventservice.impl.operations.RegistrationOperation{serviceName='null', identityHash=1411988002, partitionId=-1, replicaIndex=0, callId=18, invocationTime=1699139260741 (2023-11-05 00:07:40.741), waitTimeout=-1, callTimeout=60000, tenantControl=com.pega.hazelcast.spi.impl.tenantcontrol.NoopTenantControl@0}, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=60000, firstInvocationTimeMs=1699139260741, firstInvocationTime='2023-11-05 00:07:40.741', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 01:00:00.000', target=[XX.XX.XX.XX]:5726, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=null}
at com.pega.hazelcast.spi.impl.operationservice.impl.InvocationFuture.newOperationTimeoutException(InvocationFuture.java:194) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolve(InvocationFuture.java:136) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.AbstractInvocationFuture.unblockOtherNode(AbstractInvocationFuture.java:795) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.AbstractInvocationFuture.unblockAll(AbstractInvocationFuture.java:759) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.AbstractInvocationFuture.complete0(AbstractInvocationFuture.java:1235) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.AbstractInvocationFuture.complete(AbstractInvocationFuture.java:1219) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.operationservice.impl.Invocation.complete(Invocation.java:672) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.operationservice.impl.Invocation.detectAndHandleTimeout(Invocation.java:444) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.operationservice.impl.InvocationMonitor$MonitorInvocationsTask.run0(InvocationMonitor.java:327) ~[pega-hz-4.2.4.jar:?]
at com.pega.hazelcast.spi.impl.operationservice.impl.InvocationMonitor$FixedRateMonitorTask.run(InvocationMonitor.java:283) ~[pega-hz-4.2.4.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_202]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_202]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_202]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_202]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_202]
@BRAHMESH@
Hello,
Are these two nodes new, or did they work before and then stop?
Have you defined the parameter below in prconfig with the IPs of all your nodes?
<env name="cluster/Hazelcast/members" value="comma-separated list of IPs"/>
Have you also defined all the default ports you want to use, and made sure that no firewall prevents the two nodes from reaching every IP:port?
Default ports (in Pega 8.6.x; these may have changed in later versions):
<env name="dsm/services/stream/pyBrokerPort" value="9091" />
<env name="dsm/services/stream/pyKeeperPort" value="2180" />
<env name="dsm/services/stream/pyJmxPort" value="9998" />
<env name="dsm/services/stream/pyPort" value="7002" />
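If it helps, the firewall question can be checked from each failing node with a small script. This is only a rough sketch, assuming Python 3 is available on the node; `can_connect` and `check_members` are hypothetical helper names and not part of Pega or Hazelcast. It simply tries to open a TCP connection to every host in a comma-separated members string on every port you pass in.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_members(members, ports, timeout=3.0):
    """Check every host in a comma-separated string against every port.

    Returns a dict mapping (host, port) to True (reachable) or False.
    """
    hosts = [h.strip() for h in members.split(",") if h.strip()]
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

For example, `check_members("10.0.0.11,10.0.0.12", [5726, 9091, 2180, 9998, 7002])` would test two placeholder IPs against port 5726 (the port in the `target=` field of your log) and the stream ports listed above. Any pair that comes back `False` from a failing node but `True` from a healthy one points towards a firewall or routing problem rather than a Hazelcast setting.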
Are you using embedded Hazelcast or an externalised one? V5, I think?
Regards
Anthony