Question
Transport for NSW
AU
Last activity: 7 Aug 2023 10:35 EDT
Pega 8.4 Kafka stability issues: CharlatanExceptions in logs
We recently upgraded from 7.2.2 to 8.4.1, and since then we have constantly been seeing a lot of Kafka-related issues on our Dev and SIT servers.
We recently set up a multi-node configuration in our test environments (2 separate servers) with the default node classification (i.e. Web, BackgroundProcessing, Search and Stream).
1) We started seeing strange errors like the log entry below, which is logged every 10 seconds.
We tried truncating PR_SYS_STATUSNODES and starting the nodes, and things looked fine for a couple of days before the errors surfaced again.
Stream status shows normal for both nodes.
2) Another strange behaviour is that Admin Studio shows '0 nodes as running' even though both nodes are running fine and we are able to log in to the applications on both of them.
Communication between the two nodes also seems to be working: I used telnet to connect from one node to the other on the different ports used by Pega, and each connection was established (unless I overlooked something).
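For anyone who wants to repeat the telnet check in a scripted way, here is a minimal sketch in Python. The function name `check_ports` is hypothetical, and the host and ports in the usage note are assumptions (9092 and 2181 are the common Kafka and ZooKeeper defaults, which may not match this environment). Note that a successful TCP connect only shows the port is reachable; it says nothing about whether a session on that port is healthy.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Return {port: True/False}, True when a TCP connection could be opened."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts and name-resolution failures.
            results[port] = False
    return results
```

Run from each node against the other, for example `check_ports("other-node-hostname", [9092, 2181])`, substituting the real host name and the ports actually configured.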
Has anyone faced this? Any help is really appreciated.
=================================
2020-08-11 22:35:55,215 [ New I/O worker #65] [ STANDARD] [ ] [ ] (ion.service.SessionServiceImpl) ERROR - Failed to accept in coming connection for the session '-1375665877' com.pega.charlatan.utils.CharlatanException$SessionExpiredException: KeeperErrorCode = Session expired
	at com.pega.charlatan.session.service.SessionServiceImpl.handleConnectRequest(SessionServiceImpl.java:129) ~[charlatan-server.jar:?]
	at com.pega.charlatan.session.service.SessionServiceImpl.processRequest(SessionServiceImpl.java:84) ~[charlatan-server.jar:?]
	at com.pega.charlatan.server.CharlatanNettyConnection.receiveMessage(CharlatanNettyConnection.java:78) ~[charlatan-server.jar:?]
	at com.pega.charlatan.server.CharlatanNettyServer$CharlatanChannelHandler.processMessage(CharlatanNettyServer.java:213) ~[charlatan-server.jar:?]
	at com.pega.charlatan.server.CharlatanNettyServer$CharlatanChannelHandler.messageReceived(CharlatanNettyServer.java:207) ~[charlatan-server.jar:?]
	at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) ~[netty-3.10.6.Final.jar:?]
	at com.pega.charlatan.server.CharlatanNettyServer$CharlatanChannelHandler.handleUpstream(CharlatanNettyServer.java:200) ~[charlatan-server.jar:?]
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[netty-3.10.6.Final.jar:?]
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[netty-3.10.6.Final.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_77]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_77]
	at com.pega.dsm.dnode.util.PrpcRunnable$1.run(PrpcRunnable.java:59) ~[d-node.jar:?]
	at com.pega.dsm.dnode.util.PrpcRunnable$1.run(PrpcRunnable.java:56) ~[d-node.jar:?]
	at com.pega.dsm.dnode.util.PrpcRunnable.execute(PrpcRunnable.java:67) ~[d-node.jar:?]
	at com.pega.dsm.dnode.impl.prpc.PrpcThreadFactory$PrpcThread.run(PrpcThreadFactory.java:124) ~[d-node.jar:?]
===============================
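To check whether the error above really recurs every 10 seconds, and whether it is always the same session id, a small script over the log file may help. This is only a sketch: it assumes every occurrence starts with the same timestamp format and contains the same "Failed to accept in coming connection for the session" text as the entry above, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches the timestamp at the start of the line and the quoted session id.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*?"
    r"Failed to accept in coming connection for the session '(-?\d+)'"
)


def session_errors(lines):
    """Yield (timestamp, session_id) for each matching error line."""
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            yield stamp, match.group(2)


def intervals_by_session(lines):
    """Map each session id to the gaps (in seconds) between its errors."""
    seen = {}
    for stamp, session_id in session_errors(lines):
        seen.setdefault(session_id, []).append(stamp)
    return {
        session_id: [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
        for session_id, stamps in seen.items()
    }
```

Passing an open log file to `intervals_by_session` returns one list of gaps per session id. If a single id dominates with gaps near 10 seconds, that would be consistent with one client repeatedly retrying an expired session, though the log alone cannot confirm which node that client is.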
***Edited by Moderator Marissa to change type from General to Upgrade, update Platform Capability tags****