Pega Marketing Cluster - Startup/Shutdown errors
I am hosting a multi-node Pega Marketing environment in a non-Pega client cloud. I was able to bring up a cluster of 8 nodes successfully (see the attachment for the node types).
There were numerous errors during startup and shutdown, and I need to know whether these are expected.
1) When shutting down the nodes in the correct order, I got this error on the remaining nodes when the last decisioning node was removed (a Cassandra driver sketch illustrating this is included after the list):
"2020-10-28 01:59:48,003 [ster3-reconnection-0] [ ] [ ] [ ] ( driver.core.ControlConnection) ERROR - [Control connection] Cannot connect to any host, scheduling retry in 30000 milliseconds"
2) When the last stream node was removed, I got multiple "Waiting for Kafka sync updates" INFO messages on the remaining nodes, and then this exception before shutdown completed (see the Kafka sketch after the list):
"ERROR - Brokers=[{ id:1001, status:Online, prpcNodeId:backgroundnode1, controller:true, partitionsCount:382, underReplicatedPartitionsCount:380 } { id:1002, status:Offline, prpcNodeId:backgroundnode2, controller:false, partitionsCount:383, underReplicatedPartitionsCount:380 }], rollingRestartReady=false com.pega.dsm.dnode.api.StreamServiceException: Kafka syncing timeout: cluster state didn't change for 300000 ms"
3) When restarting the nodes in the cluster in the correct order, I got this error when starting the first decisioning (i.e. Cassandra) node (see the consistency-level sketch after the list):
"2020-10-28 02:22:12,143 [ionChangeExecutor:41] [ STANDARD] [ ] [ ] (andraSessionCache$SessionProxy) ERROR - Statement could not be executed, because there was no available host com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /{ipaddressremoved}:9042 (com.datastax.driver.core.exceptions.UnavailableException: Not enough replicas available for query at consistency ONE (1 required but only 0 alive)))"
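For reference on item 1, here is a minimal sketch of a standalone DataStax Java driver 3.x client; the contact point IP is a placeholder for one of the decisioning nodes, not something taken from my environment. If the last Cassandra node goes away while a session is open, the driver's control connection is left with no hosts and logs the same "Cannot connect to any host, scheduling retry in 30000 milliseconds" message while it retries in the background.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.exceptions.NoHostAvailableException;

public class ControlConnectionCheck {
    public static void main(String[] args) {
        // Placeholder contact point; substitute a decisioning node's address.
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.0.1")
                .withPort(9042)
                .build();
             Session session = cluster.connect()) {
            // While this session is open, shutting down the last Cassandra node
            // leaves the control connection with no hosts; the driver then logs
            // "Cannot connect to any host, scheduling retry in 30000 milliseconds"
            // and keeps retrying in the background.
            session.execute("SELECT release_version FROM system.local");
        } catch (NoHostAvailableException e) {
            // Thrown if no contact point is reachable at connect/query time.
            System.err.println("No Cassandra host reachable: " + e.getMessage());
        }
    }
}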
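For item 2, the broker list in the exception shows 380 under-replicated partitions, so the stream service appears to be waiting for the remaining broker to catch up and gives up after 300000 ms. Below is a minimal sketch of checking the embedded Kafka cluster state with the standard Kafka AdminClient; the bootstrap address and port are assumptions and would need to match the stream node's actual Kafka listener.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeClusterResult;
import java.util.Properties;

public class KafkaClusterCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed bootstrap address; point this at a stream node's Kafka listener.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "backgroundnode1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            DescribeClusterResult cluster = admin.describeCluster();
            System.out.println("Controller:   " + cluster.controller().get());
            System.out.println("Live brokers: " + cluster.nodes().get());
            // With one of the two brokers offline, partitions with a replication
            // factor of 2 stay under-replicated, which matches
            // rollingRestartReady=false and the sync wait eventually timing out.
        }
    }
}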
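For item 3, the inner UnavailableException suggests the first node to come up was reachable, but no replica owning the requested data was alive yet. Here is a sketch of a query at consistency ONE that fails the same way while the other decisioning nodes are still down; the contact point, keyspace, and table names are placeholders for illustration only.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.exceptions.NoHostAvailableException;

public class ConsistencyOneCheck {
    public static void main(String[] args) {
        // Placeholder contact point, keyspace, and table.
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.0.1")
                .withPort(9042)
                .build();
             Session session = cluster.connect()) {
            SimpleStatement stmt =
                    new SimpleStatement("SELECT * FROM my_keyspace.my_table LIMIT 1");
            stmt.setConsistencyLevel(ConsistencyLevel.ONE);
            // Consistency ONE needs at least one live replica that owns the data.
            // With the other decisioning nodes still down, 0 of the 1 required
            // replicas are alive, so the coordinator reports UnavailableException,
            // which the driver wraps in NoHostAvailableException as in the log.
            session.execute(stmt);
        } catch (NoHostAvailableException e) {
            System.err.println("Query failed on all hosts: " + e.getErrors());
        }
    }
}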
After receiving each of these errors, I continued with the startup/shutdown and can confirm they happen every time. Can someone please confirm whether these errors are expected?