We are running the Pega 7.4 Marketing NBAA App on Tomcat with an Oracle DB, with about 40 nodes in production. We deploy changes to decision tables and proposition data daily, as required by the business.
Sporadically, an older version of a decision table gets picked up on a few production nodes, even though the latest version exists on those nodes.
We believe this is due to a cache sync issue. We have verified that the system pulse is running on all nodes, and we have Unix shell scripts validating that the pulse is up to date on each node.
Can anyone suggest how we can deal with this? Thank you!
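A minimal sketch of a pulse freshness check of the kind described above, in Python. It is not the actual script: the node names, the two-minute threshold, and the way the timestamps are collected are all assumptions made for illustration. It only shows the comparison step, given the last pulse time already gathered per node.

```python
from datetime import datetime, timedelta


def stale_nodes(last_pulse_by_node, now, max_lag=timedelta(minutes=2)):
    """Return the sorted names of nodes whose last pulse is older than max_lag.

    last_pulse_by_node: dict of node name -> datetime of the last pulse seen
                        (hypothetical input; how it is gathered is not shown).
    now:                reference time to compare against.
    max_lag:            assumed threshold, tune to the pulse interval in use.
    """
    return sorted(
        node
        for node, last_seen in last_pulse_by_node.items()
        if now - last_seen > max_lag
    )


if __name__ == "__main__":
    # Hypothetical sample data for two nodes.
    reference = datetime(2020, 1, 1, 12, 0, 0)
    sample = {
        "node01": reference - timedelta(seconds=30),
        "node02": reference - timedelta(minutes=10),
    }
    print(stale_nodes(sample, reference))
```

A check like this only shows that the pulse ran recently on a node. It does not prove that a given decision table change was actually applied to that node's caches.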
Can you please share the PegaRULES log from the nodes where you are seeing the cache issue? Also, please specify the names of the problematic decision table instances. This issue requires in-depth analysis to find the root cause. You can also open an incident with Pega GCS to troubleshoot it further.
Please check the following:
This seems to be a VTable cache issue.
Can you check whether page passivation has been disabled, by any chance, with the setting below:
The issue is sporadic and is seen when decision table updates are migrated to production. The decision table is set up with different eligibility When rules for offer presentation. The problem is that an old version of the decision table sometimes gets picked up.
Page passivation is neither disabled nor enabled explicitly. Is it enabled by default, or do we have to declare it explicitly for the VTable cache to work? How does the VTable cache work, and how is it different from the rule cache?
I observed that older versions of the DT rule exist in the rule cache. Is that valid? The pulse agent should invalidate and remove the older versions, right? In the VTable cache, on the other hand, the DT rules do not exist.
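To narrow down which nodes are affected after a deployment, one option is to record which version of the decision table each node actually resolved and compare it with the version just deployed. The Python sketch below shows only that comparison. The node names, the version strings, and the idea of collecting the resolved version per node (for example from logs or tracer output) are assumptions for illustration, not a Pega API.

```python
def lagging_nodes(resolved_by_node, expected_version):
    """Return a dict of node name -> version for nodes not on expected_version.

    resolved_by_node: dict of node name -> version string the node resolved
                      (hypothetical input, collected by some other means).
    expected_version: version string of the decision table just deployed.
    """
    return {
        node: version
        for node, version in resolved_by_node.items()
        if version != expected_version
    }


if __name__ == "__main__":
    # Hypothetical sample: node02 still resolves the previous version.
    sample = {
        "node01": "01-01-05",
        "node02": "01-01-04",
        "node03": "01-01-05",
    }
    for node, version in sorted(lagging_nodes(sample, "01-01-05").items()):
        print(f"{node} resolved {version}")
```

Running such a comparison after each daily deployment would show whether the same nodes lag every time or whether the set changes, which is useful detail to attach to a support incident.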
FYI, we do not restart nodes for the daily deployments of these updates. Nodes are restarted only for code releases, which happen every 3 weeks.
All nodes are in the cluster, and I do not see any Hazelcast exceptions as such.