Handling Duplicate Case ID Errors After a Pega Upgrade
Introduction: After Pega upgrades, we are seeing an issue involving the generation of duplicate case IDs, which causes errors when cases are committed to the database. This article details a specific scenario following a Pega upgrade in which case creation was blocked by duplicate key exceptions.
Issue Overview: After upgrading Pega, users encountered an issue where cases could not be created due to duplicate key exceptions in the database. The problem was traced to how Case IDs were being generated. Upon investigation, it was discovered that the pyLastReservedID on the Data-UniqueID instance for the impacted case prefix was set to a lower value than the largest pyID for that prefix. This discrepancy led to duplicate key exceptions, preventing case creation.
Analysis and Remediation: Our analysis identified the root cause: pyLastReservedID, which tracks the last reserved case ID, was out of sync with the actual case IDs in the database. Because the counter lagged behind the stored cases, the system generated case IDs that had already been committed to the database, which resulted in the duplicate key exception.
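To make the failure mode concrete, the following is a minimal sketch of the mechanism, not Pega code. The classes CaseStore and IdGenerator, the "C-" prefix, and the assumption that IDs are handed out one at a time as prefix plus number are all hypothetical simplifications; the real engine reserves IDs in batches per node.

```python
class DuplicateKeyError(Exception):
    """Raised when an ID that is already stored is inserted again."""


class CaseStore:
    """Toy stand-in for a case table with a unique key on the case ID."""

    def __init__(self, existing_ids):
        self._ids = set(existing_ids)

    def insert(self, case_id):
        if case_id in self._ids:
            raise DuplicateKeyError(case_id)
        self._ids.add(case_id)


class IdGenerator:
    """Toy counter; last_reserved plays the role of pyLastReservedID."""

    def __init__(self, prefix, last_reserved):
        self.prefix = prefix
        self.last_reserved = last_reserved

    def next_id(self):
        # Hand out the next number after the last reserved one.
        self.last_reserved += 1
        return f"{self.prefix}{self.last_reserved}"


def try_create_case(store, generator):
    """Return (case_id, succeeded) for one attempted case creation."""
    case_id = generator.next_id()
    try:
        store.insert(case_id)
        return case_id, True
    except DuplicateKeyError:
        return case_id, False


# Cases C-1 through C-100 already exist, but the counter is stuck at 50.
store = CaseStore(f"C-{n}" for n in range(1, 101))
stale_result = try_create_case(store, IdGenerator("C-", last_reserved=50))

# A counter at or above the largest stored ID does not collide.
healthy_result = try_create_case(store, IdGenerator("C-", last_reserved=100))
```

In this toy model the stale counter produces C-51, which is already stored, so the insert fails; the counter that starts at 100 produces C-101 and succeeds.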
To remediate this issue, we suggested resetting the pyLastReservedID to a value higher than the largest pyID currently in use for the case prefix. This was done using an activity in Pega, followed by a rolling restart to ensure that all nodes retrieved the updated batch of IDs.
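The reset value itself can be worked out before the activity is run. The sketch below is one hedged way to do that outside Pega, assuming the pyID values for the prefix have been exported as strings of the form prefix plus digits (for example C-1042). The function names and the margin parameter are hypothetical and are not part of any Pega API; the margin is simply extra headroom above the largest ID found.

```python
import re


def numeric_suffix(case_id, prefix):
    """Return the integer after the prefix, or None if the ID does not match."""
    match = re.fullmatch(re.escape(prefix) + r"(\d+)", case_id)
    return int(match.group(1)) if match else None


def safe_last_reserved(case_ids, prefix, margin=1):
    """Return a counter value strictly above the largest ID for the prefix.

    margin is an assumed safety buffer (hypothetical), for example to cover
    IDs that nodes may already have reserved but not yet committed.
    """
    if margin < 1:
        raise ValueError("margin must be at least 1")
    numbers = [
        n for n in (numeric_suffix(c, prefix) for c in case_ids) if n is not None
    ]
    if not numbers:
        raise ValueError(f"no case IDs found for prefix {prefix!r}")
    return max(numbers) + margin


# Example export containing IDs for two prefixes; only "C-" is considered.
exported_ids = ["C-7", "C-1042", "D-9999", "C-15"]
new_value = safe_last_reserved(exported_ids, "C-", margin=100)
```

With the example data the largest C- ID is 1042, so a margin of 100 gives 1142. IDs with other prefixes are ignored, which matters because each prefix has its own counter.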
Potential Root Cause: We explored several possible causes for this issue. One hypothesis involves a migration that took place approximately one year before the upgrade. It’s possible that the Data-UniqueID instance was not migrated along with the cases, leaving the counter behind the migrated case IDs. Additionally, Pega version 23.1.3 included fixes for scenarios in which duplicate key exceptions should have been thrown but were not, so earlier releases may have masked the problem until the upgrade.
Conclusion: Upgrading Pega systems can introduce unexpected issues like duplicate case ID errors, but with careful analysis these problems can be resolved. In this case, resetting the pyLastReservedID and performing a rolling restart allowed for successful case creation, and the system is now being monitored to ensure long-term stability. This incident highlights the importance of thorough post-upgrade testing and the value of understanding how system components like Data-UniqueID interact with case creation mechanisms.