Question
Cognizant
IN
Posted: Nov 27, 2025
Last activity: 27 Nov 2025 17:01 EST
How to process files using File Listener with parallel processing across multi-node setup for better performance?
We have a requirement where files are processed using a File Listener. Based on a property in the XML file:
- If it is the first instance for the day, a new case should be created.
- Subsequent files with the same property value should merge into the existing case.
Currently, when multiple files are placed at the same time, multiple cases are being created due to a race condition.
Constraints:
- We want to enable parallel processing and leverage a multi-node environment for better performance.
- The solution should be maintainable in production.
Questions:
- How can we avoid this race condition in Pega?
- Are there any alternative design patterns or best practices for handling such concurrency issues?
@Kuldeep GUPTA
You can avoid the race condition by introducing a "control" or mapping table keyed by your XML property + date, and letting the database enforce uniqueness. Before creating a case, first try to insert a record into this table with that key (using Obj-Save with WriteNow / Commit):

- If the insert succeeds, you are the first one for that key today, so you safely create the new case and store the case ID in that control record.
- If the insert fails because the unique key already exists, another node or thread already created the case, so you read the control record, open the existing case using the stored case ID, and merge the new file into it.

This way the File Listener stays parallel and multi-node, but the actual "create or reuse case" decision is serialized at the database level.

For maintainability, keep this logic in a single activity or data transform called from the listener, and avoid putting complex logic in the listener rule itself.

As an alternative design, some teams stage incoming files in a data table and let a Queue Processor / Job Scheduler pick them up and apply the same "get-or-create" logic, which makes monitoring and reprocessing easier.

In all cases, the best practice is to rely on a unique database index or lock rather than just checking "if case exists" in memory, because such checks are not safe when many nodes process the same key at the same time.
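To make the get-or-create decision concrete, here is a minimal Java sketch (not Pega rule code; the class name, key format, and ID scheme are illustrative). The atomic `putIfAbsent` on a concurrent map plays the role that the unique database index plays in the real design: exactly one caller "wins" the insert per key, and everyone else is handed the winner's case ID.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch only: in the actual solution the atomicity comes from
// a unique database index on (property + date), and a duplicate-key failure
// on the Obj-Save/Commit means "case already exists".
public class GetOrCreateDemo {
    // Stands in for the control/mapping table: key -> case ID.
    private final ConcurrentMap<String, String> controlTable = new ConcurrentHashMap<>();

    // Returns the case ID every caller with the same key should use.
    public String getOrCreateCase(String propertyPlusDate) {
        String candidate = "CASE-" + UUID.randomUUID(); // hypothetical ID scheme
        String existing = controlTable.putIfAbsent(propertyPlusDate, candidate);
        if (existing == null) {
            // "Insert" succeeded: this caller won the race and creates the case.
            return candidate;
        }
        // "Insert" failed: another node/thread already created the case today;
        // the caller should open that case and merge the new file into it.
        return existing;
    }

    public static void main(String[] args) {
        GetOrCreateDemo demo = new GetOrCreateDemo();
        // Two files for the same property on the same day must resolve
        // to the same case, regardless of which arrives first.
        String first = demo.getOrCreateCase("ORDER-42|2025-11-27");
        String second = demo.getOrCreateCase("ORDER-42|2025-11-27");
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

Note the design choice: the loser of the race never retries or errors out; it simply receives the existing case ID, which is exactly the "merge into the existing case" branch of your requirement.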