Tracing an activity and recovering from a failed Tracer session
I'm trying to understand how Pega Tracer implements a lock when tracing a particular activity. One of my developers has run into a trace error: when a Tracer session is not closed or terminated properly (because the session timed out or got stuck for some reason), Tracer will not open again on that rule. Even switching to another node doesn't allow the user to start Tracer on the rule in question. Additionally, terminating the requestor via System -> Operations -> Requestor Management does not clear the error. We've experimented with Activity rules to see this behavior. The error message is formatted as:
"Tracer session error: This rule <rule key> is being traced by operator <operator ID> from requestor <requestor ID> - Please restart tracer."
I was able to reproduce the error by opening two sessions on different nodes and trying to trace the same activity from both. The second trace attempt fails with the error message above. I assume this has something to do with the architecture of how Tracer collects trace data and sends it back to the user. If I navigate to System -> Operations -> Requestor Management and execute Terminate Tracer on the first requestor from the second requestor's session, I am then able to trace with the second session.
(Note: I did not replicate the exact scenario my developer was seeing, in which the Tracer session itself fails. That may affect whether Terminate Tracer can be selected, or whether it is offered as an option at all. I'm unclear what options they would see in that scenario.)
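To make the behavior I'm describing concrete, here is a toy model in Python. It is only my guess at the observable behavior, not Pega's actual implementation: the class name, method names, and the idea of a single in-memory registry keyed by rule are all assumptions, and the rule key, operator IDs, and requestor IDs are made up.

```python
class TraceRegistry:
    """Hypothetical model: at most one requestor may trace a given rule."""

    def __init__(self):
        # rule key -> (operator ID, requestor ID) currently tracing that rule
        self._owners = {}

    def owner_of(self, rule_key):
        """Return the (operator, requestor) pair tracing the rule, or None."""
        return self._owners.get(rule_key)

    def start_trace(self, rule_key, operator_id, requestor_id):
        owner = self._owners.get(rule_key)
        if owner is not None and owner != (operator_id, requestor_id):
            # Mirrors the format of the error message quoted above
            raise RuntimeError(
                f"Tracer session error: This rule {rule_key} is being traced "
                f"by operator {owner[0]} from requestor {owner[1]} "
                f"- Please restart tracer."
            )
        self._owners[rule_key] = (operator_id, requestor_id)

    def terminate_tracer(self, requestor_id):
        # Release every rule held by this requestor
        self._owners = {
            rule: owner
            for rule, owner in self._owners.items()
            if owner[1] != requestor_id
        }

    def terminate_requestor(self, requestor_id, cleanup=True):
        # cleanup=False models my suspicion: the requestor is gone,
        # but its trace entry is left behind
        if cleanup:
            self.terminate_tracer(requestor_id)


registry = TraceRegistry()
registry.start_trace("MY-ACTIVITY", "dev1", "REQ-1")
try:
    registry.start_trace("MY-ACTIVITY", "dev2", "REQ-2")
except RuntimeError as err:
    print(err)  # second session is rejected

registry.terminate_tracer("REQ-1")
registry.start_trace("MY-ACTIVITY", "dev2", "REQ-2")  # now succeeds
```

In this model, calling terminate_requestor with cleanup=False leaves the entry in place with no live requestor left to run Terminate Tracer against, which is the situation I suspect my developer ended up in.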
I can't see a lock in pr_sys_locks, so where is the lock being held, if at all: in the session, in memory, or in the database?
My questions
Is there any information to confirm the tracer architecture and why you can only trace an activity from one requestor at a time?
Does it make a difference if you terminate the requestor FIRST (so that Terminate Tracer is no longer available as an option)? Does this leave you waiting for the system to release the trace lock on its own?
If terminating the trace AND terminating the requestor both fail to release the lock, is the only other option to restart the node?
Where is the lock being held and can it be viewed using any Pega or database viewing tools?