Question
Cognizant Technologies Solutions US
CA
Last activity: 25 Jul 2018 15:40 EDT
AES CPU Alerts
We recently migrated to the new PegaAES v7.3.1 and have observed a lot of critical CPU alerts, but when we check the same in other monitoring tools or directly on the server, CPU utilization is minimal. We even increased the critical threshold to 85, and it is still being breached.
Could someone please explain on what basis AES calculates CPU utilization? Is it based on the app server process's CPU consumption, or does it consider the overall CPU on the box?
Thanks
Sree
Accepted Solution
Pegasystems Inc.
US
Pega Platform does not actually have any alerts for CPU utilization. In the platform, the management daemon runs every two minutes and queries the Java runtime environment for (a) the total CPU usage of the process [in seconds] and (b) the number of CPUs available. The management daemon tracks the process CPU usage and the time of the last check. The daemon calculates CPU usage by
(current process CPU usage in seconds - previous process CPU usage in seconds) / (time elapsed since last measurement) / (number of processors) * 100%
Example - let's say at 12:00:00 the daemon observed total CPU usage of 512.25 seconds and 4 CPUs available. At 12:02:00 the daemon observed CPU usage of 632.25 seconds. The daemon would calculate
(632.25 cpu secs - 512.25 cpu secs) / 120 clock seconds / 4 cpu * 100% --> 25% CPU -- 120 cpu seconds used in 120 seconds on a box that should be able to provide 4 cpu seconds per clock second.
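The calculation described above can be sketched in a few lines of Python. This is only an illustration of the formula as stated, not the daemon's actual code; the function name `cpu_percent` and the sample readings are made up for the example.

```python
def cpu_percent(prev_cpu_s: float, curr_cpu_s: float,
                elapsed_s: float, n_cpus: int) -> float:
    """Average CPU utilization of one process between two samples."""
    if elapsed_s <= 0 or n_cpus <= 0:
        raise ValueError("elapsed_s and n_cpus must be positive")
    # CPU-seconds consumed, per clock second, per available CPU, as a percentage.
    return (curr_cpu_s - prev_cpu_s) / elapsed_s / n_cpus * 100.0


# Hypothetical readings: 120 CPU-seconds used over a 120-second interval
# on a box with 4 CPUs.
print(cpu_percent(100.0, 220.0, 120.0, 4))  # 25.0
```

Note that the input is the CPU time of the Java process only, so work done by other processes on the same box does not enter the figure.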
Management daemon sends a HLTH0001 message to AES / PDC, which then evaluates if that CPU usage is to be considered critical, warning or normal. In AES you can override those thresholds at node, system or global level.
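As a rough sketch of that evaluation step: the threshold values, the names, and the most-specific-wins order (node, then system, then global) below are assumptions for illustration only and are not taken from the AES decision table.

```python
from typing import Optional, Tuple

Thresholds = Tuple[float, float]  # (warning_pct, critical_pct)

# Hypothetical defaults; the real values live in the AES decision table.
GLOBAL_DEFAULT: Thresholds = (70.0, 85.0)


def effective_thresholds(node: Optional[Thresholds],
                         system: Optional[Thresholds],
                         global_: Thresholds = GLOBAL_DEFAULT) -> Thresholds:
    """Pick the most specific override that is set (assumed precedence)."""
    if node is not None:
        return node
    if system is not None:
        return system
    return global_


def classify(cpu_pct: float, thresholds: Thresholds) -> str:
    """Map a CPU percentage to critical, warning or normal."""
    warning, critical = thresholds
    if cpu_pct >= critical:
        return "critical"
    if cpu_pct >= warning:
        return "warning"
    return "normal"


print(classify(25.0, effective_thresholds(None, None)))          # normal
print(classify(90.0, effective_thresholds(None, (60.0, 80.0))))  # critical
```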
So - check the data. If the Java runtime and the operating system do not agree, see if your JVM and OS are properly hotfixed.
Swedbank AB
SE
Hello Sree!
Do you observe any alert in the PegaALERT logs at the same time?
You could learn more about this by correlating it with the logs.
Thanks,
Pawan
Cognizant Technologies Solutions US
CA
Hello Pawan,
I checked the PegaALERT.log file on the monitoring node and didn't see anything that correlates with high CPU utilization. Please understand that none of the monitoring tools on the servers show CPU spikes. I am looking for how AES calculates utilization.
Thanks
Pegasystems Inc.
PL
As suggested, please check the PegaRULES and PegaALERT logs for more details.
You can use Pega-LogViewer, a Java Swing based tool to view PegaRULES Log and Alert files. It will help you to find issues.
You can download Pega-LogViewer at: https://github.com/pegasystems/pega-logviewer/releases
Cognizant Technologies Solutions US
CA
Would it be possible to provide the log viewer from a different URL?
Pegasystems Inc.
PL
Hi,
Pega Global Customer Support has made public these three tools, which could help you diagnose the root cause of a problem in your Pega application. These are the same tools Pega engineers use to help you troubleshoot problems with applications developed using Pega Platform.
- Pega-LogViewer is a Java Swing based tool to view PegaRULES Log and Alert files in a table format. Find out more - https://github.com/pegasystems/pega-logviewer
- Pega-TracerViewer is a Java Swing based tool to view Pega trace xml files. Find out more - https://github.com/pegasystems/pega-tracerviewer
- Pega-AlertAnalyzer is a web-based tool to analyze PegaRULES Alert log files. Find out more - https://github.com/pegasystems/pega-alertanalyzer
Cognizant Technologies Solutions US
CA
Hello Andy,
So the CPU status won't be accurate all the time, and we can skip configuring alerts.
I compared against the server as well as the application monitoring tools. Everywhere it says CPU utilization never went above 2.5% of total CPU (in the last 24 hours), but AES is still giving a Warning alert.
Thanks
Pegasystems Inc.
US
CPU utilization as reported to AES is accurate to what the Java runtime reports.
What does the data in AES show?
If your monitoring tools report maximum CPU utilization of 2.5%, I question the monitoring tools. I'm not saying that the system is running high CPU, but it's equally unlikely that it runs so low all the time.
What is the hardware / virtual hardware platform? What operating system / virtual container? How many "CPU's" does the operating system report? How are your monitoring tools gathering data?
Do you have operating system access? Can you use ps or other commands to check the start-to-now cpu utilization (in seconds) of the java process and compare it to clock elapsed?
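One way to do that comparison, sketched in Python: on most Unix-like systems `ps -o time= -o etime= -p <pid>` prints the accumulated CPU time and the elapsed time of a process in `[[DD-]HH:]MM:SS` form. The helper names and the sample values below are hypothetical; substitute the output from your own Java process and the CPU count your operating system reports.

```python
def ps_time_to_seconds(text: str) -> int:
    """Convert a ps time field such as '1-02:03:04' or '05:30' to seconds."""
    text = text.strip()
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:          # pad missing hours (and minutes) with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def average_cpu_percent(cpu_time: str, elapsed: str, n_cpus: int) -> float:
    """Start-to-now CPU utilization of a process, as a percentage."""
    cpu_s = ps_time_to_seconds(cpu_time)
    elapsed_s = ps_time_to_seconds(elapsed)
    if elapsed_s <= 0 or n_cpus <= 0:
        raise ValueError("elapsed time and n_cpus must be positive")
    return cpu_s / elapsed_s / n_cpus * 100.0


# Hypothetical sample: 36 minutes of CPU time over one day of uptime, 1 CPU.
print(average_cpu_percent("00:36:00", "1-00:00:00", 1))
```

This gives a lifetime average rather than a two-minute window, so it will not match any single AES sample, but a value well above what the other monitoring tools report would suggest looking at how those tools gather their data.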
It's not urgent that the data sent to AES be fully accurate - or that you even let AES monitor CPU utilization [you can edit the decision table from the UI to always return "normal"] but the major discrepancy between what java and your other tools report is concerning.