AES reports nodes as unavailable

Question

DavidM4833

Member since 2017

7 posts

Raytheon Company

Posted: 28 Feb 2018 8:24 EST
Last activity: 30 Nov 2018 14:38 EST

Closed

Solved

Every Monday at 12 AM, our AES instance starts sending out alerts that certain nodes are unavailable.

The alerts come and go for various nodes throughout the day. The nodes are actually available, but AES seems to think they aren't.

It magically stops at midnight at the end of that day and is fine for the rest of the week.

It starts again the following Monday. We have an active case open with Pega Support, but I am wondering whether anyone else has experienced the same thing.

This is for Dev/QA. Our Prod AES works fine.

***Edited by Moderator Marissa to update platform capability tags****

Pega Autonomic Event Services

Data Integration

Support Case Parallel

Posted: 1 Mar 2018 3:17 EST

YSudhakarReddy

JPMorgan Chase & Company

replied to DavidM4833

Hi,

I haven't observed this issue myself. It looks like the connection between the AES server and the nodes is being lost; please check the network logs.

Posted: 1 Mar 2018 6:04 EST

Abhinav7

PEGA

replied to DavidM4833

Hi,

Which version of AES are you using, and which PRPC version is it deployed on? Does this issue happen only for particular nodes, or for random nodes?

There might be an issue with the SOAP service. Is there an agent that runs weekly on Monday around 12 AM? Please share the logs; there should be an error message in them.

Thanks,

Abhinav

Posted: 1 Mar 2018 13:08 EST

WERDA

PEGA

replied to DavidM4833

AES expects monitored nodes to send an HLTH0001 message every two minutes. The HLTH0001 messages are stored as PegaAES-Data-NodeStats objects and used to update the PegaAES-Data-NodeHealth object. An agent in turn checks node health and other data to assess whether a node is healthy. If the 'last health' message was received more than five minutes ago, AES assumes that the node has failed and triggers the health change notification.
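To make the timing concrete, here is a minimal sketch of that staleness rule in Python. It is not AES code: the function and constant names are hypothetical, and only the two-minute and five-minute values come from the description above.

```python
from datetime import datetime, timedelta

# Values taken from the description above; the names are hypothetical.
HEALTH_INTERVAL = timedelta(minutes=2)   # how often a node sends HLTH0001
STALE_THRESHOLD = timedelta(minutes=5)   # past this, the node is treated as failed


def is_node_unavailable(last_health: datetime, now: datetime) -> bool:
    """Return True when the last health message is older than the threshold."""
    return now - last_health > STALE_THRESHOLD


def missed_intervals(last_health: datetime, now: datetime) -> int:
    """Whole two-minute reporting intervals elapsed since the last message."""
    return max(0, (now - last_health) // HEALTH_INTERVAL)
```

Under these assumptions, a single lost message is not enough to flag a node. At least two consecutive health messages have to go missing before the five-minute threshold is crossed.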

So there are three main failure scenarios to consider:

a- the monitored node is not sending health status messages to AES

b- the monitored node sends health status messages to AES, but the messages never arrive at the AES server

c- the monitored node sends health status messages, but the AES SOAP service fails and the messages do not get persisted

Working backwards:

c- Check the AES logs for the times in question. Do you see exceptions? Do you see an odd pattern of PEGA0011 (slow service) alerts?

a,b- On the monitored node(s) having issues, enable debug logging for the classes httpclient.wire.header and httpclient.wire.message.

(1) Look for HLTH0001 messages being POSTed to AES. Are health status messages being sent?
(2) Look for the HTTP response to the POST. Did the AES server acknowledge the messages? Is the status HTTP 200 (processed) or an error status?
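
Once the debug output is captured, checks (1) and (2) can be roughly automated. The sketch below is in Python and makes assumptions that this thread does not confirm: that outgoing request lines are logged as `>> "POST ... HTTP/1.1` and incoming status lines as `<< "HTTP/1.1 200 ...`, and that the text HLTH0001 appears literally in the logged message body. The function name is hypothetical; adjust the patterns to whatever your log actually shows.

```python
import re

# Assumed wire-log shapes (not confirmed by the thread):
#   >> "POST /some/path HTTP/1.1[\r][\n]"   outgoing request line
#   << "HTTP/1.1 200 OK[\r][\n]"            incoming status line
REQUEST_RE = re.compile(r'>> "POST \S+ HTTP/\d\.\d')
STATUS_RE = re.compile(r'<< "HTTP/\d\.\d (\d{3})')


def summarize_wire_log(lines):
    """Count POST requests, HLTH0001 mentions, and response status codes."""
    summary = {"posts": 0, "hlth0001": 0, "statuses": {}}
    for line in lines:
        if REQUEST_RE.search(line):
            summary["posts"] += 1
        if "HLTH0001" in line:
            summary["hlth0001"] += 1
        match = STATUS_RE.search(line)
        if match:
            code = match.group(1)
            summary["statuses"][code] = summary["statuses"].get(code, 0) + 1
    return summary
```

Pass it an open log file or any list of lines. Zero HLTH0001 hits points at scenario (a). POSTs without matching status lines, or with non-200 statuses, point at (b) or (c).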

Which Pega version is being monitored? Is your AES integration implemented through a customized prlogging.xml file or via the 'dynamic appender' landing page?

Accepted Solution

Posted: 3 Apr 2018 6:44 EDT
Updated: 3 Apr 2018 6:45 EDT

DavidM4833

Raytheon Company

replied to WERDA

It turned out that when we changed the connector on our AES Tomcat instance from HTTP to AJP, the problem went away.

We had previously had issues with other instances when using the HTTP connector, but had neglected to change the connector on the AES instance.

Thanks, everyone, for the replies.

Posted: 6 Apr 2018 12:48 EDT

WERDA

PEGA

replied to DavidM4833

Thanks for the update. Pretty odd. It suggests that the HTTP connectors hang inside Tomcat, preventing AES from getting the messages.
