Question
Virtusa Corp.
LK
Last activity: 15 Jun 2016 9:00 EDT
Getting a lot of alerts for the FTSIncrementalIndexer agent.
Hi, I am getting a lot of alerts for the FTSIncrementalIndexer agent. I have also observed that this agent is enabled on each and every node in our multi-node environment.
Question 1: Is there any reason why we get so many alerts for this agent?
Question 2: Do we need to run this agent on every node, or is it OK to run it only on the node we have dedicated to agents?
Question 3: Is it necessary to run the "SystemIndexer" agent on every node in a multi-node environment?
Pegasystems Inc.
IN
Hi Raja,
Basically, FTSIncrementalIndexer can run on all nodes. It takes care of keeping the Rules, Data, and Work indexes in sync across the nodes.
You can turn off the SystemWorkIndexer and SystemIndexer agents on all nodes. To understand more about the alerts you have received, please share the alert log file.
Regards,
Ravi
Virtusa Corp.
LK
Thanks Ravi
I am not able to attach the alert file here. How do I do that?
To attach, click 'use advanced editor' in your reply - it will show an attach file option
Pegasystems Inc.
IN
Hi Raja,
Please attach the alert log file so that I can analyse it and share my observations.
Regards,
Ravi
Virtusa Corp.
LK
Please find the alert log file attached.
There is a bug in the ftsindexer for 717 and 718 - it does not pass all required parameters to the stored procedure used to 'get and lock the next available item in the queue', resulting in a table scan. If the system is busy, the table scan will of course take time, and the ftsindexer will "fall behind".
If you are on 717/718, open a support ticket for the "fts indexer hotfix".
If you are on 717, please review the indexes on the fts indexer queue table and the source code for the stored procedure sppr_reservequeueitem_b -- the "order by" clause in particular. I believe that 717 shipped with a good 'sproc' and indexes that will work once the ftsindexer bug is patched.
As noted, the SystemIndexer and SystemWorkIndexer agents are legacy and should be disabled.
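If it helps, here is a minimal sketch for reviewing the queue table indexes and the stored procedure source, assuming an Oracle database and the PEGADATA_CATALYST schema that appears later in this thread (adjust the owner to your installation):
-- list the indexes and their column order on the FTS indexer queue table
SELECT index_name, column_name, column_position
FROM all_ind_columns
WHERE table_owner = 'PEGADATA_CATALYST'
AND table_name = 'PR_SYS_QUEUE_FTSINDEXER'
ORDER BY index_name, column_position;
-- dump the stored procedure source to inspect the ORDER BY clause
SELECT name, line, text
FROM all_source
WHERE owner = 'PEGADATA_CATALYST'
AND name LIKE 'SPPR%RESERVEQUEUEITEM_B'
ORDER BY name, line;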
Virtusa Corp.
LK
Thanks Andrew. I will do that.
Pegasystems
IN
Note that SystemIndexer and SystemWorkIndexer will remain in action on an upgraded system (as opposed to a fresh installation) until all indexes have been converted to Elasticsearch. The user must explicitly re-index from the Search landing page to create the Elasticsearch indices and complete this conversion. Until then, search is serviced by the older Apache Lucene library (the search library used up to 7.1.6), and SystemIndexer and SystemWorkIndexer are needed to keep those indices up to date.
Per my post, open an SR and get the hfix.
If you look at the alerts, the call to the queue management stored procedure to fetch the 'next item in queue' is wrong.
The third parameter in the sproc call is supposed to be the agent name. In the alerts, the third parameter is null.
With the agent name missing, the sproc generates 'the wrong query', which does not match the index on the table, causing a table scan and sort to find the next item in the queue to process.
You have two choices to fix the performance issues:
a- get the hfix
b- change the indexes to accommodate the bug
DROP INDEX PEGADATA_CATALYST.PR_SYS_QUEFTS_INDEX2;
CREATE INDEX "PEGADATA_CATALYST"."PR_SYS_QUEFTS_INDEX2" ON "PEGADATA_CATALYST"."PR_SYS_QUEUE_FTSINDEXER" ("PYITEMSTATUS", "PXOBJCLASS", "PYMINIMUMDATETIMEFORPROCESSING", "PZINSKEY");
How "deep" is your ftsindexer queue (row count)?
How far "behind" is the agent (select min(pxCommitDateTime) from pegadata_catalyst.pr_sys_queue_ftsindexer)?
If the agent is more than a few weeks behind, I suggest that you truncate the queue and manually rebuild the indexes via the landing page, in addition to fixing the query via the hfix or index change.
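For the two checks above, a minimal sketch you could run directly against the database (assuming the PEGADATA_CATALYST schema used elsewhere in this post):
-- queue depth (row count)
select count(*) from PEGADATA_CATALYST.pr_sys_queue_ftsindexer;
-- how far behind the agent is (oldest queued commit time)
select min(pxCommitDateTime) from PEGADATA_CATALYST.pr_sys_queue_ftsindexer;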
For reference, here is the call from the alert log; note the null third parameter:
{call PEGADATA_CATALYST.sppr_sys_reservequeueitem_b ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) } inserts: <PEGADATA_CATALYST.pr_sys_queue_ftsindexer> <System-Queue-FTSIncrementalIndexer> <<null>> <Scheduled> <7cfe012df1f2e779931101aff3c1d680> <<null>> <2016-05-11 00:20:05.2> <true> <<null>> <<out>>
amazon.com
US
Is there a recommendation from Pega about whether FTS Indexing should be enabled in production or not?
Updated: 8 Jun 2016 22:19 EDT
FTS indexing should be enabled, but in general, work indexing should not be enabled without reviewing:
- application volume
- application requirements
- the fundamental nature of 'search' in this application's business. Are you searching by well-known common fields, like pyID, customer/policy code/member/subscriber, provider, card #, etc., or do you truly need to search every piece of text in every work type?
Keep in mind that the FTS indexer for versions before 7.1 ML9 needs a hotfix and possibly an index change to the FTS indexer queue table.
Virtusa Corp.
LK
Hi Andrew,
Thanks for all the support.
I installed the hotfix and verified it.
I no longer get any of the alerts I mentioned above.
However, nothing has improved with regard to performance.
When I stop FTSIncrementalIndexer, the system is fast and all transactions are under threshold.
When I start FTSIncrementalIndexer, the system is very slow and a lot of DB Time and DB Commit Time alerts occur. In addition, I observed a few Acquire DB Connection alerts as well.
Please help me find out exactly why FTSIncrementalIndexer slows down other queries. Is there some locking mechanism on the DB side, etc.?
In particular, DB commit is very slow.
If this is an Oracle system, please get an AWR report.
How many items are in the ftsindexer queue currently? How much space is the FTS indexer table using?
What indexes are defined on the FTS indexer table?
Do you have sufficient disk space and access to the actual FTS index tables for updates to be processed?
What alerts do you get with the FTS indexer enabled?
I have observed that Oracle may fail to reclaim space from the FTS indexer table when rows are deleted, and performance will degrade as Oracle scans a 'mostly empty' table. In such a situation it is best to truncate the FTS indexer table and rebuild the full-text indexes, and then regularly schedule a defragmentation command (alter table ... shrink space cascade) to ensure that the table stays healthy.
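A minimal maintenance sketch, assuming the PEGADATA_CATALYST schema used earlier in this thread and that the queued entries can be discarded (the full-text indexes are then rebuilt manually from the Search landing page):
-- discard the backlog; rebuild the full-text indexes from the Search landing page afterwards
TRUNCATE TABLE PEGADATA_CATALYST.PR_SYS_QUEUE_FTSINDEXER;
-- periodic defragmentation so Oracle does not scan a mostly empty table
ALTER TABLE PEGADATA_CATALYST.PR_SYS_QUEUE_FTSINDEXER ENABLE ROW MOVEMENT;
ALTER TABLE PEGADATA_CATALYST.PR_SYS_QUEUE_FTSINDEXER SHRINK SPACE CASCADE;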
Virtusa Corp.
LK
If this is an Oracle system please get an AWR report.
attached
How many items are in the ftsindexer queue currently? How much space is the FTS indexer table using?
Currently 152,959, but I truncated the table before executing the test.
what indexes are defined on the FTS indexer table?
attached
Do you have sufficient disk space and access to the actual FTS index tables for updates to be processed?
yes
What alerts do you get with FTS indexer enabled?
DB Time alerts
DB Commit Time alerts
Acquire DB Connection alerts
There is something completely messed up about this database.
A basic read using the primary key is taking 61,621.01 GETS PER EXEC (500 MB of in-memory data access). Primary key reads should take perhaps 6 GETS.
>> select pzInsKey from PEGADATA_CATALYST.pr_sys_queue_ftsindexer where pzInsKey = :1 and pyItemStatus = :2 for update nowait
The query to fetch the next item from the FTS indexer queue is taking 80,710.41 GETS per exec (646 MB of in-memory data access). With a proper index in place this should be 10-20 GETS.
>> select pzInskey from (Select pzInskey from PEGADATA_CATALYST.pr_sys_queue_ftsindexer where pyItemStatus = :1 and pxObjClass = :2 and pyMinimumDateTimeForProcessing <= :3 order by pyMinimumDateTimeForProcessing ASC) where rownum <= 10
The queue management stored procedure is consuming 87.92% of the CPU used by your Oracle instance, as it is basically doing an in-memory scan of the entire table on every call.
There are other big issues in this database -- the AWR report claims that it takes on average 5 seconds to update your work items, and 8 seconds to initially write a work item!
The DB has massive waits for 'latch: enqueue hash chains'. I have never seen enqueue hash chains show up as a contended resource before.
With regard to the FTS indexer:
a- Get a query plan for the two queries listed above. Confirm that Oracle is ignoring the indexes.
b- Update statistics for pr_sys_queue_ftsindexer and then run the query plan again. Did that make Oracle start to use the indexes? (A sketch of both steps follows at the end of this post.)
c- Find out 'how far behind' you are -- select min(pxcommitdatetime) from pr_sys_queue_ftsindexer.
d- Have you had users complaining about missing search results? If you are far behind and users have not noticed, perhaps the application really does not need work search.
I am not sure whether you should tune the indexes and wait for the agent to 'catch up', or truncate/rebuild/restart... Let's first figure out why Oracle is behaving this way.
With regard to the slow write performance and latch contention - engage a DBA immediately.
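For steps a and b, a minimal sketch, assuming Oracle, SQL*Plus (or similar) access, and the PEGADATA_CATALYST schema; the bind placeholders are left as-is from the query captured in the AWR report:
-- step a: capture the plan for the queue-fetch query and confirm whether the index is used
EXPLAIN PLAN FOR
select pzInskey from (Select pzInskey from PEGADATA_CATALYST.pr_sys_queue_ftsindexer where pyItemStatus = :1 and pxObjClass = :2 and pyMinimumDateTimeForProcessing <= :3 order by pyMinimumDateTimeForProcessing ASC) where rownum <= 10;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
-- step b: refresh optimizer statistics, then repeat the EXPLAIN PLAN above
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'PEGADATA_CATALYST', tabname => 'PR_SYS_QUEUE_FTSINDEXER', cascade => TRUE);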
Virtusa Corp.
LK
a- get a query plan for the two queries listed above. Confirm that Oracle is ignoring the indexes
Attached
b- Update statistics for pr_sys_queue_ftsindexer and then run the query plan again. Did that make Oracle start to use the indexes?
No
c- Find out 'how far behind' you are -- select min(pxcommitdatetime) from pr_sys_queue_ftsindexer.
Attached
d- Have you had users complaining about missing search results? If you are far behind and users have not noticed, perhaps the application really does not need work search.
No. This is a performance test environment, and some execution scripts have failed because of this.
You are missing the FTS indexer agent queue hotfix. The FTSIndexer agent is not passing the agent name into the 'reservequeueitem' stored procedure, so the procedure is generating a query that finds the next item to process using pxObjClass rather than the agent name.
Per my earlier posts, you have two options:
a- get the hotfix for the ftsindexer agent
b- change your indexes to accommodate the bug
* Drop index pegadata.PR_SYS_QUEUE_FTSINDEXER_INDEX2;
* Create index pegadata.PR_SYS_QUEUE_FTSINDEXER_INDEX2 on pegadata.pr_sys_queue_ftsindexer (pyitemstatus, pxobjclass, pyminimumdatetimeforprocessing, pzinskey);
I was mistaken when I stated that the fts indexer agent was fixed in 719. I've discovered that the ftsindexer agent bug persists through 7.2 and it is not fixed until 7.21
Virtusa Corp.
LK
Thanks for the information Andrew.
I already installed the hotfix - HFIX-25805.
The system is still slow when we start FTSIncrementalIndexer.
Given that the wrong query is still running, installing HFIX-25805 did not fix the problem.
Either research and remediate the hotfix problems, or change the DB indexes to accommodate the bug.
Pegasystems
IN
Note that HFIX-25805 will not fix the existing entries in the table. Any new entries added to the table after the HFIX has been installed will be added with the agent name, so the currently existing index will be picked up for them. You might want to update all the existing entries with the agent name so that they are consistent with what the HFIX provides.
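If you go that route, a minimal sketch of such an update follows; the column name pyAgentName and the agent value used here are assumptions not confirmed in this thread, so verify them against your actual queue table schema and the hotfix notes before running anything:
-- hypothetical sketch: the column name and agent value are assumed, verify against your schema first
update PEGADATA_CATALYST.pr_sys_queue_ftsindexer
set pyAgentName = 'FTSIncrementalIndexer'
where pyAgentName is null;
commit;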