Data Page "Aggregate Sources" option: In Series, In Parallel & Performance considerations [LSA Data Excellence]
Are the Systems of Record referenced in an "Aggregate Sources" Data Page contacted in series or in parallel? What are the performance considerations?
As of Pega 8.4, the data sources listed under the "Aggregate Sources" configuration of a Data Page are contacted in series, in the order in which they appear on the Data Page rule form. To aggregate data from data sources in parallel, use a Data Source of type Activity that orchestrates the parallel invocation of each data source, ideally by calling Load-DataPage for each one (assuming each distinct data source has its own Data Page) and then Connect-Wait. Data Pages loaded this way are referred to as Asynchronous Data Pages.
Functionally, if the results from Data Source A do not serve as inputs to Data Source B, then Data Page DP1 can reasonably load Data Source A and B in parallel. Rules invoked after the relevant Connect-Wait can then collate the results from each data source's Data Page into the results of DP1.
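The load-in-parallel-then-collate pattern can be sketched outside Pega as follows. This is an analogy in Python, not Pega rule syntax: submitting a task stands in for Load-DataPage, waiting on the tasks stands in for Connect-Wait, and the names load_source_a, load_source_b and load_dp1 are hypothetical placeholders for the two independent data sources and the aggregating Data Page.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def load_source_a():
    # Hypothetical stand-in for the Data Page that fronts Data Source A.
    return {"customer": "C-1"}


def load_source_b():
    # Hypothetical stand-in for the Data Page that fronts Data Source B.
    # It takes no input from source A, so the two loads are independent.
    return {"orders": ["O-1", "O-2"]}


def load_dp1():
    # Start both loads without blocking (roughly what Load-DataPage does).
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(load_source_a), pool.submit(load_source_b)]
        # Block until every load has finished (roughly what Connect-Wait does).
        wait(futures)
        # Collate the individual results into the aggregate result.
        result = {}
        for future in futures:
            result.update(future.result())
    return result


dp1 = load_dp1()
print(dp1)
```

If source B needed a value returned by source A, the second submit would have to wait for the first result, and the loads would be serial again.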
The performance considerations are informed by your application's Non-Functional Requirements and by the extent to which users (or time-sensitive background processing) will be blocked while the data sources are contacted in series.
Asynchronous Data Pages are loaded using threads from a dedicated thread pool. As a result, there is a physical limit to how much parallel processing can be done on a node. You may never reach this limit, or you may not notice a material impact when you do. You may also be able to leverage the caching capabilities of Data Pages to reduce the number of threads consumed when loading asynchronously from data sources.
This topic came up for discussion in the LSA Data Excellence (Pega 8.4) webinar conducted in July 2020. The webinar and the full set of discussions arising from it are available at LSA Data Excellence: Webinar, Questions & Answers.