Question
Areteans Technology Solutions
IN
Last activity: 23 Jul 2018 14:21 EDT
Does the number of assignments created during data flow execution depend on the number of partitions in the source data, or on the thread count and batch scalability factor?
I am a bit confused about how data flow processing occurs, specifically how many assignments are created during execution.
The data flow help says the following:
"Specify the number of the Pega 7 Platform threads that are assigned to process running the data flows and the batch scalability factor to use idle threads for running the data flows.
For example, when the source of a data flow is divided into five partitions, the data flow run is divided into five assignments that can be processed simultaneously on separate threads if there are enough threads.
The number of available threads is calculated by multiplying the thread count by the number of nodes. With two nodes and five threads in the system, the data flow run uses five threads and five threads remain idle. After you set the batch scalability factor to two, all 10 threads are used to process five assignments.
- Enter the number of threads.
  Note: The number of threads for running data flows is the same across all decision data nodes that are configured for the Data Flow service.
- Enter the batch scalability factor."
The italicized lines in the help text above (the five-partition example) suggest that the number of assignments depends on the number of partitions.
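To check my reading of the help example, here is its arithmetic as a small Python sketch. The function and parameter names are my own (they are not Pega APIs), and the sketch only encodes what the quoted text says before the batch scalability factor is changed, not how the platform actually schedules work.

```python
def help_text_reading(nodes, thread_count, partitions):
    """Arithmetic from the quoted help text (hypothetical helper, not a Pega API)."""
    # "The number of available threads is calculated by multiplying
    # the thread count by the number of nodes."
    available_threads = nodes * thread_count
    # Assumption taken from the help example: one assignment per source partition.
    assignments = partitions
    # Each assignment occupies at most one thread.
    busy_threads = min(assignments, available_threads)
    idle_threads = available_threads - busy_threads
    return {
        "available_threads": available_threads,
        "assignments": assignments,
        "busy_threads": busy_threads,
        "idle_threads": idle_threads,
    }


# Numbers from the help example: two nodes, five threads, five partitions.
result = help_text_reading(nodes=2, thread_count=5, partitions=5)
print(result)
```

With these inputs the sketch gives 10 available threads, 5 assignments, 5 busy threads and 5 idle threads, which matches the "uses five threads and five threads remain idle" sentence in the help.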
However, the attached PNG file showing the data flow settings states that Number of assignments = Number of nodes * Thread count * Batch scalability factor.
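To make the difference concrete, here is the settings-screen formula applied to the numbers from the help example. This is only a sketch: the function name is my own (not a Pega API), and it simply restates the formula from the screenshot without claiming that this is what the platform does.

```python
def settings_screen_reading(nodes, thread_count, batch_scalability_factor):
    """Formula shown on the settings screen (hypothetical helper, not a Pega API)."""
    return nodes * thread_count * batch_scalability_factor


# Numbers from the help example: two nodes, five threads,
# batch scalability factor of two, source divided into five partitions.
partitions = 5
assignments = settings_screen_reading(
    nodes=2, thread_count=5, batch_scalability_factor=2
)
print(f"settings formula: {assignments} assignments, partition count: {partitions}")
```

For the same setup, the settings formula gives 20 assignments, while the help text still speaks of five assignments spread over 10 threads.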
So my question is: which one is correct, how does data flow parallel processing actually happen, and what are the roles of partitions, node count, thread count, and batch scalability factor?
Regards
Abhi