Question
AGL Australia
AU
Last activity: 4 May 2023 21:42 EDT
Data flow partitions processing sequence
I have a campaign that processes 480 partitions across 60 threads. Usually the system assigns partitions sequentially, starting at partition 1 and working up to the last partition, 480. Sometimes, however, it starts at a seemingly random partition such as 100 or 200, processes up to 480, and then wraps around to process from 1.
How does the system determine this allocation? And is there a way to ensure that it always picks partition 1 first and then proceeds in order to 480?
The reason I want to control this is to improve performance. We have modified our logic so that the quickest few partitions are always at the end, which increases parallel processing: once no partitions are left, threads that finish their partitions sit unused while the last few threads keep running for a long time, so ending on short partitions keeps that idle tail short. Whenever processing starts at 1 and ends at 480 I can clearly see the benefit, but not otherwise, hence this question.
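The effect can be illustrated with a minimal sketch. It is not how the platform actually assigns partitions; it only assumes that each free thread takes the next partition in the given order. The function name `makespan`, the `start` offset, and the durations are all hypothetical and made up for the example.

```python
import heapq

def makespan(durations, threads, start=0):
    """Total run time when each free thread takes the next partition in order.

    durations: hypothetical processing time per partition, in partition order
    start: index of the partition processed first; the order wraps around
    """
    order = durations[start:] + durations[:start]
    free_at = [0] * threads          # time at which each thread becomes free
    heapq.heapify(free_at)
    for d in order:
        t = heapq.heappop(free_at)   # earliest available thread
        heapq.heappush(free_at, t + d)
    return max(free_at)

# Small case: slowest partitions first, quickest last, 2 threads.
small = [8, 4, 2, 2]
print(makespan(small, threads=2))           # starts at the first partition
print(makespan(small, threads=2, start=2))  # starts in the middle, wraps around

# Same shape as the campaign: 480 partitions, 60 threads, made-up durations.
campaign = list(range(480, 0, -1))
print(makespan(campaign, threads=60))
print(makespan(campaign, threads=60, start=200))
```

In the small case the run that starts at the first partition finishes in 8 time units, while the run that starts in the middle finishes in 10, because the longest partition is picked up late and one thread is left idle at the end.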