Last activity: 30 Jun 2020 17:11 EDT
Repo File DataSet: How to know which partition key went into which file when creating multiple files using a wildcard
We have a requirement to create one file per partition key from a source table. We have figured out how to do that:
1) Define a dataset of type Database Table and provide the PartitionKey column.
2) Use a Repo File dataset as the destination, with the file path set to "TestFile_*.csv.zip".
3) Set pyNumberOfRequestors (on the RunOptions page of the batch dataflow instance) equal to the number of partitions/files we want.
The above setup produces as many files as there are unique partition keys.
So if we have 10 unique partition keys, the generated files will be:
TestFile_0001.csv.zip, . . .
TestFile_0009.csv.zip
Now the question is: which partition key went to which file? How can we find the mapping between a partition key and a specific file name? So in the above example, with 10 partition keys, how can I find out, for example, which of those 10 files contains 'Key1'?
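Independent of any platform feature, one possible workaround is to scan the generated files after the run and rebuild the mapping from their contents. The following is only a minimal sketch under several assumptions that are not confirmed by the setup above: each .zip holds CSV data with a header row, the partition key is exported as a column (hypothetically named PartitionKey here), and the files are reachable on a local path. The function name map_keys_to_files is also hypothetical.

```python
import csv
import io
import zipfile
from pathlib import Path


def map_keys_to_files(directory, key_column="PartitionKey",
                      pattern="TestFile_*.csv.zip"):
    """Return {partition key: sorted list of zip file names containing it}.

    Assumptions (not confirmed by the source): every archive member is a
    UTF-8 CSV with a header row, and `key_column` is one of its columns.
    """
    mapping = {}
    for zip_path in sorted(Path(directory).glob(pattern)):
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue  # skip directory entries inside the archive
                with archive.open(info) as raw:
                    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                    for row in csv.DictReader(text):
                        key = row.get(key_column)
                        if key is not None:
                            # Record every file in which this key appears.
                            mapping.setdefault(key, set()).add(zip_path.name)
    return {key: sorted(files) for key, files in mapping.items()}
```

Calling map_keys_to_files("/path/to/output").get("Key1") would then list the file names in which 'Key1' appears. If the partition key is not part of the exported columns, this approach does not apply.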