In our CDH (Inbound) implementation, customer data is currently ingested into the Spine table (XCAR) by a batch process that runs every 90 minutes. A new data source is now planned, and it will be ingested through a real-time data flow using Kafka.
The destination data set type is Database Table (Spine), and the selected save option is "Insert new and overwrite existing records". This save option appears to have a limitation: it overwrites the entire existing record and does not keep any of the previously stored attribute values. Because the batch and real-time loads carry different attributes, we want to make sure that the attributes written by one ingestion method (batch or real-time) are preserved when the other method updates the same record.
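To make the concern concrete, here is a minimal Python sketch of the two behaviours. It is not Pega code and does not use any Pega API. The attribute names (`Segment`, `Tenure`, `LastChannel`), the customer key `C1`, and both helper functions are hypothetical, chosen only to contrast a whole-record overwrite (what we observe today) with an attribute-level merge (what we are hoping to achieve).

```python
from typing import Any, Dict

Record = Dict[str, Any]
Table = Dict[str, Record]  # in-memory stand-in for the Spine table, keyed by customer ID


def save_overwrite(table: Table, key: str, incoming: Record) -> None:
    """Replace the whole stored record with the incoming one (observed behaviour)."""
    table[key] = dict(incoming)


def save_merge(table: Table, key: str, incoming: Record) -> None:
    """Update only the attributes present in the incoming record; keep the rest (desired behaviour)."""
    existing = table.get(key, {})
    table[key] = {**existing, **incoming}


# Hypothetical payloads: the batch and real-time feeds carry different attributes.
batch_row = {"CustomerID": "C1", "Segment": "Gold", "Tenure": 48}
realtime_row = {"CustomerID": "C1", "LastChannel": "Web"}

overwritten: Table = {}
save_overwrite(overwritten, "C1", batch_row)
save_overwrite(overwritten, "C1", realtime_row)  # batch attributes are lost

merged: Table = {}
save_merge(merged, "C1", batch_row)
save_merge(merged, "C1", realtime_row)  # batch attributes survive

print(overwritten["C1"])  # {'CustomerID': 'C1', 'LastChannel': 'Web'}
print(merged["C1"])       # {'CustomerID': 'C1', 'Segment': 'Gold', 'Tenure': 48, 'LastChannel': 'Web'}
```

In the merge case, an attribute that appears in both feeds takes the value from whichever load ran last, so the order of the batch and real-time writes would still matter for any shared attributes.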
We are exploring the option of using an Activity as the destination instead of a Data Set. However, we do not know what the performance implications would be.
If anyone has come across a similar situation or knows of ways to handle this optimally, please share your thoughts. Thank you very much in advance.