Hi, in Pega Marketing we create data flows that contain Decision Strategies. In the decision strategy properties we can specify how long the data is kept in the D-Nodes (Cassandra). We have three elements so far: one set to 30 minutes, one to 2 days, and one to 5 days. After that they are no longer relevant.
What removes these from the Cassandra DB when they expire?
Do they get removed, or do they just stay there but, as they're older, the logic won't use them?
How can we clean up data in Cassandra and/or manage it?
I'd be interested to hear if anyone has looked into this area.
>> What removes these from the Cassandra DB when they expire?
>> Do they get removed, or do they just stay there but, as they're older, the logic won't use them?
Expired data is not removed immediately. It stays on disk and is gradually removed by the compaction process, which rewrites the data files and leaves expired entries out. Even while the data is still on disk, it won't be used by the logic, because it is marked as expired.
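To make the difference between "expired" and "removed from disk" concrete, here is a minimal sketch in Python. It is a toy model only, not Cassandra's or Pega's actual implementation: the class `ToyTtlStore` and its methods are hypothetical names made up for illustration, and it ignores details of real Cassandra such as tombstones and the grace period before purging.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: str
    expires_at: float  # absolute time (seconds) at which the cell expires


class ToyTtlStore:
    """Hypothetical toy store: reads skip expired cells, compaction drops them."""

    def __init__(self):
        self._cells = {}

    def write(self, key, value, ttl_seconds, now):
        # Each write records its own expiry time, like a per-record TTL.
        self._cells[key] = Cell(value, now + ttl_seconds)

    def read(self, key, now):
        # An expired cell is invisible to readers even though it is still stored.
        cell = self._cells.get(key)
        if cell is None or now >= cell.expires_at:
            return None
        return cell.value

    def cells_on_disk(self):
        # Counts everything still stored, expired or not.
        return len(self._cells)

    def compact(self, now):
        # Rewrite the storage, keeping only cells that have not expired.
        self._cells = {k: c for k, c in self._cells.items() if now < c.expires_at}


store = ToyTtlStore()
store.write("a", "short-lived", ttl_seconds=30 * 60, now=0)         # 30 minutes
store.write("b", "long-lived", ttl_seconds=5 * 24 * 3600, now=0)    # 5 days

one_hour = 3600
read_a = store.read("a", now=one_hour)        # expired: not returned
read_b = store.read("b", now=one_hour)        # still live: returned
before_compaction = store.cells_on_disk()     # both cells still stored

store.compact(now=one_hour)
after_compaction = store.cells_on_disk()      # only the live cell remains
```

After one hour the 30-minute record can no longer be read, but it still occupies space until `compact` runs, which is the behaviour described above.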
>> How can we clean up data in Cassandra and/or manage it?
If you want to make sure the old data is removed from disk, you can run the "compact" operation on each node from the D-Node landing page.