Last activity: 30 Dec 2021 11:59 EST
CDH Community Event: Adaptive Modeling Lessons from the Field
The predictions functionality in Pega Customer Decision Hub makes it easier than ever to drop adaptive models into Next-Best-Action decision logic. Some work is still needed, however, to help clients leverage the best of what CDH has to offer while providing confidence in the results.
In this session, Cheri Gaudet, Lead Consultant, CXForward, shares lessons from the field on how to guide customers through an adaptive modeling project. You'll also learn how you can add value to your project by using OOTB reports.
Watch a replay
Note that Q&A from the session has been posted as replies below. Please continue the discussion there!
View the presentation slide deck (attached)
@shiss Is the negative response captured automatically during the response timeout written to Interaction History as well, or only to the adaptive model?
@shiss From 8.6 onwards, I see the starting propensity calculation formula is not available. Are we following a different approach for new propositions, where it takes some time to learn?
@shiss Is it always required to have a negative outcome recorded when multiple propositions are recommended and only one is selected?
@sundm It depends on the use case. But usually when you present multiple offers and only one gets clicked on, negatives are captured for the rest.
@shiss How does the behaviour change when negative responses are not recorded for the propositions that are not selected?
@shiss Assume an inbound situation: if a customer sees two offers and clicks on only one, the other was not of interest, so a negative gets captured. Otherwise, how could your model learn negative behavior?
@shiss If conversion models' responses can take a long time, will the model not automatically timeout on response before it is actually received?
@shiss There will be a negative response if no positive comes back. But usually, conversion models require a longer time compared to click models just due to the nature of the procedure.
@shiss Is there an end to end modeling exercise in Pega Academy anywhere? I found bits and pieces, and I am looking for a complete CDH use case
@shiss In 8.5, and especially 8.6, the UI settings can easily be used to set up a conversion model. Some prior data science discussion can help identify whether conversion modeling is the appropriate solution, along with its benefits and drawbacks. But the technical implementation is made much easier in 8.6.
@shiss Are there any performance benchmarks that can give insight on processing speed and throughput of adaptive models on Pega cloud? How do you measure it on Private cloud or On prem servers?
@shiss Not sure about that; it could involve different aspects. I would imagine what you are interested in is the strategy run time, which happens on the decision node. Another aspect is the model node, where models are created.
@shiss Usually we have up to several hundred potential predictors. It is always recommended to use predictors of various sources (demographic data, behavioral data, contextual data). That means we COULD have up to several hundred, but that does not mean we SHOULD have that many. It is better to have a good selection of predictors.
@shiss Is it a good idea to apply data transformations (feature engineering) over predictors data before they are fed into the model? In other words, will the ADM automatically handle that data transformation implicitly?
@shiss Some data transformations could be done in the strategy. Type conversion, duration extraction (from date time) and other instance-level transformation could potentially be done on strategy. More complex transformation (Imputing, etc.) should be done outside of the strategy. Some basic imputing could also be done in strategy too (like simply replacing a NULL value with 0 or things of that nature that do not require very sophisticated statistical calculations).
@sundm Pega ADM treats date/time fields as "duration" automatically (in recent releases). You can set the "treatment" for those predictors in the predictor section of the ADM rule. Missing value imputation is not usually done nor needed as missing values are treated separately anyway, and the ADM model execution does not impose any restrictions on the data it can process. The best example of feature engineering for ADM models is in pre-processing lists of items - e.g. providing the mean value, the number of items and such "flattening" operations. This should be done outside of the strategy. Going forward, CPD will be the place to define such aggregates.
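The "flattening" operation described above can be sketched as follows. This is a hypothetical illustration (the field names are invented): a variable-length list of items, such as transaction amounts, is reduced to fixed scalar aggregates that can each be fed to a model as an individual predictor.

```python
# Hypothetical illustration of "flattening" a list of items into scalar
# aggregates usable as predictors. Field names are invented for the example.

def flatten_transactions(amounts):
    """Turn a variable-length list of transaction amounts into fixed features."""
    if not amounts:
        # Leave the aggregates missing; missing values are treated separately
        # by the model, so no imputation is attempted here.
        return {"txn_count": 0, "txn_mean": None, "txn_max": None}
    return {
        "txn_count": len(amounts),
        "txn_mean": sum(amounts) / len(amounts),
        "txn_max": max(amounts),
    }

print(flatten_transactions([120.0, 80.0, 100.0]))
# {'txn_count': 3, 'txn_mean': 100.0, 'txn_max': 120.0}
```

Per the answer above, this kind of aggregation belongs outside the strategy, in the data pipeline that prepares the customer record.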
@shiss Does Pega support ensemble Machine learning Algorithm like hard voting and Soft voting of multiple Algorithms?
@shiss A new generation of adaptive models will be available soon. These models are based on Gradient Boosting algorithms (a type of ensemble methods). The new algorithm was released in 8.6 but in lab mode. A newer version will be available soon.
@shiss If impressions are recorded as negative responses, and a no-click is recorded as another negative response, does that not mean responses are being recorded incorrectly?
@shiss How have you found the OOTB IH summaries that get auto-added to the models? I’ve heard these prove to be quite predictive.
@shiss IH aggregations are extremely powerful. The idea is simple, prior behavior of customers are potential predictors in defining their future behavior. The OOTB IH aggregation dimensions in Pega ADM models have been decided after rigorous testing.
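The idea behind an IH aggregation can be sketched in a few lines. This is a hypothetical illustration only, not Pega's internal implementation: it computes a simple "accepts per channel in the last 30 days" aggregate from interaction-history-style records, the kind of value that can then serve as a predictor. The record field names are invented for the example.

```python
# Hypothetical sketch of an interaction-history aggregate used as a
# predictor: count of a customer's accepts on a channel in a recent window.
# Field names ("channel", "outcome", "timestamp") are invented here.
from datetime import datetime, timedelta

def accepts_last_30_days(ih_records, channel, now):
    """Count 'Accepted' outcomes on the given channel within 30 days of `now`."""
    cutoff = now - timedelta(days=30)
    return sum(
        1 for r in ih_records
        if r["channel"] == channel
        and r["outcome"] == "Accepted"
        and r["timestamp"] >= cutoff
    )

history = [
    {"channel": "Web", "outcome": "Accepted", "timestamp": datetime(2021, 12, 20)},
    {"channel": "Web", "outcome": "Rejected", "timestamp": datetime(2021, 12, 21)},
    {"channel": "Email", "outcome": "Accepted", "timestamp": datetime(2021, 10, 1)},
]
print(accepts_last_30_days(history, "Web", datetime(2021, 12, 30)))  # 1
```

The point is that prior behavior, summarized per dimension (channel, issue, group, and so on), becomes a compact numeric input to the model.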
@shiss Could you shed some light on use cases where other AI platforms, like H2O, are leveraged with Pega CDH for clients?
@shiss Sometimes customers might need to use algorithms that are not supported by Pega. Or they might have access to data that is not provided to the Pega engine, so they would want to create models offline and bring them into Pega.
@shiss Since ADM internally bins the data, it can handle large volumes and reduce dimensionality. However, in use cases where the data volume is large and customer behavior could potentially change over time, the memory settings could be used.
@sundm If the question is about the number of "records" be aware that this concept is not relevant to an online classifier like ADM is. The data is streamed in and the models (classifiers) are updated in real time. The "memory settings" of ADM refer to the relative weight placed on new responses (by default all responses are weighed equally but that can be changed) but have no impact at all on the physical memory used.
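The effect of such a "memory" setting can be illustrated with a generic sketch (this is not Pega's internal implementation): a decay factor applied on every streamed-in response makes older responses gradually lose influence on an online success-rate estimate, while a decay of 1.0 weighs all responses equally.

```python
# Generic sketch (not Pega's ADM internals) of recency weighting in an
# online estimator: a decay factor shifts relative weight to new responses.

class DecayedRate:
    def __init__(self, decay=1.0):
        # decay = 1.0 -> all responses weighed equally (the default idea);
        # decay < 1.0 -> older responses gradually lose influence.
        self.decay = decay
        self.positives = 0.0
        self.total = 0.0

    def update(self, outcome):
        """outcome: 1 for a positive response, 0 for a negative one."""
        self.positives = self.positives * self.decay + outcome
        self.total = self.total * self.decay + 1.0

    @property
    def rate(self):
        return self.positives / self.total if self.total else 0.0

m = DecayedRate(decay=1.0)
for outcome in [1, 0, 1, 1]:
    m.update(outcome)
print(m.rate)  # 0.75
```

Note that the "memory" here is two running numbers, which is why record counts and physical memory are not the relevant concepts for an online learner.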
@shiss Is Image Processing supported in Pega? If not, is there a plan to incorporate this feature in the near future?
Updated: 22 Nov 2021 19:28 EST
@shiss There is no plan to incorporate this feature in a future release as of now. Using Sagemaker or GoogleML would be your best option.
@shiss If we want to collect customer behavior data to build predictive models, what is the best way to do it? Is the recommendation to use Pega capabilities to gather that data?
@shiss As of 8.5 we have a new feature to capture the data that passes through ADM. This provides an opportunity to explore the data as well as using it to make new models. This data collection will not just be the customer behavior data, it will include any data that was passed through the ADM rule as well as the response label.
@shiss Model context refers to the context in which the model is making a prediction. A context that includes a treatment means that an adaptive model exists for every treatment in the system. A context that includes only action means that a model exists for every action, which is used for all treatments associated with that action.
@shiss The control group that is configured on predictions allows you to calculate lift (for example in CTR) by providing a benchmark performance to compare against. Champion challenger functionality has been used to accomplish the same thing, but now it is much more straightforward to set up in the prediction.
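As a minimal sketch of the benchmark idea (function and figures are made up for illustration), lift is just the relative difference between the model-driven group's click-through rate and the control group's:

```python
def ctr_lift(test_clicks, test_impressions, control_clicks, control_impressions):
    """Relative lift of the model-driven group's CTR over the control group's CTR."""
    test_ctr = test_clicks / test_impressions
    control_ctr = control_clicks / control_impressions
    return (test_ctr - control_ctr) / control_ctr

# Example: model group 300 clicks / 10,000 impressions (CTR 3%) vs.
# control group 200 clicks / 10,000 impressions (CTR 2%).
print(ctr_lift(300, 10_000, 200, 10_000))  # 0.5 -> 50% lift
```

The control group configured on the prediction supplies the denominator of this comparison automatically, which is what makes it more straightforward than hand-built champion/challenger setups.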
@shiss How do you configure and establish a model for an outbound channel when the customer's response comes days later and propensity values are very low?
Updated: 29 Nov 2021 12:48 EST
@shiss Response timeout is a mechanism that provides a solution in outbound. The responses are queued for a certain timeframe, and then they are sent back to the models.
Extend the response timeout window to an appropriate length so that you can capture those later responses. If propensities are low across the board, that’s not completely unusual. Propensities are often very low numbers.
@shiss How can we convince data scientists that our adaptive model is better than their predictive models? Adaptive models use the Naive Bayes algorithm, which is older than the newer algorithms we use in predictive models.
@shiss I don’t think it’s a question of which is better but rather how adaptive models can enhance what they are already doing with their predictive models. There may be good reason to use both. Additionally, the true value of Naive Bayes in Pega is its capability of learning on the fly. Not every off-the-shelf machine learning algorithm has the capability of turning into an online learner. Also, Naive Bayes in Pega is preceded with a data analysis step that prepares the data for the algorithm including taking care of correlated predictors.
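The "learning on the fly" property can be made concrete with a toy sketch. This is emphatically not Pega's ADM internals (which also include the data-analysis and binning step mentioned above); it only shows why Naive Bayes over categorical, pre-binned predictors is naturally an online learner: each new response is absorbed by a constant-time count update, with no retraining pass over historical data. All names here are invented for the example.

```python
# Toy online Naive Bayes over categorical (pre-binned) predictors.
# Illustrates incremental learning only; not Pega's ADM implementation.
from collections import defaultdict
import math

class OnlineNaiveBayes:
    def __init__(self):
        self.class_counts = defaultdict(int)  # outcome -> count
        # (feature, value) -> outcome -> count
        self.feature_counts = defaultdict(lambda: defaultdict(int))

    def learn(self, features, outcome):
        # Learning is a constant-time count update -- no retraining needed.
        self.class_counts[outcome] += 1
        for feat, val in features.items():
            self.feature_counts[(feat, val)][outcome] += 1

    def propensity(self, features):
        """Probability of the positive outcome (1) given the features."""
        total = sum(self.class_counts.values())
        scores = {}
        for outcome, n in self.class_counts.items():
            log_p = math.log(n / total)
            for feat, val in features.items():
                counts = self.feature_counts[(feat, val)]
                # Laplace smoothing so unseen bins don't zero out the score.
                log_p += math.log((counts[outcome] + 1) / (n + 2))
            scores[outcome] = log_p
        m = max(scores.values())
        exp_scores = {o: math.exp(s - m) for o, s in scores.items()}
        z = sum(exp_scores.values())
        return exp_scores.get(1, 0.0) / z

nb = OnlineNaiveBayes()
for _ in range(3):
    nb.learn({"age_bin": "young"}, 1)
    nb.learn({"age_bin": "old"}, 0)
print(round(nb.propensity({"age_bin": "young"}), 3))  # 0.8
```

A batch-trained classifier would have to be refit on the full dataset to incorporate those six responses; here the model is up to date the moment each response arrives.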
@sundm True. To add to that:
- In addition to the algorithm aspect, models built off line often have access to a wider set of data - which is an advantage. However they typically do not have access to all the contextual data in Pega - and context typically is very important for customer behavior. In addition, they can get stale as they're typically developed on older snapshots of the data and need manual updates when new actions/treatments are developed.
- A good way to incorporate them is to include their scores in the XCAR and feed them as (parameterized) predictors to the ADM models. Sometimes they do add value for specific issues or groups (e.g. acquisition of new customers, where we know little about the customers yet). But often they do not show the same results on real data as when they were developed offline.
- We're working on an even better online algorithm based on XGBoost. It is available already, as mentioned before, but in lab mode.