Question
Rabobank
NL
Last activity: 12 Jul 2024 10:55 EDT
Offline Pega simulations
Hello,
We are conducting an A/B test to determine whether adding web data to our existing Pega models, which generate propensity scores, increases the likelihood that a user will click on a banner. The A/B test consists of:
• A Test (Baseline Model): Based on Pega propensity data alone.
• B Test (Enhanced Model): Based on Pega propensity data combined with web data.
The objective is to test if the inclusion of web data positively impacts the propensity scores generated by the models.
We run these simulations in our own information factory (Databricks) using a gradient boosting algorithm, which we considered the most suitable approach.
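A minimal sketch of how such an offline comparison could be scored, assuming AUC on a shared hold-out set is the metric of interest. This is not the actual Databricks pipeline: the function names `auc` and `auc_lift`, the bootstrap settings, and the toy data are all hypothetical, and the two score lists stand in for the outputs of the baseline and enhanced models.

```python
import random

def auc(labels, scores):
    """ROC AUC via the rank-sum formulation; tied scores get average ranks."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs both clicks (1) and non-clicks (0)")
    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def auc_lift(labels, baseline_scores, enhanced_scores, n_boot=200, seed=0):
    """AUC difference (enhanced - baseline) with a bootstrap 95% interval."""
    rng = random.Random(seed)
    point = auc(labels, enhanced_scores) - auc(labels, baseline_scores)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # skip resamples that contain only one class
            enhanced = auc(y, [enhanced_scores[i] for i in idx])
            baseline = auc(y, [baseline_scores[i] for i in idx])
            diffs.append(enhanced - baseline)
    diffs.sort()
    return point, (diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1])

# Toy hold-out data: 1 = banner clicked, 0 = not clicked (made-up values).
clicks = [0, 1, 0, 1, 0, 1, 0, 0]
baseline = [0.10, 0.30, 0.40, 0.60, 0.20, 0.35, 0.50, 0.15]
enhanced = [0.10, 0.55, 0.30, 0.70, 0.20, 0.60, 0.40, 0.15]
lift, interval = auc_lift(clicks, baseline, enhanced)
print(f"AUC lift: {lift:.3f}, 95% bootstrap interval: {interval}")
```

Scoring both models on the same hold-out rows keeps the comparison paired, so the interval reflects the difference between the models rather than differences between samples.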
Specific Questions
Feasibility of Using Propensity Scores Alone:
• Is it feasible and valid to reduce the training data to just the propensity scores and use them along with the web data, instead of using the full training dataset (predictors as features)?
• Would this approach provide reliable results, or is it better to use the complete training data (predictors) along with the web data?
Simulation Execution:
• What is the best practice for running these simulations within or outside the Pega environment?
• Are there recommended tools or methods within Pega or outside of Pega to efficiently test and compare these models?
• How can we effectively monitor and measure the impact of adding web data on the propensity scores and overall model performance?
Best Practices and Recommendations:
• Based on your experience, what are the best practices for integrating additional data sources (like web data) into Pega models?
• Are there specific configuration settings or advanced features in Pega that we should leverage to optimize our models for this scenario?
I hope the above questions are clear.
Thanks in advance!