Last activity: 19 May 2021 3:43 EDT
Predictor in adaptive models
Can we improve the performance of adaptive models by removing inactive predictors? As far as I know, AUC is calculated from the predictor bins. But how do inactive predictors contribute to the bins?
Ex: As I understand it, when a predictor is strongly correlated with other predictors in the list, they form a group. Within each group, the best-performing predictor stays active and the others become inactive. How would model performance improve if I removed the inactive ones within the same group? It looks like I am missing something.
@Nizam Hi Nizam, can you explain a bit more what you mean by "improve performance"? The execution of the adaptive models is very fast. If you are experiencing performance issues, they are more likely to be related to the data loading steps and the network lag between systems than to the strategy (and model execution).
@MarioBatres Thanks for responding to my request. When I say "improve performance", I mean the predictive performance of the adaptive model, which is measured as AUC. This is not the strategy performance or request performance. Is there any significant reason to remove inactive predictors? This question was raised by our data science team. I just wanted to know what happens in the backend when inactive predictors are removed. Thanks in advance.
As far as I know, inactive predictors do not affect the performance of the model. But I think that, in theory, if you remove some predictors (both inactive ones and low-performing active ones), the model can create new correlation groups, which may positively affect the performance.
@recepanilaydemir Thanks for responding to my question. I just wanted to know how new correlations are created by removing inactive predictors. In every predictor correlation group, there will be one top-performing predictor. Even if we remove the inactive ones, the predictor group won't change. Please let me know if I am missing something.
It is not just about removing the inactive ones. If you also remove low-performing active predictors (your model probably has some active predictors that are really close to your threshold), this may force the model to create new groups. I haven't tried it; this is something that I plan to test.
What is the background of your question?
The model performance is calculated from the final classifier (the score distribution binning). Only active predictors contribute to the final classifier.
Removing inactive predictors should not have any material effect on model performance.
There are reasons to occasionally review your predictors and see whether there are (categories of) predictors that perform poorly across all the model instances.
There are also reasons to keep inactive predictors around, and the general advice is not to prune them too aggressively. This is mainly because behaviours change, populations change, and the adaptive modeling system continually re-evaluates which predictors to make active.
Hope this helps
@Otto_Perdeck Thanks Otto. This is clear. Is there an article or document that explains how model performance is calculated from the score distribution binning? Thanks in advance.
Hi @Nizam, I’ll need to see if we have a canned answer or need to create one, but assuming you know how AUC is calculated from probabilities and responses (e.g. https://en.wikipedia.org/wiki/Receiver_operating_characteristic), it’s just a matter of translating the counts of positives and negatives in every bin to probabilities and responses.
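To make that translation concrete, here is a minimal sketch in Python. It is not the product's actual implementation: the function name auc_from_bins and the example counts are hypothetical, and it assumes the bins are ordered from lowest to highest score and that all cases within a bin share the same score (so a positive and a negative in the same bin count as a tie, worth half).

```python
def auc_from_bins(pos, neg):
    """Estimate AUC from per-bin counts of positives and negatives.

    Assumption: bins are ordered by increasing score, and every case in
    a bin has the same score. AUC is then the probability that a randomly
    chosen positive sits in a higher bin than a randomly chosen negative,
    with same-bin pairs counted as half.
    """
    total_pos = sum(pos)
    total_neg = sum(neg)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("AUC is undefined without both positives and negatives")

    concordant = 0.0
    neg_below = 0  # negatives seen in strictly lower-scoring bins
    for p, n in zip(pos, neg):
        # Each positive in this bin outranks all negatives in lower bins
        # and ties with the negatives in its own bin.
        concordant += p * (neg_below + 0.5 * n)
        neg_below += n
    return concordant / (total_pos * total_neg)


# Hypothetical two-bin score distribution:
# low-score bin has 1 positive and 3 negatives, high-score bin has 3 and 1.
print(auc_from_bins([1, 3], [3, 1]))  # 0.75
```

With identical positive and negative proportions in every bin, the same function returns 0.5, and with perfect separation it returns 1.0, which matches the usual interpretation of AUC.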
@Otto_Perdeck Thanks for sharing the details.