Confound-Leakage: Confound Removal In Machine Learning Leads To Leakage
| Poster (After Call) | FZJ-2023-03045 |
2023
Please use a persistent id in citations: doi:10.34734/FZJ-2023-03045
Abstract: Modern Machine Learning (ML) approaches are now regularly employed for individual-level prediction, e.g. personalized medicine. Particularly in such critical decision-making, it is of utmost importance to not only achieve high accuracy but also to trust that models rely on actual features-target relationships [1, 2]. To this end, it is crucial to consider confounding variables as they can obstruct the features-target relationship. For instance, a researcher might want to identify a biomarker showing high classification accuracy between controls and patients. However, the model might have just learned simpler confounders like age or sex as a good proxy of the disease [3]. To counteract such unwanted confounding effects, investigators often use linear models to remove confounding variables from each feature separately before employing ML. While this confound regression (CR) approach is popular [4], its pitfalls, especially when paired with non-linear ML models, are not well understood.
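The confound regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the single confound (age), and the helper names `fit_confound_model` and `remove_confound` are all assumptions made for illustration. For each feature separately, a linear model of the feature on the confound is fitted and the residuals are kept as the "cleaned" feature.

```python
from statistics import fmean


def fit_confound_model(confound, feature):
    """Least-squares fit of feature ~ intercept + slope * confound."""
    c_mean, f_mean = fmean(confound), fmean(feature)
    var_c = sum((c - c_mean) ** 2 for c in confound)
    cov = sum((c - c_mean) * (f - f_mean) for c, f in zip(confound, feature))
    slope = cov / var_c
    return f_mean - slope * c_mean, slope


def remove_confound(confound, feature, intercept, slope):
    """Return residuals: the part of the feature the linear fit does not explain."""
    return [f - (intercept + slope * c) for c, f in zip(confound, feature)]


# Hypothetical toy data: one confound (age) and two features.
age = [20.0, 30.0, 40.0, 50.0, 60.0]
features = [
    [1.0, 2.1, 2.9, 4.2, 4.8],  # roughly linear in age
    [0.5, 0.4, 0.6, 0.5, 0.5],  # nearly independent of age
]

cleaned = []
for feature in features:
    # One separate linear model per feature, as in the CR approach.
    intercept, slope = fit_confound_model(age, feature)
    cleaned.append(remove_confound(age, feature, intercept, slope))
```

By construction, each residual vector has zero linear correlation with the confound on the data used for the fit. This guarantee covers only linear dependence, which is one reason to be cautious when a non-linear ML model is applied afterwards.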