Confound-Leakage: Confound Removal In Machine Learning Leads To Leakage

Hamdan, Sami; Schwender, Holger; Eickhoff, Simon; Polier, Georg von; Patil, Kaustubh; Love, Bradley C.; Weis, Susanne
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@INPROCEEDINGS{Hamdan:1010405,
      author       = {Hamdan, Sami and Love, Bradley C. and Polier, Georg von and
                      Weis, Susanne and Schwender, Holger and Eickhoff, Simon and
                      Patil, Kaustubh},
      title        = {{C}onfound-{L}eakage: {C}onfound {R}emoval {I}n {M}achine
                      {L}earning {L}eads {T}o {L}eakage},
      reportid     = {FZJ-2023-03045},
      year         = {2023},
      note         = {Acknowledgments: This work was partly supported by the
                      Helmholtz-AI project DeGen, the Helmholtz Portfolio Theme
                      ‘Supercomputing and Modeling for the Human Brain’ and
                      Deutsche Forschungsgemeinschaft (DFG, German Research
                      Foundation). Poster: Pitfalls of Confound Regression in
                      Machine Learning. The poster title differs, but this was
                      acceptable to the OHBM organizers.},
      abstract     = {Modern Machine Learning (ML) approaches are now regularly
                      employed for individual-level prediction, e.g. personalized
                      medicine. Particularly in such critical-decision making, it
                      is of utmost importance to not only achieve high accuracy
                      but also to trust that models rely on actual
                      features-target relationships [1, 2]. To this end, it is
                      crucial to consider confounding variables as they can
                      obstruct the features-target relationship. For instance, a
                      researcher might want to identify a biomarker showing high
                      classification accuracy between controls and patients.
                      However, the model might have just learned simpler
                      confounders like age or sex as a good proxy of the disease
                      [3]. To counteract such unwanted confounding effects,
                      investigators often use linear models to remove confounding
                      variables from each feature separately before employing ML.
                      While this confound regression (CR) approach is popular
                      [4], its pitfalls, especially when paired with non-linear
                      ML models, are not well understood.},
      month        = {Jul},
      date         = {2023-07-22},
      organization = {Organization for Human Brain Mapping
                      (OHBM), Montreal (Canada), 22 Jul 2023
                      - 26 Jul 2023},
      subtyp       = {After Call},
      cin          = {INM-7},
      cid          = {I:(DE-Juel1)INM-7-20090406},
      pnm          = {5254 - Neuroscientific Data Analytics and AI (POF4-525) /
                      JL SMHB - Joint Lab Supercomputing and Modeling for the
                      Human Brain (JL SMHB-2021-2027)},
      pid          = {G:(DE-HGF)POF4-5254 / G:(DE-Juel1)JL SMHB-2021-2027},
      typ          = {PUB:(DE-HGF)24},
      doi          = {10.34734/FZJ-2023-03045},
      url          = {https://juser.fz-juelich.de/record/1010405},
}