Confound Removal and Normalization in Practice: A Neuroimaging Based Sex Prediction Case Study

Authors: More, Shammi; Eickhoff, Simon B.; Caspers, Julian; Patil, Kaustubh R.
Editors: Dong, Yuxiao; Ifrim, Georgiana; Mladenić, Dunja; Saunders, Craig; Van Hoecke, Sofie
doi:10.1007/978-3-030-67670-4_1
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@INBOOK{Dong:891374,
      author       = {More, Shammi and Eickhoff, Simon B. and Caspers, Julian and
                      Patil, Kaustubh R.},
      editor       = {Dong, Yuxiao and Ifrim, Georgiana and Mladenić, Dunja and
                      Saunders, Craig and Van Hoecke, Sofie},
      title        = {{C}onfound {R}emoval and {N}ormalization in {P}ractice: {A}
                      {N}euroimaging {B}ased {S}ex {P}rediction {C}ase {S}tudy},
      volume       = {12461},
      address      = {Cham},
      publisher    = {Springer International Publishing},
      reportid     = {FZJ-2021-01463},
      isbn         = {978-3-030-67669-8},
      series       = {Lecture Notes in Computer Science},
      pages        = {3--18},
      year         = {2021},
      comment      = {Machine Learning and Knowledge Discovery in Databases.
                      Applied Data Science and Demo Track / Dong, Yuxiao (Editor)
                      ; Cham : Springer International Publishing, 2021, Chapter 1
                      ; ISSN: 0302-9743=1611-3349 ; ISBN:
                      978-3-030-67669-8=978-3-030-67670-4 ;
                      doi:10.1007/978-3-030-67670-4},
      booktitle    = {Machine Learning and Knowledge Discovery in Databases.
                      Applied Data Science and Demo Track},
      abstract     = {Machine learning (ML) methods are increasingly being used
                      to predict pathologies and biological traits using
                      neuroimaging data. Here controlling for confounds is
                      essential to get unbiased estimates of generalization
                      performance and to identify the features driving
                      predictions. However, a systematic evaluation of the
                      advantages and disadvantages of available alternatives is
                      lacking. This makes it difficult to compare results across
                      studies and to build deployment quality models. Here, we
                      evaluated two commonly used confound removal schemes, whole
                      data confound regression (WDCR) and cross-validated confound
                      regression (CVCR), to understand their effectiveness and
                      biases induced in generalization performance estimation.
                      Additionally, we study the interaction of the confound
                      removal schemes with Z-score normalization, a common
                      practice in ML modelling. We applied eight combinations of
                      confound removal schemes and normalization (pipelines) to
                      decode sex from resting-state functional MRI (rfMRI) data
                      while controlling for two confounds, brain size and age. We
                      show that both schemes effectively remove linear univariate
                      and multivariate confounding effects resulting in reduced
                      model performance with CVCR providing better generalization
                      estimates, i.e., closer to out-of-sample performance than
                      WDCR. We found no effect of normalizing before or after
                      confound removal. In the presence of dataset and confound
                      shift, four tested confound removal procedures yielded mixed
                      results, raising new questions. We conclude that CVCR is a
                      better method to control for confounding effects in
                      neuroimaging studies. We believe that our in-depth analyses
                      shed light on choices associated with confound removal and
                      hope that it generates more interest in this problem
                      instrumental to numerous applications.},
      cin          = {INM-7},
      cid          = {I:(DE-Juel1)INM-7-20090406},
      pnm          = {525 - Decoding Brain Organization and Dysfunction
                      (POF4-525) / DFG project 432015680 - Automatisierte
                      Gehirnalterung-Vorhersage und deren Interpretation},
      pid          = {G:(DE-HGF)POF4-525 / G:(GEPRIS)432015680},
      typ          = {PUB:(DE-HGF)7},
      UT           = {WOS:000716884800001},
      doi          = {10.1007/978-3-030-67670-4_1},
      url          = {https://juser.fz-juelich.de/record/891374},
}