TY - JOUR
AU - Wiersch, Lisa
AU - Friedrich, Patrick
AU - Hamdan, Sami
AU - Komeyer, Vera
AU - Hoffstaedter, Felix
AU - Patil, Kaustubh R.
AU - Eickhoff, Simon B.
AU - Weis, Susanne
TI - Sex classification from functional brain connectivity: Generalization to multiple datasets
JO - Human Brain Mapping
VL - 45
IS - 6
SN - 1065-9471
CY - New York, NY
PB - Wiley-Liss
M1 - FZJ-2024-03325
SP - e26683
PY - 2024
N1 - Funding information: Deutsche Forschungsgemeinschaft (DFG), Grant/Award Numbers: 491111487, 431549029; National Institute of Mental Health, Grant/Award Number: R01-MH074457; the Helmholtz Portfolio Theme “Supercomputing and Modeling for the Human Brain”; European Union's Horizon 2020 Research and Innovation Programme, Grant/Award Number: 945539
AB - Machine learning (ML) approaches are increasingly being applied to neuroimaging data. Studies in neuroscience typically have to rely on a limited set of training data, which may impair the generalizability of ML models. However, it is still unclear which kind of training sample is best suited to optimize generalization performance. In the present study, we systematically investigated the generalization performance of sex classification models trained on the parcelwise connectivity profile of either single samples or compound samples of two different sizes. Generalization performance was quantified in terms of mean across-sample classification accuracy and spatial consistency of accurately classifying parcels. Our results indicate that the generalization performance of parcelwise classifiers (pwCs) trained on single dataset samples is dependent on the specific test samples. Certain datasets seem to "match" in the sense that classifiers trained on a sample from one dataset achieved a high accuracy when tested on the respective other one and vice versa. The pwCs trained on the compound samples demonstrated overall highest generalization performance for all test samples, including one derived from a dataset not included in building the training samples. Thus, our results indicate that both a large sample size and a heterogeneous data composition of a training sample have a central role in achieving generalizable results.
KW - big data
KW - generalizability
KW - machine learning
KW - neuroimaging
KW - resting‐state functional connectivity
KW - sex classification
LB - PUB:(DE-HGF)16
C6 - 38647035
UR - <Go to ISI:>//WOS:001206019500001
DO - 10.1002/hbm.26683
UR - https://juser.fz-juelich.de/record/1026172
ER -