TY  - JOUR
AU  - Kuhles, Gianna
AU  - Hamdan, Sami
AU  - Heim, Stefan
AU  - Eickhoff, Simon B.
AU  - Patil, Kaustubh R.
AU  - Camilleri, Julia A.
AU  - Weis, Susanne
TI  - Pitfalls in using ML to predict cognitive function performance
JO  - Scientific reports
VL  - 15
IS  - 1
SN  - 2045-2322
CY  - [London]
PB  - Springer Nature
M1  - FZJ-2025-04364
SP  - 37747
PY  - 2025
N1  - This study was supported by the Deutsche Forschungsgemeinschaft (DFG, GE 2835/2-1, EI 816/16-1 and EI 816/21-1), the National Institute of Mental Health (R01-MH074457), the Helmholtz Portfolio Theme “Supercomputing and Modeling for the Human Brain”, the Virtual Brain Cloud (EU H2020, no. 826421) & the National Institute on Aging (R01AG067103).
AB  - Machine learning analyses are widely used for predicting cognitive abilities, yet there are pitfalls that need to be considered during their implementation and interpretation of the results. Hence, the present study aimed at drawing attention to the risks of erroneous conclusions incurred by confounding variables illustrated by a case example predicting executive function (EF) performance by prosodic features. Healthy participants (n = 231) performed speech tasks and EF tests. From 264 prosodic features, we predicted EF performance using 66 variables, controlling for confounding effects of age, sex, and education. A reasonable prediction performance was apparently achieved for EF variables of the Trail Making Test. However, in-depth analyses revealed indications of confound leakage, leading to inflated prediction accuracies, due to a strong relationship between confounds and targets. These findings highlight the need to control confounding variables in ML pipelines and caution against potential pitfalls in ML predictions.
LB  - PUB:(DE-HGF)16
DO  - DOI:10.1038/s41598-025-24325-9
UR  - https://juser.fz-juelich.de/record/1047524
ER  - 