| 001 | 892632 | ||
| 005 | 20230515091803.0 | ||
| 024 | 7 | _ | |a 10.1038/s41386-021-01020-7 |2 doi |
| 024 | 7 | _ | |a 0893-133X |2 ISSN |
| 024 | 7 | _ | |a 1740-634X |2 ISSN |
| 024 | 7 | _ | |a 2128/28282 |2 Handle |
| 024 | 7 | _ | |a altmetric:105599429 |2 altmetric |
| 024 | 7 | _ | |a 33958703 |2 pmid |
| 024 | 7 | _ | |a WOS:000647877800001 |2 WOS |
| 037 | _ | _ | |a FZJ-2021-02221 |
| 082 | _ | _ | |a 610 |
| 100 | 1 | _ | |a Flint, Claas |0 0000-0001-5164-8227 |b 0 |
| 245 | _ | _ | |a Systematic misestimation of machine learning performance in neuroimaging studies of depression |
| 260 | _ | _ | |a Basingstoke |c 2021 |b Nature Publishing Group |
| 336 | 7 | _ | |a article |2 DRIVER |
| 336 | 7 | _ | |a Output Types/Journal article |2 DataCite |
| 336 | 7 | _ | |a Journal Article |b journal |m journal |0 PUB:(DE-HGF)16 |s 1626785232_8199 |2 PUB:(DE-HGF) |
| 336 | 7 | _ | |a ARTICLE |2 BibTeX |
| 336 | 7 | _ | |a JOURNAL_ARTICLE |2 ORCID |
| 336 | 7 | _ | |a Journal Article |0 0 |2 EndNote |
| 520 | _ | _ | |a We currently observe a disconcerting phenomenon in machine learning studies in psychiatry: While we would expect larger samples to yield better results due to the availability of more data, larger machine learning studies consistently show much weaker performance than the numerous small-scale studies. Here, we systematically investigated this effect focusing on one of the most heavily studied questions in the field, namely the classification of patients suffering from Major Depressive Disorder (MDD) and healthy controls based on neuroimaging data. Drawing upon structural MRI data from a balanced sample of N = 1868 MDD patients and healthy controls from our recent international Predictive Analytics Competition (PAC), we first trained and tested a classification model on the full dataset which yielded an accuracy of 61%. Next, we mimicked the process by which researchers would draw samples of various sizes (N = 4 to N = 150) from the population and showed a strong risk of misestimation. Specifically, for small sample sizes (N = 20), we observe accuracies of up to 95%. For medium sample sizes (N = 100) accuracies up to 75% were found. Importantly, further investigation showed that sufficiently large test sets effectively protect against performance misestimation whereas larger datasets per se do not. While these results question the validity of a substantial part of the current literature, we outline the relatively low-cost remedy of larger test sets, which is readily available in most cases. |
| 536 | _ | _ | |a 525 - Decoding Brain Organization and Dysfunction (POF4-525) |0 G:(DE-HGF)POF4-525 |c POF4-525 |f POF IV |x 0 |
| 542 | _ | _ | |i 2021-05-06 |2 Crossref |u https://creativecommons.org/licenses/by/4.0 |
| 588 | _ | _ | |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de |
| 700 | 1 | _ | |a Cearns, Micah |0 0000-0002-3353-8566 |b 1 |
| 700 | 1 | _ | |a Opel, Nils |0 P:(DE-HGF)0 |b 2 |
| 700 | 1 | _ | |a Redlich, Ronny |0 P:(DE-HGF)0 |b 3 |
| 700 | 1 | _ | |a Mehler, David M. A. |0 P:(DE-HGF)0 |b 4 |
| 700 | 1 | _ | |a Emden, Daniel |0 P:(DE-HGF)0 |b 5 |
| 700 | 1 | _ | |a Winter, Nils R. |0 P:(DE-HGF)0 |b 6 |
| 700 | 1 | _ | |a Leenings, Ramona |0 P:(DE-HGF)0 |b 7 |
| 700 | 1 | _ | |a Eickhoff, Simon B. |0 P:(DE-Juel1)131678 |b 8 |
| 700 | 1 | _ | |a Kircher, Tilo |0 P:(DE-HGF)0 |b 9 |
| 700 | 1 | _ | |a Krug, Axel |0 0000-0002-0564-2497 |b 10 |
| 700 | 1 | _ | |a Nenadic, Igor |0 P:(DE-HGF)0 |b 11 |
| 700 | 1 | _ | |a Arolt, Volker |0 P:(DE-HGF)0 |b 12 |
| 700 | 1 | _ | |a Clark, Scott |0 P:(DE-HGF)0 |b 13 |
| 700 | 1 | _ | |a Baune, Bernhard T. |0 P:(DE-HGF)0 |b 14 |
| 700 | 1 | _ | |a Jiang, Xiaoyi |0 P:(DE-HGF)0 |b 15 |
| 700 | 1 | _ | |a Dannlowski, Udo |0 P:(DE-HGF)0 |b 16 |e Corresponding author |
| 700 | 1 | _ | |a Hahn, Tim |0 P:(DE-HGF)0 |b 17 |
| 773 | 1 | 8 | |a 10.1038/s41386-021-01020-7 |b Springer Science and Business Media LLC |d 2021-05-06 |n 8 |p 1510-1517 |3 journal-article |2 Crossref |t Neuropsychopharmacology |v 46 |y 2021 |x 0893-133X |
| 773 | _ | _ | |a 10.1038/s41386-021-01020-7 |0 PERI:(DE-600)2008300-2 |n 8 |p 1510-1517 |t Neuropsychopharmacology |v 46 |y 2021 |x 0893-133X |
| 856 | 4 | _ | |u h |
| 856 | 4 | _ | |u https://juser.fz-juelich.de/record/892632/files/s41386-021-01020-7-1.pdf |y OpenAccess |
| 909 | C | O | |o oai:juser.fz-juelich.de:892632 |p openaire |p open_access |p VDB |p driver |p dnbdelivery |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 8 |6 P:(DE-Juel1)131678 |
| 913 | 1 | _ | |a DE-HGF |b Key Technologies |l Natural, Artificial and Cognitive Information Processing |1 G:(DE-HGF)POF4-520 |0 G:(DE-HGF)POF4-525 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Decoding Brain Organization and Dysfunction |x 0 |
| 913 | 0 | _ | |a DE-HGF |b Key Technologies |l Decoding the Human Brain |1 G:(DE-HGF)POF3-570 |0 G:(DE-HGF)POF3-574 |3 G:(DE-HGF)POF3 |2 G:(DE-HGF)POF3-500 |4 G:(DE-HGF)POF |v Theory, modelling and simulation |x 0 |
| 914 | 1 | _ | |y 2021 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0200 |2 StatID |b SCOPUS |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1030 |2 StatID |b Current Contents - Life Sciences |
| 915 | _ | _ | |a Creative Commons Attribution CC BY 4.0 |0 LIC:(DE-HGF)CCBY4 |2 HGFVOC |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0600 |2 StatID |b Ebsco Academic Search |
| 915 | _ | _ | |a JCR |0 StatID:(DE-HGF)0100 |2 StatID |b NEUROPSYCHOPHARMACOL : 2015 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0150 |2 StatID |b Web of Science Core Collection |
| 915 | _ | _ | |a WoS |0 StatID:(DE-HGF)0110 |2 StatID |b Science Citation Index |
| 915 | _ | _ | |a WoS |0 StatID:(DE-HGF)0111 |2 StatID |b Science Citation Index Expanded |
| 915 | _ | _ | |a OpenAccess |0 StatID:(DE-HGF)0510 |2 StatID |
| 915 | _ | _ | |a Peer Review |0 StatID:(DE-HGF)0030 |2 StatID |b ASC |
| 915 | _ | _ | |a IF >= 5 |0 StatID:(DE-HGF)9905 |2 StatID |b NEUROPSYCHOPHARMACOL : 2015 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0310 |2 StatID |b NCBI Molecular Biology Database |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1050 |2 StatID |b BIOSIS Previews |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0300 |2 StatID |b Medline |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0199 |2 StatID |b Thomson Reuters Master Journal List |
| 920 | _ | _ | |l yes |
| 920 | 1 | _ | |0 I:(DE-Juel1)INM-7-20090406 |k INM-7 |l Gehirn & Verhalten |x 0 |
| 980 | _ | _ | |a journal |
| 980 | _ | _ | |a VDB |
| 980 | _ | _ | |a UNRESTRICTED |
| 980 | _ | _ | |a I:(DE-Juel1)INM-7-20090406 |
| 980 | 1 | _ | |a FullTexts |
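
The abstract in field 520 reports that small test sets can produce accuracies far above the 61% obtained on the full dataset. The sketch below illustrates only the sampling-noise part of that effect and is not the authors' procedure: it assumes a classifier with a fixed true accuracy of 0.61 (the figure from the abstract) and treats each test prediction as an independent coin flip. The function names, the test-set sizes, and the number of simulated studies are hypothetical choices made for illustration.

```python
import random


def simulate_observed_accuracies(true_accuracy, test_size, n_studies, seed=0):
    """Simulate the accuracy each of n_studies would report on its own test set.

    Assumption: every test prediction is correct with probability
    true_accuracy, independently of all others (a binomial model).
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_studies):
        # Count how many of the test_size predictions come out correct.
        correct = sum(rng.random() < true_accuracy for _ in range(test_size))
        accuracies.append(correct / test_size)
    return accuracies


def summarize(accuracies):
    """Return the minimum, median, maximum, and range of observed accuracies."""
    ordered = sorted(accuracies)
    return {
        "min": ordered[0],
        "median": ordered[len(ordered) // 2],
        "max": ordered[-1],
        "spread": ordered[-1] - ordered[0],
    }


if __name__ == "__main__":
    # Hypothetical test-set sizes; the true accuracy is identical in all cases.
    for test_size in (20, 100, 1000):
        observed = simulate_observed_accuracies(0.61, test_size, n_studies=2000)
        print(test_size, summarize(observed))
```

Under these assumptions, the range of reported accuracies narrows as the test set grows even though the underlying classifier never changes, which is consistent with the abstract's point that sufficiently large test sets protect against misestimation.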