001049995 001__ 1049995
001049995 005__ 20251219202234.0
001049995 0247_ $$2doi$$a10.1038/s44220-025-00527-y
001049995 0247_ $$2datacite_doi$$a10.34734/FZJ-2025-05709
001049995 037__ $$aFZJ-2025-05709
001049995 082__ $$a610
001049995 1001_ $$0P:(DE-Juel1)188257$$aKambeitz, Joseph$$b0$$eCorresponding author
001049995 245__ $$aThe empirical structure of psychopathology is represented in large language models
001049995 260__ $$aLondon$$bNature Publishing Group UK$$c2025
001049995 3367_ $$2DRIVER$$aarticle
001049995 3367_ $$2DataCite$$aOutput Types/Journal article
001049995 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1766153209_25560
001049995 3367_ $$2BibTeX$$aARTICLE
001049995 3367_ $$2ORCID$$aJOURNAL_ARTICLE
001049995 3367_ $$00$$2EndNote$$aJournal Article
001049995 500__ $$aThe original studies analyzed in this work were supported by the National Institute of Mental Health (Grant R01MH112612) to J.S. and the Deutsche Forschungsgemeinschaft (DFG) ET 31/7-1 to U.E. K.V. was supported within the project SIMSUB (Grant 01GP2215) of the German Ministry of Research, Technology and Space (BMFTR). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
001049995 520__ $$aClinical assessment and scientific research in psychiatry are largely based on questionnaires that are used to assess psychopathology. The development of large language models (LLMs) offers a new perspective for analysis of the language and terminology on which these questionnaires are based. We used state-of-the-art LLMs to derive numerical representations (‘text embeddings’) of the semantic and sentiment content of items from established questionnaires for the assessment of psychopathology. We compared the pairwise associations between empirical data from cross-sectional studies and text embeddings to test whether the empirical structure of psychopathology can be reconstructed by LLMs. Across four large-scale datasets (n = 1,555, n = 1,099, n = 11,807 and n = 39,755), we found a range of significant correlations between empirical item-pair associations and associations derived from text embeddings (r = 0.18 to r = 0.57, all P < 0.05). Random forest regression models based on semantic or sentiment embeddings predicted empirical item-pair associations with moderate to high accuracy (r = 0.33 to r = 0.81, all P < 0.05). Similarly, empirical clustering of items and grouping to established subdomain scores could be partly reconstructed by text embeddings. Our results demonstrate that LLMs are able to represent substantial components of the empirical structure of psychopathology. Consequently, the integration of LLMs into mental health research has the potential to unlock numerous promising avenues. These may encompass improving the process of developing questionnaires, optimizing generalizability and reducing the redundancy of existing questionnaires or facilitating the development of new conceptualizations of mental disorders.
001049995 536__ $$0G:(DE-HGF)POF4-5251$$a5251 - Multilevel Brain Organization and Variability (POF4-525)$$cPOF4-525$$fPOF IV$$x0
001049995 588__ $$aDataset connected to CrossRef, Journals: juser.fz-juelich.de
001049995 7001_ $$0P:(DE-HGF)0$$aSchiffman, Jason$$b1
001049995 7001_ $$0P:(DE-HGF)0$$aKambeitz-Ilankovic, Lana$$b2
001049995 7001_ $$0P:(DE-HGF)0$$aMittal, Vijay A.$$b3
001049995 7001_ $$00000-0002-0160-0281$$aEttinger, Ulrich$$b4
001049995 7001_ $$0P:(DE-Juel1)176404$$aVogeley, Kai$$b5$$ufzj
001049995 773__ $$0PERI:(DE-600)3123130-5$$a10.1038/s44220-025-00527-y$$gVol. 3, no. 12, p. 1482 - 1492$$n12$$p1482 - 1492$$tNature Mental Health$$v3$$x2731-6076$$y2025
001049995 8564_ $$uhttps://juser.fz-juelich.de/record/1049995/files/PDF.pdf$$yOpenAccess
001049995 909CO $$ooai:juser.fz-juelich.de:1049995$$popenaire$$popen_access$$pVDB$$pdriver$$pdnbdelivery
001049995 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)176404$$aForschungszentrum Jülich$$b5$$kFZJ
001049995 9131_ $$0G:(DE-HGF)POF4-525$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5251$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vDecoding Brain Organization and Dysfunction$$x0
001049995 9141_ $$y2025
001049995 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001049995 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2024-12-20
001049995 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0
001049995 915__ $$0StatID:(DE-HGF)3003$$2StatID$$aDEAL Nature$$d2024-12-20$$wger
001049995 920__ $$lyes
001049995 9201_ $$0I:(DE-Juel1)INM-3-20090406$$kINM-3$$lKognitive Neurowissenschaften$$x0
001049995 980__ $$ajournal
001049995 980__ $$aVDB
001049995 980__ $$aUNRESTRICTED
001049995 980__ $$aI:(DE-Juel1)INM-3-20090406
001049995 9801_ $$aFullTexts