Large language models surpass human experts in predicting neuroscience results

Luo, Xiaoliang; Niso, Guiomar; Gu, Nianlong; Patil, Kaustubh R.; Okalova, Tereza; Sucholutsky, Ilia; Ales, Justin M.; Hall, Chloe M.; Gaebler, Michael; Nejad, Kevin K.; Khona, Mikail; Bizley, Jennifer K.; Musslick, Sebastian; Loued-Khenissi, Leyla; Ratan Murty, N. Apurva; Pashkov, Anton; Salatiello, Alessandro; Marinazzo, Daniele; Yilmaz, Bati; Rechardt, Akilles; Yusifov, Elkhan; Razavi, Sepehr; Bilgin, Isil Poyraz; Minervini, Pasquale; Love, Bradley C.; Lee, Pui-Shee; Yáñez, Felipe; Rocca, Roberta; Borghesani, Valentina; Behler, Anna; Bao, Sherry Dongqi; Mata, Rui; Lee, Kangjoo; Sun, Guangzhi; Myers, Nicholas E.; Ferianc, Martin; Dafflon, Jessica; Nicholas, Jonathan; Cohen, Alexandra O.

doi:10.1038/s41562-024-02046-9

Marc 21

001			1041321
005			20260203123809.0
024	7	_	\|a 10.1038/s41562-024-02046-9 \|2 doi
024	7	_	\|a 10.34734/FZJ-2025-02220 \|2 datacite_doi
024	7	_	\|a 39604572 \|2 pmid
024	7	_	\|a WOS:001365146700001 \|2 WOS
037	_	_	\|a FZJ-2025-02220
082	_	_	\|a 150
100	1	_	\|a Luo, Xiaoliang \|0 P:(DE-HGF)0 \|b 0 \|e Corresponding author
245	_	_	\|a Large language models surpass human experts in predicting neuroscience results
260	_	_	\|a London \|c 2025 \|b Nature Research
336	7	_	\|a article \|2 DRIVER
336	7	_	\|a Output Types/Journal article \|2 DataCite
336	7	_	\|a Journal Article \|b journal \|m journal \|0 PUB:(DE-HGF)16 \|s 1744792024_17200 \|2 PUB:(DE-HGF)
336	7	_	\|a ARTICLE \|2 BibTeX
336	7	_	\|a JOURNAL_ARTICLE \|2 ORCID
336	7	_	\|a Journal Article \|0 0 \|2 EndNote
520	_	_	\|a Scientific discoveries often hinge on synthesizing decades of research, a task that potentially outstrips human information processing capacities. Large language models (LLMs) offer a solution. LLMs trained on the vast scientific literature could potentially integrate noisy yet interrelated findings to forecast novel results better than human experts. Here, to evaluate this possibility, we created BrainBench, a forward-looking benchmark for predicting neuroscience results. We find that LLMs surpass experts in predicting experimental outcomes. BrainGPT, an LLM we tuned on the neuroscience literature, performed better yet. Like human experts, when LLMs indicated high confidence in their predictions, their responses were more likely to be correct, which presages a future where LLMs assist humans in making discoveries. Our approach is not neuroscience specific and is transferable to other knowledge-intensive endeavours.
536	_	_	\|a 5254 - Neuroscientific Data Analytics and AI (POF4-525) \|0 G:(DE-HGF)POF4-5254 \|c POF4-525 \|f POF IV \|x 0
588	_	_	\|a Dataset connected to CrossRef, Journals: juser.fz-juelich.de
700	1	_	\|a Rechardt, Akilles \|0 P:(DE-HGF)0 \|b 1
700	1	_	\|a Sun, Guangzhi \|0 P:(DE-HGF)0 \|b 2
700	1	_	\|a Nejad, Kevin K. \|0 P:(DE-HGF)0 \|b 3
700	1	_	\|a Yáñez, Felipe \|0 P:(DE-HGF)0 \|b 4
700	1	_	\|a Yilmaz, Bati \|0 P:(DE-HGF)0 \|b 5
700	1	_	\|a Lee, Kangjoo \|0 P:(DE-HGF)0 \|b 6
700	1	_	\|a Cohen, Alexandra O. \|0 P:(DE-HGF)0 \|b 7
700	1	_	\|a Borghesani, Valentina \|0 P:(DE-HGF)0 \|b 8
700	1	_	\|a Pashkov, Anton \|0 P:(DE-HGF)0 \|b 9
700	1	_	\|a Marinazzo, Daniele \|0 P:(DE-HGF)0 \|b 10
700	1	_	\|a Nicholas, Jonathan \|0 P:(DE-HGF)0 \|b 11
700	1	_	\|a Salatiello, Alessandro \|0 P:(DE-HGF)0 \|b 12
700	1	_	\|a Sucholutsky, Ilia \|0 P:(DE-HGF)0 \|b 13
700	1	_	\|a Minervini, Pasquale \|0 P:(DE-HGF)0 \|b 14
700	1	_	\|a Razavi, Sepehr \|0 P:(DE-HGF)0 \|b 15
700	1	_	\|a Rocca, Roberta \|0 P:(DE-HGF)0 \|b 16
700	1	_	\|a Yusifov, Elkhan \|0 P:(DE-HGF)0 \|b 17
700	1	_	\|a Okalova, Tereza \|0 P:(DE-HGF)0 \|b 18
700	1	_	\|a Gu, Nianlong \|0 P:(DE-HGF)0 \|b 19
700	1	_	\|a Ferianc, Martin \|0 P:(DE-HGF)0 \|b 20
700	1	_	\|a Khona, Mikail \|0 P:(DE-HGF)0 \|b 21
700	1	_	\|a Patil, Kaustubh R. \|0 P:(DE-Juel1)172843 \|b 22
700	1	_	\|a Lee, Pui-Shee \|0 P:(DE-HGF)0 \|b 23
700	1	_	\|a Mata, Rui \|0 P:(DE-HGF)0 \|b 24
700	1	_	\|a Myers, Nicholas E. \|0 P:(DE-HGF)0 \|b 25
700	1	_	\|a Bizley, Jennifer K. \|0 P:(DE-HGF)0 \|b 26
700	1	_	\|a Musslick, Sebastian \|0 P:(DE-HGF)0 \|b 27
700	1	_	\|a Bilgin, Isil Poyraz \|0 P:(DE-HGF)0 \|b 28
700	1	_	\|a Niso, Guiomar \|0 P:(DE-HGF)0 \|b 29
700	1	_	\|a Ales, Justin M. \|0 P:(DE-HGF)0 \|b 30
700	1	_	\|a Gaebler, Michael \|0 P:(DE-HGF)0 \|b 31
700	1	_	\|a Ratan Murty, N. Apurva \|0 P:(DE-HGF)0 \|b 32
700	1	_	\|a Loued-Khenissi, Leyla \|0 P:(DE-HGF)0 \|b 33
700	1	_	\|a Behler, Anna \|0 P:(DE-HGF)0 \|b 34
700	1	_	\|a Hall, Chloe M. \|0 P:(DE-HGF)0 \|b 35
700	1	_	\|a Dafflon, Jessica \|0 P:(DE-HGF)0 \|b 36
700	1	_	\|a Bao, Sherry Dongqi \|0 P:(DE-HGF)0 \|b 37
700	1	_	\|a Love, Bradley C. \|0 P:(DE-HGF)0 \|b 38
773	_	_	\|a 10.1038/s41562-024-02046-9 \|g Vol. 9, no. 2, p. 305 - 315 \|0 PERI:(DE-600)2885046-4 \|n 2 \|p 305 - 315 \|t Nature human behaviour \|v 9 \|y 2025 \|x 2397-3374
856	4	_	\|u https://juser.fz-juelich.de/record/1041321/files/s41562-024-02046-9.pdf \|y OpenAccess
909	C	O	\|o oai:juser.fz-juelich.de:1041321 \|p openaire \|p open_access \|p VDB \|p driver \|p dnbdelivery
910	1	_	\|a Department of Experimental Psychology, University College London \|0 I:(DE-HGF)0 \|b 0 \|6 P:(DE-HGF)0
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 22 \|6 P:(DE-Juel1)172843
913	1	_	\|a DE-HGF \|b Key Technologies \|l Natural, Artificial and Cognitive Information Processing \|1 G:(DE-HGF)POF4-520 \|0 G:(DE-HGF)POF4-525 \|3 G:(DE-HGF)POF4 \|2 G:(DE-HGF)POF4-500 \|4 G:(DE-HGF)POF \|v Decoding Brain Organization and Dysfunction \|9 G:(DE-HGF)POF4-5254 \|x 0
914	1	_	\|y 2025
915	_	_	\|a Creative Commons Attribution CC BY 4.0 \|0 LIC:(DE-HGF)CCBY4 \|2 HGFVOC
915	_	_	\|a DEAL Nature \|0 StatID:(DE-HGF)3003 \|2 StatID \|d 2024-12-05 \|w ger
915	_	_	\|a WoS \|0 StatID:(DE-HGF)0113 \|2 StatID \|b Science Citation Index Expanded \|d 2024-12-05
915	_	_	\|a OpenAccess \|0 StatID:(DE-HGF)0510 \|2 StatID
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0160 \|2 StatID \|b Essential Science Indicators \|d 2024-12-05
915	_	_	\|a National-Konsortium \|0 StatID:(DE-HGF)0430 \|2 StatID \|d 2025-11-12 \|w ger
915	_	_	\|a JCR \|0 StatID:(DE-HGF)0100 \|2 StatID \|b NAT HUM BEHAV : 2022 \|d 2025-11-12
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0200 \|2 StatID \|b SCOPUS \|d 2025-11-12
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0300 \|2 StatID \|b Medline \|d 2025-11-12
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0199 \|2 StatID \|b Clarivate Analytics Master Journal List \|d 2025-11-12
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)1180 \|2 StatID \|b Current Contents - Social and Behavioral Sciences \|d 2025-11-12
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0130 \|2 StatID \|b Social Sciences Citation Index \|d 2025-11-12
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0150 \|2 StatID \|b Web of Science Core Collection \|d 2025-11-12
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)1030 \|2 StatID \|b Current Contents - Life Sciences \|d 2025-11-12
915	_	_	\|a IF >= 25 \|0 StatID:(DE-HGF)9925 \|2 StatID \|b NAT HUM BEHAV : 2022 \|d 2025-11-12
920	1	_	\|0 I:(DE-Juel1)INM-7-20090406 \|k INM-7 \|l Gehirn & Verhalten \|x 0
980	_	_	\|a journal
980	_	_	\|a VDB
980	_	_	\|a UNRESTRICTED
980	_	_	\|a I:(DE-Juel1)INM-7-20090406
980	1	_	\|a FullTexts
