001024830 001__ 1024830 001024830 005__ 20250203103155.0 001024830 0247_ $$2doi$$a10.1016/j.patrec.2022.12.010 001024830 0247_ $$2ISSN$$a0167-8655 001024830 0247_ $$2ISSN$$a1872-7344 001024830 0247_ $$2datacite_doi$$a10.34734/FZJ-2024-02496 001024830 0247_ $$2pmid$$a37915616 001024830 0247_ $$2WOS$$aWOS:000935348300001 001024830 037__ $$aFZJ-2024-02496 001024830 082__ $$a004 001024830 1001_ $$0P:(DE-HGF)0$$aDagaev, Nikolay$$b0 001024830 245__ $$aA too-good-to-be-true prior to reduce shortcut reliance 001024830 260__ $$aAmsterdam [u.a.]$$bElsevier$$c2023 001024830 3367_ $$2DRIVER$$aarticle 001024830 3367_ $$2DataCite$$aOutput Types/Journal article 001024830 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1712669802_18043 001024830 3367_ $$2BibTeX$$aARTICLE 001024830 3367_ $$2ORCID$$aJOURNAL_ARTICLE 001024830 3367_ $$00$$2EndNote$$aJournal Article 001024830 520__ $$aDespite their impressive performance in object recognition and other tasks under standard testing conditions, deep networks often fail to generalize to out-of-distribution (o.o.d.) samples. One cause for this shortcoming is that modern architectures tend to rely on “shortcuts”, superficial features that correlate with categories without capturing deeper invariants that hold across contexts. Real-world concepts often possess a complex structure that can vary superficially across contexts, which can make the most intuitive and promising solutions in one context not generalize to others. One potential way to improve o.o.d. generalization is to assume simple solutions are unlikely to be valid across contexts and avoid them, which we refer to as the too-good-to-be-true prior. A low-capacity network (LCN) with a shallow architecture should only be able to learn surface relationships, including shortcuts. We find that LCNs can serve as shortcut detectors. 
Furthermore, an LCN’s predictions can be used in a two-stage approach to encourage a high-capacity network (HCN) to rely on deeper invariant features that should generalize broadly. In particular, items that the LCN can master are downweighted when training the HCN. Using a modified version of the CIFAR-10 dataset in which we introduced shortcuts, we found that the two-stage LCN-HCN approach reduced reliance on shortcuts and facilitated o.o.d. generalization. 001024830 536__ $$0G:(DE-HGF)POF4-5251$$a5251 - Multilevel Brain Organization and Variability (POF4-525)$$cPOF4-525$$fPOF IV$$x0 001024830 536__ $$0G:(DE-HGF)POF4-5254$$a5254 - Neuroscientific Data Analytics and AI (POF4-525)$$cPOF4-525$$fPOF IV$$x1 001024830 588__ $$aDataset connected to CrossRef, Journals: juser.fz-juelich.de 001024830 7001_ $$0P:(DE-HGF)0$$aRoads, Brett D.$$b1 001024830 7001_ $$0P:(DE-HGF)0$$aLuo, Xiaoliang$$b2 001024830 7001_ $$0P:(DE-HGF)0$$aBarry, Daniel N.$$b3 001024830 7001_ $$0P:(DE-Juel1)172843$$aPatil, Kaustubh R.$$b4 001024830 7001_ $$0P:(DE-HGF)0$$aLove, Bradley C.$$b5$$eCorresponding author 001024830 773__ $$0PERI:(DE-600)1466342-9$$a10.1016/j.patrec.2022.12.010$$gVol. 166, p. 
164 - 171$$p164 - 171$$tPattern recognition letters$$v166$$x0167-8655$$y2023 001024830 8564_ $$uhttps://www.sciencedirect.com/science/article/pii/S0167865522003841?via%3Dihub 001024830 8564_ $$uhttps://juser.fz-juelich.de/record/1024830/files/1-s2.0-S0167865522003841-main-1.pdf$$yOpenAccess 001024830 8564_ $$uhttps://juser.fz-juelich.de/record/1024830/files/1-s2.0-S0167865522003841-main-1.gif?subformat=icon$$xicon$$yOpenAccess 001024830 8564_ $$uhttps://juser.fz-juelich.de/record/1024830/files/1-s2.0-S0167865522003841-main-1.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess 001024830 8564_ $$uhttps://juser.fz-juelich.de/record/1024830/files/1-s2.0-S0167865522003841-main-1.jpg?subformat=icon-180$$xicon-180$$yOpenAccess 001024830 8564_ $$uhttps://juser.fz-juelich.de/record/1024830/files/1-s2.0-S0167865522003841-main-1.jpg?subformat=icon-640$$xicon-640$$yOpenAccess 001024830 909CO $$ooai:juser.fz-juelich.de:1024830$$pdnbdelivery$$pdriver$$pVDB$$popen_access$$popenaire 001024830 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)172843$$aForschungszentrum Jülich$$b4$$kFZJ 001024830 9101_ $$0I:(DE-HGF)0$$6P:(DE-Juel1)172843$$a HHU Düsseldorf$$b4 001024830 9101_ $$0I:(DE-HGF)0$$6P:(DE-HGF)0$$a Department of Experimental Psychology, University College London, London, United Kingdom$$b5 001024830 9131_ $$0G:(DE-HGF)POF4-525$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5251$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vDecoding Brain Organization and Dysfunction$$x0 001024830 9131_ $$0G:(DE-HGF)POF4-525$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5254$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vDecoding Brain Organization and Dysfunction$$x1 001024830 9141_ $$y2024 001024830 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2023-08-22 001024830 915__ 
$$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2023-08-22 001024830 915__ $$0StatID:(DE-HGF)1160$$2StatID$$aDBCoverage$$bCurrent Contents - Engineering, Computing and Technology$$d2023-08-22 001024830 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0 001024830 915__ $$0StatID:(DE-HGF)0600$$2StatID$$aDBCoverage$$bEbsco Academic Search$$d2023-08-22 001024830 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bPATTERN RECOGN LETT : 2022$$d2023-08-22 001024830 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2023-08-22 001024830 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2023-08-22 001024830 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess 001024830 915__ $$0StatID:(DE-HGF)0030$$2StatID$$aPeer Review$$bASC$$d2023-08-22 001024830 915__ $$0StatID:(DE-HGF)9905$$2StatID$$aIF >= 5$$bPATTERN RECOGN LETT : 2022$$d2023-08-22 001024830 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2023-08-22 001024830 915__ $$0StatID:(DE-HGF)0420$$2StatID$$aNationallizenz$$d2023-08-22$$wger 001024830 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2023-08-22 001024830 9201_ $$0I:(DE-Juel1)INM-7-20090406$$kINM-7$$lGehirn & Verhalten$$x0 001024830 980__ $$ajournal 001024830 980__ $$aVDB 001024830 980__ $$aUNRESTRICTED 001024830 980__ $$aI:(DE-Juel1)INM-7-20090406 001024830 9801_ $$aFullTexts