001048776 001__ 1048776
001048776 005__ 20251222202220.0
001048776 0247_ $$2doi$$a10.1103/5lgz-4t7h
001048776 0247_ $$2ISSN$$a2470-0045
001048776 0247_ $$2ISSN$$a2470-0061
001048776 0247_ $$2ISSN$$a1063-651X
001048776 0247_ $$2ISSN$$a1095-3787
001048776 0247_ $$2ISSN$$a1538-4519
001048776 0247_ $$2ISSN$$a1539-3755
001048776 0247_ $$2ISSN$$a1550-2376
001048776 0247_ $$2ISSN$$a2470-0053
001048776 0247_ $$2datacite_doi$$a10.34734/FZJ-2025-04891
001048776 037__ $$aFZJ-2025-04891
001048776 082__ $$a530
001048776 1001_ $$0P:(DE-Juel1)180150$$aFischer, Kirsten$$b0$$eCorresponding author
001048776 245__ $$aField theory for optimal signal propagation in residual networks
001048776 260__ $$aWoodbury, NY$$bInst.$$c2025
001048776 3367_ $$2DRIVER$$aarticle
001048776 3367_ $$2DataCite$$aOutput Types/Journal article
001048776 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1766392108_24935
001048776 3367_ $$2BibTeX$$aARTICLE
001048776 3367_ $$2ORCID$$aJOURNAL_ARTICLE
001048776 3367_ $$00$$2EndNote$$aJournal Article
001048776 520__ $$aResidual networks have significantly better trainability and thus performance than feed-forward networks at large depth. Introducing skip connections facilitates signal propagation to deeper layers. In addition, previous works found that adding a scaling parameter for the residual branch further improves generalization performance. While they empirically identified a particularly beneficial range of values for this scaling parameter, the mechanism for the resulting performance improvement and its universality across network hyperparameters remain an open question. For feed-forward networks, finite-size theories have led to important insights with regard to signal propagation and hyperparameter tuning. Here, we derive a systematic finite-size field theory for residual networks to study signal propagation and its dependence on the scaling for the residual branch. We derive analytical expressions for the response function, a measure of the network’s sensitivity to inputs, and show that for deep networks the empirically found values for the scaling parameter lie within the range of maximal sensitivity. Furthermore, we obtain an analytical expression for the optimal scaling parameter that depends only weakly on other network hyperparameters, such as the weight variance, thereby explaining its universality across hyperparameters. Overall, this work provides a theoretical framework to study ResNets at finite size.
001048776 536__ $$0G:(DE-HGF)POF4-5232$$a5232 - Computational Principles (POF4-523)$$cPOF4-523$$fPOF IV$$x0
001048776 536__ $$0G:(DE-HGF)POF4-5234$$a5234 - Emerging NC Architectures (POF4-523)$$cPOF4-523$$fPOF IV$$x1
001048776 536__ $$0G:(DE-Juel-1)BMBF-01IS19077A$$aRenormalizedFlows - Transparent Deep Learning with Renormalized Flows (BMBF-01IS19077A)$$cBMBF-01IS19077A$$x2
001048776 536__ $$0G:(DE-Juel1)HGF-SMHB-2014-2018$$aMSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018)$$cHGF-SMHB-2014-2018$$fMSNN$$x3
001048776 536__ $$0G:(DE-HGF)SO-092$$aACA - Advanced Computing Architectures (SO-092)$$cSO-092$$x4
001048776 536__ $$0G:(DE-82)EXS-SF-neuroIC002$$aneuroIC002 - Recurrence and stochasticity for neuro-inspired computation (EXS-SF-neuroIC002)$$cEXS-SF-neuroIC002$$x5
001048776 536__ $$0G:(GEPRIS)491111487$$aDFG project G:(GEPRIS)491111487 - Open-Access-Publikationskosten / 2025 - 2027 / Forschungszentrum Jülich (OAPKFZJ) (491111487)$$c491111487$$x6
001048776 588__ $$aDataset connected to CrossRef, Journals: juser.fz-juelich.de
001048776 7001_ $$0P:(DE-Juel1)156459$$aDahmen, David$$b1
001048776 7001_ $$0P:(DE-Juel1)144806$$aHelias, Moritz$$b2
001048776 773__ $$0PERI:(DE-600)2844562-4$$a10.1103/5lgz-4t7h$$gVol. 112, no. 6, p. 065301$$n6$$p065301$$tPhysical review / E$$v112$$x2470-0045$$y2025
001048776 8564_ $$uhttps://juser.fz-juelich.de/record/1048776/files/5lgz-4t7h.pdf$$yOpenAccess
001048776 909CO $$ooai:juser.fz-juelich.de:1048776$$popenaire$$popen_access$$pVDB$$pdriver$$pdnbdelivery
001048776 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180150$$aForschungszentrum Jülich$$b0$$kFZJ
001048776 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)156459$$aForschungszentrum Jülich$$b1$$kFZJ
001048776 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)144806$$aForschungszentrum Jülich$$b2$$kFZJ
001048776 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5232$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x0
001048776 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5234$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x1
001048776 9141_ $$y2025
001048776 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)1230$$2StatID$$aDBCoverage$$bCurrent Contents - Electronics and Telecommunications Collection$$d2024-12-10
001048776 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0
001048776 915__ $$0StatID:(DE-HGF)0600$$2StatID$$aDBCoverage$$bEbsco Academic Search$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)1150$$2StatID$$aDBCoverage$$bCurrent Contents - Physical, Chemical and Earth Sciences$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)9900$$2StatID$$aIF < 5$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001048776 915__ $$0StatID:(DE-HGF)0030$$2StatID$$aPeer Review$$bASC$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bPHYS REV E : 2022$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2024-12-10
001048776 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2024-12-10
001048776 920__ $$lyes
001048776 9201_ $$0I:(DE-Juel1)IAS-6-20130828$$kIAS-6$$lComputational and Systems Neuroscience$$x0
001048776 980__ $$ajournal
001048776 980__ $$aVDB
001048776 980__ $$aUNRESTRICTED
001048776 980__ $$aI:(DE-Juel1)IAS-6-20130828
001048776 9801_ $$aFullTexts