001     1048776
005     20251222202220.0
024 7 _ |a 10.1103/5lgz-4t7h
|2 doi
024 7 _ |a 2470-0045
|2 ISSN
024 7 _ |a 2470-0061
|2 ISSN
024 7 _ |a 1063-651X
|2 ISSN
024 7 _ |a 1095-3787
|2 ISSN
024 7 _ |a 1538-4519
|2 ISSN
024 7 _ |a 1539-3755
|2 ISSN
024 7 _ |a 1550-2376
|2 ISSN
024 7 _ |a 2470-0053
|2 ISSN
024 7 _ |a 10.34734/FZJ-2025-04891
|2 datacite_doi
037 _ _ |a FZJ-2025-04891
082 _ _ |a 530
100 1 _ |a Fischer, Kirsten
|0 P:(DE-Juel1)180150
|b 0
|e Corresponding author
245 _ _ |a Field theory for optimal signal propagation in residual networks
260 _ _ |a Woodbury, NY
|c 2025
|b Inst.
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1766392108_24935
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a Residual networks have significantly better trainability and thus performance than feed-forward networks at large depth. Introducing skip connections facilitates signal propagation to deeper layers. In addition, previous works found that adding a scaling parameter for the residual branch further improves generalization performance. While they empirically identified a particularly beneficial range of values for this scaling parameter, the mechanism for the resulting performance improvement and its universality across network hyperparameters remain an open question. For feed-forward networks, finite-size theories have led to important insights with regard to signal propagation and hyperparameter tuning. We here derive a systematic finite-size field theory for residual networks to study signal propagation and its dependence on the scaling for the residual branch. We derive analytical expressions for the response function, a measure for the network’s sensitivity to inputs, and show that for deep networks the empirically found values for the scaling parameter lie within the range of maximal sensitivity. Furthermore, we obtain an analytical expression for the optimal scaling parameter that depends only weakly on other network hyperparameters, such as the weight variance, thereby explaining its universality across hyperparameters. Overall, this work provides a theoretical framework to study ResNets at finite size.
536 _ _ |a 5232 - Computational Principles (POF4-523)
|0 G:(DE-HGF)POF4-5232
|c POF4-523
|f POF IV
|x 0
536 _ _ |a 5234 - Emerging NC Architectures (POF4-523)
|0 G:(DE-HGF)POF4-5234
|c POF4-523
|f POF IV
|x 1
536 _ _ |a RenormalizedFlows - Transparent Deep Learning with Renormalized Flows (BMBF-01IS19077A)
|0 G:(DE-Juel-1)BMBF-01IS19077A
|c BMBF-01IS19077A
|x 2
536 _ _ |a MSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018)
|0 G:(DE-Juel1)HGF-SMHB-2014-2018
|c HGF-SMHB-2014-2018
|f MSNN
|x 3
536 _ _ |a ACA - Advanced Computing Architectures (SO-092)
|0 G:(DE-HGF)SO-092
|c SO-092
|x 4
536 _ _ |a neuroIC002 - Recurrence and stochasticity for neuro-inspired computation (EXS-SF-neuroIC002)
|0 G:(DE-82)EXS-SF-neuroIC002
|c EXS-SF-neuroIC002
|x 5
536 _ _ |a DFG project G:(GEPRIS)491111487 - Open-Access-Publikationskosten / 2025 - 2027 / Forschungszentrum Jülich (OAPKFZJ) (491111487)
|0 G:(GEPRIS)491111487
|c 491111487
|x 6
588 _ _ |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de
700 1 _ |a Dahmen, David
|0 P:(DE-Juel1)156459
|b 1
700 1 _ |a Helias, Moritz
|0 P:(DE-Juel1)144806
|b 2
773 _ _ |a 10.1103/5lgz-4t7h
|g Vol. 112, no. 6, p. 065301
|0 PERI:(DE-600)2844562-4
|n 6
|p 065301
|t Physical review / E
|v 112
|y 2025
|x 2470-0045
856 4 _ |u https://juser.fz-juelich.de/record/1048776/files/5lgz-4t7h.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1048776
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)180150
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)156459
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)144806
913 1 _ |a DE-HGF
|b Key Technologies
|l Natural, Artificial and Cognitive Information Processing
|1 G:(DE-HGF)POF4-520
|0 G:(DE-HGF)POF4-523
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Neuromorphic Computing and Network Dynamics
|9 G:(DE-HGF)POF4-5232
|x 0
913 1 _ |a DE-HGF
|b Key Technologies
|l Natural, Artificial and Cognitive Information Processing
|1 G:(DE-HGF)POF4-520
|0 G:(DE-HGF)POF4-523
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Neuromorphic Computing and Network Dynamics
|9 G:(DE-HGF)POF4-5234
|x 1
914 1 _ |y 2025
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
|d 2024-12-10
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
|d 2024-12-10
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1230
|2 StatID
|b Current Contents - Electronics and Telecommunications Collection
|d 2024-12-10
915 _ _ |a Creative Commons Attribution CC BY 4.0
|0 LIC:(DE-HGF)CCBY4
|2 HGFVOC
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0600
|2 StatID
|b Ebsco Academic Search
|d 2024-12-10
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1150
|2 StatID
|b Current Contents - Physical, Chemical and Earth Sciences
|d 2024-12-10
915 _ _ |a WoS
|0 StatID:(DE-HGF)0113
|2 StatID
|b Science Citation Index Expanded
|d 2024-12-10
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
|d 2024-12-10
915 _ _ |a IF < 5
|0 StatID:(DE-HGF)9900
|2 StatID
|d 2024-12-10
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b ASC
|d 2024-12-10
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b PHYS REV E : 2022
|d 2024-12-10
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0160
|2 StatID
|b Essential Science Indicators
|d 2024-12-10
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Clarivate Analytics Master Journal List
|d 2024-12-10
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)IAS-6-20130828
|k IAS-6
|l Computational and Systems Neuroscience
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)IAS-6-20130828
980 1 _ |a FullTexts

