Response functions in residual networks as a measure for signal propagation

Fischer, Kirsten; Dahmen, David; Helias, Moritz
001040930 001__ 1040930
001040930 005__ 20250411203140.0
001040930 037__ $$aFZJ-2025-02060
001040930 1001_ $$0P:(DE-Juel1)180150$$aFischer, Kirsten$$b0$$eCorresponding author$$ufzj
001040930 1112_ $$aDPG Spring Meeting of the Condensed Matter Section$$cRegensburg$$d2025-03-16 - 2025-03-21$$wGermany
001040930 245__ $$aResponse functions in residual networks as a measure for signal propagation
001040930 260__ $$c2025
001040930 3367_ $$033$$2EndNote$$aConference Paper
001040930 3367_ $$2DataCite$$aOther
001040930 3367_ $$2BibTeX$$aINPROCEEDINGS
001040930 3367_ $$2DRIVER$$aconferenceObject
001040930 3367_ $$2ORCID$$aLECTURE_SPEECH
001040930 3367_ $$0PUB:(DE-HGF)6$$2PUB:(DE-HGF)$$aConference Presentation$$bconf$$mconf$$s1744375014_4023$$xAfter Call
001040930 520__ $$aResidual networks (ResNets) demonstrate superior trainability and performance compared to feed-forward networks, particularly at greater depths, due to the introduction of skip connections that enhance signal propagation to deeper layers. Prior studies have shown that incorporating a scaling parameter into the residual branch can further improve generalization performance. However, the underlying mechanisms behind these effects and their robustness across network hyperparameters remain unclear. For feed-forward networks, finite-size theories have proven valuable in understanding signal propagation and optimizing hyperparameters. Extending this approach to ResNets, we develop a finite-size field theory to systematically analyze signal propagation and its dependence on the residual branch's scaling parameter. Through this framework, we derive analytical expressions for the response function, which measures the network's sensitivity to varying inputs. We obtain a formula for the optimal scaling parameter, revealing that it depends minimally on other hyperparameters, such as weight variance, thereby explaining its universality across hyperparameter configurations.
001040930 536__ $$0G:(DE-HGF)POF4-5232$$a5232 - Computational Principles (POF4-523)$$cPOF4-523$$fPOF IV$$x0
001040930 536__ $$0G:(DE-HGF)POF4-5234$$a5234 - Emerging NC Architectures (POF4-523)$$cPOF4-523$$fPOF IV$$x1
001040930 536__ $$0G:(DE-Juel1)HGF-SMHB-2014-2018$$aMSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018)$$cHGF-SMHB-2014-2018$$fMSNN$$x2
001040930 536__ $$0G:(DE-Juel-1)BMBF-01IS19077A$$aRenormalizedFlows - Transparent Deep Learning with Renormalized Flows (BMBF-01IS19077A)$$cBMBF-01IS19077A$$x3
001040930 536__ $$0G:(DE-HGF)SO-092$$aACA - Advanced Computing Architectures (SO-092)$$cSO-092$$x4
001040930 536__ $$0G:(DE-82)EXS-SF-neuroIC002$$aneuroIC002 - Recurrence and stochasticity for neuro-inspired computation (EXS-SF-neuroIC002)$$cEXS-SF-neuroIC002$$x5
001040930 536__ $$0G:(GEPRIS)368482240$$aGRK 2416 - GRK 2416: MultiSenses-MultiScales: Neue Ansätze zur Aufklärung neuronaler multisensorischer Integration (368482240)$$c368482240$$x6
001040930 7001_ $$0P:(DE-Juel1)156459$$aDahmen, David$$b1$$ufzj
001040930 7001_ $$0P:(DE-Juel1)144806$$aHelias, Moritz$$b2$$ufzj
001040930 8564_ $$uhttps://www.dpg-verhandlungen.de/year/2025/conference/regensburg/part/soe/session/7/contribution/4
001040930 909CO $$ooai:juser.fz-juelich.de:1040930$$pVDB
001040930 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180150$$aForschungszentrum Jülich$$b0$$kFZJ
001040930 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)156459$$aForschungszentrum Jülich$$b1$$kFZJ
001040930 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)144806$$aForschungszentrum Jülich$$b2$$kFZJ
001040930 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5232$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x0
001040930 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5234$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x1
001040930 9141_ $$y2025
001040930 9201_ $$0I:(DE-Juel1)IAS-6-20130828$$kIAS-6$$lComputational and Systems Neuroscience$$x0
001040930 980__ $$aconf
001040930 980__ $$aVDB
001040930 980__ $$aI:(DE-Juel1)IAS-6-20130828
001040930 980__ $$aUNRESTRICTED