Conference Presentation (After Call) FZJ-2025-02060

Response functions in residual networks as a measure for signal propagation


2025

DPG Spring Meeting of the Condensed Matter Section, Regensburg, Germany, 16 Mar 2025 - 21 Mar 2025

Abstract: Residual networks (ResNets) demonstrate superior trainability and performance compared to feed-forward networks, particularly at greater depths, due to the introduction of skip connections that enhance signal propagation to deeper layers. Prior studies have shown that incorporating a scaling parameter into the residual branch can further improve generalization performance. However, the underlying mechanisms behind these effects and their robustness across network hyperparameters remain unclear. For feed-forward networks, finite-size theories have proven valuable in understanding signal propagation and optimizing hyperparameters. Extending this approach to ResNets, we develop a finite-size field theory to systematically analyze signal propagation and its dependence on the residual branch's scaling parameter. Through this framework, we derive analytical expressions for the response function, which measures the network's sensitivity to varying inputs. We obtain a formula for the optimal scaling parameter, revealing that it depends minimally on other hyperparameters, such as weight variance, thereby explaining its universality across hyperparameter configurations.
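
Illustrative sketch (not part of the presentation): the abstract describes a scaling parameter on the residual branch and a response function that measures sensitivity to inputs. The Python snippet below is a purely numerical toy version of that idea, not the finite-size field theory of the talk. It assumes a hypothetical layer update x <- x + alpha * W tanh(x), Gaussian weights with variance weight_variance / width, and a finite-difference estimate of the output change per unit input perturbation. All function names, the tanh nonlinearity, and the parameter values are assumptions made for illustration.

```python
import numpy as np


def make_weights(depth, width, weight_variance, seed=0):
    """Draw one Gaussian weight matrix per layer (assumed scaling: variance / width)."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(weight_variance / width)
    return [std * rng.standard_normal((width, width)) for _ in range(depth)]


def resnet_forward(x, weights, alpha):
    """Propagate x through residual layers x <- x + alpha * W @ tanh(x).

    alpha is the residual-branch scaling parameter; alpha = 0 gives the identity map.
    """
    for W in weights:
        x = x + alpha * W @ np.tanh(x)
    return x


def empirical_response(x, weights, alpha, eps=1e-5, seed=0):
    """Central finite-difference estimate of output sensitivity along a random unit direction."""
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(x.shape)
    direction /= np.linalg.norm(direction)
    out_plus = resnet_forward(x + eps * direction, weights, alpha)
    out_minus = resnet_forward(x - eps * direction, weights, alpha)
    return np.linalg.norm(out_plus - out_minus) / (2 * eps)


if __name__ == "__main__":
    weights = make_weights(depth=20, width=64, weight_variance=1.0)
    x0 = np.ones(64)
    # Scan the scaling parameter to see how the sensitivity changes with alpha.
    for alpha in (0.0, 0.25, 0.5, 1.0):
        print(alpha, empirical_response(x0, weights, alpha))
```

Repeating the scan for several values of weight_variance gives a rough empirical picture of how the sensitivity depends on the two hyperparameters; any agreement with the analytical result mentioned in the abstract would have to be checked against the presentation itself.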


Contributing Institute(s):
  1. Computational and Systems Neuroscience (IAS-6)
Research Program(s):
  1. 5232 - Computational Principles (POF4-523)
  2. 5234 - Emerging NC Architectures (POF4-523)
  3. MSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018)
  4. RenormalizedFlows - Transparent Deep Learning with Renormalized Flows (BMBF-01IS19077A)
  5. ACA - Advanced Computing Architectures (SO-092)
  6. neuroIC002 - Recurrence and stochasticity for neuro-inspired computation (EXS-SF-neuroIC002)
  7. GRK 2416 - GRK 2416: MultiSenses-MultiScales: Neue Ansätze zur Aufklärung neuronaler multisensorischer Integration (368482240)

Appears in the scientific report 2025


 Record created 2025-03-24, last modified 2025-04-11

