TY - CONF
AU - Fischer, Kirsten
AU - Dahmen, David
AU - Helias, Moritz
TI - Response functions in residual networks as a measure for signal propagation
M1 - FZJ-2025-02060
PY - 2025
AB - Residual networks (ResNets) demonstrate superior trainability and performance compared to feed-forward networks, particularly at greater depths, due to the introduction of skip connections that enhance signal propagation to deeper layers. Prior studies have shown that incorporating a scaling parameter into the residual branch can further improve generalization performance. However, the underlying mechanisms behind these effects and their robustness across network hyperparameters remain unclear. For feed-forward networks, finite-size theories have proven valuable in understanding signal propagation and optimizing hyperparameters. Extending this approach to ResNets, we develop a finite-size field theory to systematically analyze signal propagation and its dependence on the residual branch's scaling parameter. Through this framework, we derive analytical expressions for the response function, which measures the network's sensitivity to varying inputs. We obtain a formula for the optimal scaling parameter, revealing that it depends minimally on other hyperparameters, such as weight variance, thereby explaining its universality across hyperparameter configurations.
T2 - DPG Spring Meeting of the Condensed Matter Section
CY - 16 Mar 2025 - 21 Mar 2025, Regensburg (Germany)
Y2 - 16 Mar 2025 - 21 Mar 2025
M2 - Regensburg, Germany
LB - PUB:(DE-HGF)6
UR - https://juser.fz-juelich.de/record/1040930
ER -