001029334 001__ 1029334
001029334 005__ 20241108205838.0
001029334 037__ $$aFZJ-2024-05061
001029334 1001_ $$0P:(DE-Juel1)180150$$aFischer, Kirsten$$b0$$eCorresponding author$$ufzj
001029334 1112_ $$aThe Forty-first International Conference on Machine Learning$$cWien$$d2024-07-21 - 2024-07-27$$wAustria
001029334 245__ $$aCritical feature learning in deep neural networks
001029334 260__ $$c2024
001029334 3367_ $$033$$2EndNote$$aConference Paper
001029334 3367_ $$2BibTeX$$aINPROCEEDINGS
001029334 3367_ $$2DRIVER$$aconferenceObject
001029334 3367_ $$2ORCID$$aCONFERENCE_POSTER
001029334 3367_ $$2DataCite$$aOutput Types/Conference Poster
001029334 3367_ $$0PUB:(DE-HGF)24$$2PUB:(DE-HGF)$$aPoster$$bposter$$mposter$$s1731048220_24645$$xAfter Call
001029334 520__ $$aA key property of neural networks driving their success is their ability to learn features from data. Understanding feature learning from a theoretical viewpoint is an emerging field with many open questions. In this work we capture finite-width effects with a systematic theory of network kernels in deep non-linear neural networks. We show that the Bayesian prior of the network can be written in closed form as a superposition of Gaussian processes, whose kernels are distributed with a variance that depends inversely on the network width N. A large deviation approach, which is exact in the proportional limit for the number of data points P=αN→∞, yields a pair of forward-backward equations for the maximum a posteriori kernels in all layers at once. We study their solutions perturbatively to demonstrate how the backward propagation across layers aligns kernels with the target. An alternative field-theoretic formulation shows that kernel adaptation of the Bayesian posterior at finite-width results from fluctuations in the prior: larger fluctuations correspond to a more flexible network prior and thus enable stronger adaptation to data. We thus find a bridge between the classical edge-of-chaos NNGP theory and feature learning, exposing an intricate interplay between criticality, response functions, and feature scale.
001029334 536__ $$0G:(DE-HGF)POF4-5232$$a5232 - Computational Principles (POF4-523)$$cPOF4-523$$fPOF IV$$x0
001029334 536__ $$0G:(DE-HGF)POF4-5234$$a5234 - Emerging NC Architectures (POF4-523)$$cPOF4-523$$fPOF IV$$x1
001029334 536__ $$0G:(DE-Juel-1)BMBF-01IS19077A$$aRenormalizedFlows - Transparent Deep Learning with Renormalized Flows (BMBF-01IS19077A)$$cBMBF-01IS19077A$$x2
001029334 536__ $$0G:(DE-Juel1)HGF-SMHB-2014-2018$$aMSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018)$$cHGF-SMHB-2014-2018$$fMSNN$$x3
001029334 536__ $$0G:(DE-HGF)SO-092$$aACA - Advanced Computing Architectures (SO-092)$$cSO-092$$x4
001029334 7001_ $$0P:(DE-Juel1)185990$$aLindner, Javed$$b1$$eCorresponding author$$ufzj
001029334 7001_ $$0P:(DE-Juel1)156459$$aDahmen, David$$b2$$ufzj
001029334 7001_ $$0P:(DE-HGF)0$$aRingel, Zohar$$b3
001029334 7001_ $$0P:(DE-HGF)0$$aKrämer, Michael$$b4
001029334 7001_ $$0P:(DE-Juel1)144806$$aHelias, Moritz$$b5$$ufzj
001029334 909CO $$ooai:juser.fz-juelich.de:1029334$$pVDB
001029334 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180150$$aForschungszentrum Jülich$$b0$$kFZJ
001029334 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)185990$$aForschungszentrum Jülich$$b1$$kFZJ
001029334 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)156459$$aForschungszentrum Jülich$$b2$$kFZJ
001029334 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)144806$$aForschungszentrum Jülich$$b5$$kFZJ
001029334 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5232$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x0
001029334 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5234$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x1
001029334 9141_ $$y2024
001029334 920__ $$lyes
001029334 9201_ $$0I:(DE-Juel1)IAS-6-20130828$$kIAS-6$$lComputational and Systems Neuroscience$$x0
001029334 980__ $$aposter
001029334 980__ $$aVDB
001029334 980__ $$aI:(DE-Juel1)IAS-6-20130828
001029334 980__ $$aUNRESTRICTED