| 001 | 1029334 | ||
| 005 | 20241108205838.0 | ||
| 037 | _ | _ | |a FZJ-2024-05061 |
| 100 | 1 | _ | |a Fischer, Kirsten |0 P:(DE-Juel1)180150 |b 0 |e Corresponding author |u fzj |
| 111 | 2 | _ | |a The Forty-first International Conference on Machine Learning |c Wien |d 2024-07-21 - 2024-07-27 |w Austria |
| 245 | _ | _ | |a Critical feature learning in deep neural networks |
| 260 | _ | _ | |c 2024 |
| 336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
| 336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
| 336 | 7 | _ | |a conferenceObject |2 DRIVER |
| 336 | 7 | _ | |a CONFERENCE_POSTER |2 ORCID |
| 336 | 7 | _ | |a Output Types/Conference Poster |2 DataCite |
| 336 | 7 | _ | |a Poster |b poster |m poster |0 PUB:(DE-HGF)24 |s 1731048220_24645 |2 PUB:(DE-HGF) |x After Call |
| 520 | _ | _ | |a A key property of neural networks driving their success is their ability to learn features from data. Understanding feature learning from a theoretical viewpoint is an emerging field with many open questions. In this work we capture finite-width effects with a systematic theory of network kernels in deep non-linear neural networks. We show that the Bayesian prior of the network can be written in closed form as a superposition of Gaussian processes, whose kernels are distributed with a variance that depends inversely on the network width N. A large deviation approach, which is exact in the proportional limit for the number of data points P=αN→∞, yields a pair of forward-backward equations for the maximum a posteriori kernels in all layers at once. We study their solutions perturbatively to demonstrate how the backward propagation across layers aligns kernels with the target. An alternative field-theoretic formulation shows that kernel adaptation of the Bayesian posterior at finite width results from fluctuations in the prior: larger fluctuations correspond to a more flexible network prior and thus enable stronger adaptation to data. We thus find a bridge between the classical edge-of-chaos NNGP theory and feature learning, exposing an intricate interplay between criticality, response functions, and feature scale. |
| 536 | _ | _ | |a 5232 - Computational Principles (POF4-523) |0 G:(DE-HGF)POF4-5232 |c POF4-523 |f POF IV |x 0 |
| 536 | _ | _ | |a 5234 - Emerging NC Architectures (POF4-523) |0 G:(DE-HGF)POF4-5234 |c POF4-523 |f POF IV |x 1 |
| 536 | _ | _ | |a RenormalizedFlows - Transparent Deep Learning with Renormalized Flows (BMBF-01IS19077A) |0 G:(DE-Juel-1)BMBF-01IS19077A |c BMBF-01IS19077A |x 2 |
| 536 | _ | _ | |a MSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018) |0 G:(DE-Juel1)HGF-SMHB-2014-2018 |c HGF-SMHB-2014-2018 |f MSNN |x 3 |
| 536 | _ | _ | |a ACA - Advanced Computing Architectures (SO-092) |0 G:(DE-HGF)SO-092 |c SO-092 |x 4 |
| 700 | 1 | _ | |a Lindner, Javed |0 P:(DE-Juel1)185990 |b 1 |e Corresponding author |u fzj |
| 700 | 1 | _ | |a Dahmen, David |0 P:(DE-Juel1)156459 |b 2 |u fzj |
| 700 | 1 | _ | |a Ringel, Zohar |0 P:(DE-HGF)0 |b 3 |
| 700 | 1 | _ | |a Krämer, Michael |0 P:(DE-HGF)0 |b 4 |
| 700 | 1 | _ | |a Helias, Moritz |0 P:(DE-Juel1)144806 |b 5 |u fzj |
| 909 | C | O | |o oai:juser.fz-juelich.de:1029334 |p VDB |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)180150 |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)185990 |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 2 |6 P:(DE-Juel1)156459 |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 5 |6 P:(DE-Juel1)144806 |
| 913 | 1 | _ | |a DE-HGF |b Key Technologies |l Natural, Artificial and Cognitive Information Processing |1 G:(DE-HGF)POF4-520 |0 G:(DE-HGF)POF4-523 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Neuromorphic Computing and Network Dynamics |9 G:(DE-HGF)POF4-5232 |x 0 |
| 913 | 1 | _ | |a DE-HGF |b Key Technologies |l Natural, Artificial and Cognitive Information Processing |1 G:(DE-HGF)POF4-520 |0 G:(DE-HGF)POF4-523 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Neuromorphic Computing and Network Dynamics |9 G:(DE-HGF)POF4-5234 |x 1 |
| 914 | 1 | _ | |y 2024 |
| 920 | _ | _ | |l yes |
| 920 | 1 | _ | |0 I:(DE-Juel1)IAS-6-20130828 |k IAS-6 |l Computational and Systems Neuroscience |x 0 |
| 980 | _ | _ | |a poster |
| 980 | _ | _ | |a VDB |
| 980 | _ | _ | |a I:(DE-Juel1)IAS-6-20130828 |
| 980 | _ | _ | |a UNRESTRICTED |