Poster (After Call) FZJ-2024-05061

http://join2-wiki.gsi.de/foswiki/pub/Main/Artwork/join2_logo100x88.png
Critical feature learning in deep neural networks

 ;  ;  ;  ;  ;

2024

The Forty-first International Conference on Machine Learning, WienWien, Austria, 21 Jul 2024 - 27 Jul 20242024-07-212024-07-27

Abstract: A key property of neural networks driving their success is their ability to learn features from data. Understanding feature learning from a theoretical viewpoint is an emerging field with many open questions. In this work we capture finite-width effects with a systematic theory of network kernels in deep non-linear neural networks. We show that the Bayesian prior of the network can be written in closed form as a superposition of Gaussian processes, whose kernels are distributed with a variance that depends inversely on the network width N . A large deviation approach, which is exact in the proportional limit for the number of data points P=αN→∞, yields a pair of forward-backward equations for the maximum a posteriori kernels in all layers at once. We study their solutions perturbatively to demonstrate how the backward propagation across layers aligns kernels with the target. An alternative field-theoretic formulation shows that kernel adaptation of the Bayesian posterior at finite-width results from fluctuations in the prior: larger fluctuations correspond to a more flexible network prior and thus enable stronger adaptation to data. We thus find a bridge between the classical edge-of-chaos NNGP theory and feature learning, exposing an intricate interplay between criticality, response functions, and feature scale.


Contributing Institute(s):
  1. Computational and Systems Neuroscience (IAS-6)
Research Program(s):
  1. 5232 - Computational Principles (POF4-523) (POF4-523)
  2. 5234 - Emerging NC Architectures (POF4-523) (POF4-523)
  3. RenormalizedFlows - Transparent Deep Learning with Renormalized Flows (BMBF-01IS19077A) (BMBF-01IS19077A)
  4. MSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018) (HGF-SMHB-2014-2018)
  5. ACA - Advanced Computing Architectures (SO-092) (SO-092)

Appears in the scientific report 2024
Click to display QR Code for this record

The record appears in these collections:
Dokumenttypen > Präsentationen > Poster
Institutssammlungen > IAS > IAS-6
Workflowsammlungen > Öffentliche Einträge
Publikationsdatenbank

 Datensatz erzeugt am 2024-07-29, letzte Änderung am 2024-11-08



Dieses Dokument bewerten:

Rate this document:
1
2
3
 
(Bisher nicht rezensiert)