TY  - CONF
AU  - Rubin, Noa
AU  - Fischer, Kirsten
AU  - Lindner, Javed
AU  - Dahmen, David
AU  - Seroussi, Inbar
AU  - Ringel, Zohar
AU  - Krämer, Michael
AU  - Helias, Moritz
TI  - From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning
M1  - FZJ-2026-00739
PY  - 2025
AB  - Feature learning in neural networks is crucial for their expressive power and inductive biases, motivating various theoretical approaches. Some approaches describe network behavior after training through a change in kernel scale from initialization, resulting in a generalization power comparable to a Gaussian process. Conversely, in other approaches training results in the adaptation of the kernel to the data, involving directional changes to the kernel. The relationship and respective strengths of these two views have so far remained unresolved. This work presents a theoretical framework of multi-scale adaptive feature learning bridging these two views. Using methods from statistical mechanics, we derive analytical expressions for network output statistics which are valid across scaling regimes and in the continuum between them. A systematic expansion of the network’s probability distribution reveals that mean-field scaling requires only a saddle-point approximation, while standard scaling necessitates additional correction terms. Remarkably, we find across regimes that kernel adaptation can be reduced to an effective kernel rescaling when predicting the mean network output in the special case of a linear network. However, for linear and non-linear networks, the multi-scale adaptive approach captures directional feature learning effects, providing richer insights than what could be recovered from a rescaling of the kernel alone.
T2  - The 42nd International Conference on Machine Learning
CY  - 13 Jul 2025 - 19 Jul 2025, Vancouver (Canada)
Y2  - 13 Jul 2025 - 19 Jul 2025
M2  - Vancouver, Canada
LB  - PUB:(DE-HGF)6
UR  - https://juser.fz-juelich.de/record/1052069
ER  -