%0 Conference Paper
%A Rubin, Noa
%A Fischer, Kirsten
%A Lindner, Javed
%A Dahmen, David
%A Seroussi, Inbar
%A Ringel, Zohar
%A Krämer, Michael
%A Helias, Moritz
%T From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning
%M FZJ-2026-00739
%D 2025
%X Feature learning in neural networks is crucial for their expressive power and inductive biases, motivating various theoretical approaches. Some approaches describe network behavior after training through a change in kernel scale from initialization, resulting in a generalization power comparable to a Gaussian process. Conversely, in other approaches training results in the adaptation of the kernel to the data, involving directional changes to the kernel. The relationship and respective strengths of these two views have so far remained unresolved. This work presents a theoretical framework of multi-scale adaptive feature learning bridging these two views. Using methods from statistical mechanics, we derive analytical expressions for network output statistics which are valid across scaling regimes and in the continuum between them. A systematic expansion of the network’s probability distribution reveals that mean-field scaling requires only a saddle-point approximation, while standard scaling necessitates additional correction terms. Remarkably, we find across regimes that kernel adaptation can be reduced to an effective kernel rescaling when predicting the mean network output in the special case of a linear network. However, for linear and non-linear networks, the multi-scale adaptive approach captures directional feature learning effects, providing richer insights than what could be recovered from a rescaling of the kernel alone.
%B The 42nd International Conference on Machine Learning
%C 13 Jul 2025 - 19 Jul 2025, Vancouver (Canada)
%F PUB:(DE-HGF)6
%9 Conference Presentation
%U https://juser.fz-juelich.de/record/1052069