001052069 001__ 1052069
001052069 005__ 20260121204317.0
001052069 037__ $$aFZJ-2026-00739
001052069 041__ $$aEnglish
001052069 1001_ $$0P:(DE-HGF)0$$aRubin, Noa$$b0$$eCorresponding author
001052069 1112_ $$aThe 42nd International Conference on Machine Learning$$cVancouver$$d2025-07-13 - 2025-07-19$$gICML 2025$$wCanada
001052069 245__ $$aFrom Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning
001052069 260__ $$c2025
001052069 3367_ $$033$$2EndNote$$aConference Paper
001052069 3367_ $$2DataCite$$aOther
001052069 3367_ $$2BibTeX$$aINPROCEEDINGS
001052069 3367_ $$2DRIVER$$aconferenceObject
001052069 3367_ $$2ORCID$$aLECTURE_SPEECH
001052069 3367_ $$0PUB:(DE-HGF)6$$2PUB:(DE-HGF)$$aConference Presentation$$bconf$$mconf$$s1768997418_9544$$xAfter Call
001052069 520__ $$aFeature learning in neural networks is crucial for their expressive power and inductive biases, motivating various theoretical approaches. Some approaches describe network behavior after training through a change in kernel scale from initialization, resulting in a generalization power comparable to a Gaussian process. Conversely, in other approaches training results in the adaptation of the kernel to the data, involving directional changes to the kernel. The relationship and respective strengths of these two views have so far remained unresolved. This work presents a theoretical framework of multi-scale adaptive feature learning bridging these two views. Using methods from statistical mechanics, we derive analytical expressions for network output statistics which are valid across scaling regimes and in the continuum between them. A systematic expansion of the network’s probability distribution reveals that mean-field scaling requires only a saddle-point approximation, while standard scaling necessitates additional correction terms. Remarkably, we find across regimes that kernel adaptation can be reduced to an effective kernel rescaling when predicting the mean network output in the special case of a linear network. However, for linear and non-linear networks, the multi-scale adaptive approach captures directional feature learning effects, providing richer insights than what could be recovered from a rescaling of the kernel alone.
001052069 536__ $$0G:(DE-HGF)POF4-5232$$a5232 - Computational Principles (POF4-523)$$cPOF4-523$$fPOF IV$$x0
001052069 536__ $$0G:(DE-HGF)POF4-5234$$a5234 - Emerging NC Architectures (POF4-523)$$cPOF4-523$$fPOF IV$$x1
001052069 536__ $$0G:(DE-Juel1)HGF-SMHB-2014-2018$$aMSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018)$$cHGF-SMHB-2014-2018$$fMSNN$$x2
001052069 536__ $$0G:(DE-HGF)SO-092$$aACA - Advanced Computing Architectures (SO-092)$$cSO-092$$x3
001052069 536__ $$0G:(GEPRIS)368482240$$aGRK 2416 - GRK 2416: MultiSenses-MultiScales: Neue Ansätze zur Aufklärung neuronaler multisensorischer Integration (368482240)$$c368482240$$x4
001052069 65027 $$0V:(DE-MLZ)SciArea-250$$2V:(DE-HGF)$$aOthers$$x0
001052069 7001_ $$0P:(DE-Juel1)180150$$aFischer, Kirsten$$b1$$eCorresponding author$$ufzj
001052069 7001_ $$0P:(DE-Juel1)185990$$aLindner, Javed$$b2$$eCorresponding author$$ufzj
001052069 7001_ $$0P:(DE-Juel1)156459$$aDahmen, David$$b3$$ufzj
001052069 7001_ $$0P:(DE-HGF)0$$aSeroussi, Inbar$$b4
001052069 7001_ $$0P:(DE-HGF)0$$aRingel, Zohar$$b5
001052069 7001_ $$0P:(DE-HGF)0$$aKrämer, Michael$$b6
001052069 7001_ $$0P:(DE-Juel1)144806$$aHelias, Moritz$$b7$$ufzj
001052069 8564_ $$uhttps://icml.cc/virtual/2025/poster/44430
001052069 909CO $$ooai:juser.fz-juelich.de:1052069$$pVDB
001052069 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180150$$aForschungszentrum Jülich$$b1$$kFZJ
001052069 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)185990$$aForschungszentrum Jülich$$b2$$kFZJ
001052069 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)156459$$aForschungszentrum Jülich$$b3$$kFZJ
001052069 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)144806$$aForschungszentrum Jülich$$b7$$kFZJ
001052069 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5232$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x0
001052069 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5234$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x1
001052069 920__ $$lyes
001052069 9201_ $$0I:(DE-Juel1)IAS-6-20130828$$kIAS-6$$lComputational and Systems Neuroscience$$x0
001052069 980__ $$aconf
001052069 980__ $$aVDB
001052069 980__ $$aI:(DE-Juel1)IAS-6-20130828
001052069 980__ $$aUNRESTRICTED