%0 Conference Paper
%A Penke, Carolin
%T Mathematical Techniques to Reduce Memory Requirements in Deep Learning
%M FZJ-2024-06888
%D 2024
%X We present a method that substantially lowers memory requirements during the training of deep neural networks, based on the GaLore (Gradient Low-Rank Projection) training framework. The singular values of gradient matrices decay rapidly, so low-rank bases can capture the relevant subspaces, which reduces the memory needed to store optimizer states between iterations. A novel, rank-adaptive, GPU-optimized version of the randomized range finder algorithm is employed to exploit this property, and future research directions are discussed.
%B OpenGPT-X Forum 2024
%C 5 Nov 2024 - 5 Nov 2024, Berlin (Germany)
%F PUB:(DE-HGF)6
%9 Conference Presentation
%R 10.34734/FZJ-2024-06888
%U https://juser.fz-juelich.de/record/1034067