TY  - CONF
AU  - Penke, Carolin
TI  - Mathematical Techniques to Reduce Memory Requirements in Deep Learning
M1  - FZJ-2024-06888
PY  - 2024
AB  - We present a method to substantially lower memory requirements during the training of deep neural networks, based on the GaLore (Gradient Low-Rank Projection) training framework. A rapid decay of singular values in gradient matrices permits the use of low-rank bases to capture the relevant subspaces, reducing the memory required to store optimizer states between iterations. A novel, rank-adaptive, GPU-optimized version of the randomized range finder algorithm is employed to exploit this property, and future research directions are discussed.
T2  - OpenGPT-X Forum 2024
CY  - 5 Nov 2024 - 5 Nov 2024, Berlin (Germany)
Y2  - 5 Nov 2024 - 5 Nov 2024
M2  - Berlin, Germany
LB  - PUB:(DE-HGF)6
DO  - 10.34734/FZJ-2024-06888
UR  - https://juser.fz-juelich.de/record/1034067
ER  -