TY - CONF
AU - Penke, Carolin
TI - Mathematical Techniques to Reduce Memory Requirements in Deep Learning
M1 - FZJ-2024-06888
PY - 2024
AB - We present a method to substantially lower memory requirements during the training of deep neural networks, based on the GaLore (Gradient Low-Rank Projection) training framework. The rapid decay of singular values in gradient matrices permits the use of low-rank bases to capture the relevant subspaces, reducing the memory required to store optimizer states between iterations. A novel, rank-adaptive, GPU-optimized version of the randomized range finder algorithm is employed to exploit this property, and future research directions are discussed.
T2 - OpenGPT-X Forum 2024
CY - 5 Nov 2024 - 5 Nov 2024, Berlin (Germany)
Y2 - 5 Nov 2024 - 5 Nov 2024
M2 - Berlin, Germany
LB - PUB:(DE-HGF)6
DO - 10.34734/FZJ-2024-06888
UR - https://juser.fz-juelich.de/record/1034067
ER -