TY  - CONF
AU  - Penke, Carolin
AU  - John, Chelsea Maria
AU  - Herten, Andreas
AU  - Ebert, Jan
AU  - Kesselheim, Stefan
AU  - Suarez, Estela
TI  - OpenGPT-X - Training Large Language Models on HPC Systems
M1  - FZJ-2022-03599
PY  - 2022
AB  - Artificial neural networks represent an HPC workload of increasing importance. In particular, the field of Natural Language Processing (NLP) has undergone a revolution in recent years. Training ever larger language models, such as GPT-3, demands large HPC resources and has the potential to greatly impact everyday technology. The OpenGPT-X project, established in 2022, aims not to leave this field to large tech companies but to provide an open, publicly funded alternative based on European values. The Jülich Supercomputing Centre is a consortium partner providing HPC infrastructure for the pre-training of the models. We research optimization potential in the training process, for example by using novel accelerator architectures.
T2  - 14th JLESC Workshop
CY  - Urbana-Champaign, USA
Y2  - 28 Sep 2022 - 30 Sep 2022
M2  - Urbana-Champaign, USA
LB  - PUB:(DE-HGF)24
UR  - https://juser.fz-juelich.de/record/910080
ER  -