%0 Conference Paper
%A John, Chelsea Maria
%A Ebert, Jan
%A Penke, Carolin
%A Kesselheim, Stefan
%A Herten, Andreas
%T OpenGPT-X – Training Large Language Models on HPC Systems
%M FZJ-2023-02173
%D 2023
%X OpenGPT-X is a German initiative to build and train large language models (LLMs). The project aims to provide an open alternative to LLMs that have so far been proprietary, along with a platform for researching methods to train multilingual LLMs efficiently. To that end, the project not only utilizes the state of the art in model training but also incorporates new methods, algorithms, and tools. Models trained within the project will be published and used for pilot language services by industry partners. In addition, further applications are expected through the Gaia-X federation. LLMs can scale to more than 175 billion parameters, which requires efficient usage of supercomputers such as JUWELS Booster. Especially in light of the recent successes of ChatGPT, our work clearly indicates that the infrastructure of supercomputing centres and initiatives aiming to provide resources to the public can have a large societal impact. This poster outlines the initial progress and future work of the project from the perspective of the Jülich Supercomputing Centre (JSC).
%B ISC High Performance 2023
%C 21 May 2023 - 25 May 2023, Hamburg (Germany)
%K HPC (Other)
%K GPU (Other)
%K OpenGPT-X (Other)
%F PUB:(DE-HGF)24
%9 Poster
%R 10.34732/XDVBLG-SVNDMJ
%U https://juser.fz-juelich.de/record/1007707