000910080 001__ 910080
000910080 005__ 20250822121514.0
000910080 0247_ $$2Handle$$a2128/32006
000910080 037__ $$aFZJ-2022-03599
000910080 041__ $$aEnglish
000910080 1001_ $$0P:(DE-Juel1)192254$$aPenke, Carolin$$b0$$eCorresponding author$$ufzj
000910080 1112_ $$a14th JLESC Workshop$$cUrbana-Champaign$$d2022-09-28 - 2022-09-30$$wUSA
000910080 245__ $$aOpenGPT-X - Training Large Language Models on HPC Systems
000910080 260__ $$c2022
000910080 3367_ $$033$$2EndNote$$aConference Paper
000910080 3367_ $$2BibTeX$$aINPROCEEDINGS
000910080 3367_ $$2DRIVER$$aconferenceObject
000910080 3367_ $$2ORCID$$aCONFERENCE_POSTER
000910080 3367_ $$2DataCite$$aOutput Types/Conference Poster
000910080 3367_ $$0PUB:(DE-HGF)24$$2PUB:(DE-HGF)$$aPoster$$bposter$$mposter$$s1665055854_11828$$xAfter Call
000910080 520__ $$aArtificial neural networks represent an HPC workload of increasing importance. In particular, the field of Natural Language Processing (NLP) has been undergoing a revolution in recent years. The training of ever larger language models, such as GPT-3, demands large HPC resources and has the potential to greatly impact everyday technology. The OpenGPT-X project was established in 2022 and aims not to leave this field to large tech companies but to provide an open, publicly funded alternative based on European values. The Jülich Supercomputing Centre is a consortium partner providing HPC infrastructure for the pre-training of the models. We research the optimization potential in the training process, for example by using novel accelerator architectures.
000910080 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
000910080 536__ $$0G:(DE-Juel-1)ATML-X-DEV$$aATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)$$cATML-X-DEV$$x1
000910080 7001_ $$0P:(DE-Juel1)187395$$aJohn, Chelsea Maria$$b1$$ufzj
000910080 7001_ $$0P:(DE-Juel1)145478$$aHerten, Andreas$$b2$$ufzj
000910080 7001_ $$0P:(DE-Juel1)187002$$aEbert, Jan$$b3$$ufzj
000910080 7001_ $$0P:(DE-Juel1)185654$$aKesselheim, Stefan$$b4$$ufzj
000910080 7001_ $$0P:(DE-Juel1)142361$$aSuarez, Estela$$b5$$ufzj
000910080 8564_ $$uhttps://juser.fz-juelich.de/record/910080/files/OpenGPTX-Poster.pdf$$yOpenAccess
000910080 909CO $$ooai:juser.fz-juelich.de:910080$$pdriver$$pVDB$$popen_access$$popenaire
000910080 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000910080 9141_ $$y2022
000910080 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)192254$$aForschungszentrum Jülich$$b0$$kFZJ
000910080 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)187395$$aForschungszentrum Jülich$$b1$$kFZJ
000910080 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)145478$$aForschungszentrum Jülich$$b2$$kFZJ
000910080 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)187002$$aForschungszentrum Jülich$$b3$$kFZJ
000910080 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)185654$$aForschungszentrum Jülich$$b4$$kFZJ
000910080 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)142361$$aForschungszentrum Jülich$$b5$$kFZJ
000910080 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
000910080 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000910080 980__ $$aOPENSCIENCE
000910080 9801_ $$aFullTexts
000910080 980__ $$aposter
000910080 980__ $$aVDB
000910080 980__ $$aUNRESTRICTED
000910080 980__ $$aI:(DE-Juel1)JSC-20090406