001034059 001__ 1034059
001034059 005__ 20250109211141.0
001034059 037__ $$aFZJ-2024-06880
001034059 1001_ $$0P:(DE-Juel1)192254$$aPenke, Carolin$$b0$$eCorresponding author$$ufzj
001034059 1112_ $$aWomen in Data Science Conference Chemnitz$$cChemnitz$$d2024-06-06 - 2024-06-07$$wGermany
001034059 245__ $$aAn Introduction to Large Language Models
001034059 260__ $$c2024
001034059 3367_ $$033$$2EndNote$$aConference Paper
001034059 3367_ $$2DataCite$$aOther
001034059 3367_ $$2BibTeX$$aINPROCEEDINGS
001034059 3367_ $$2DRIVER$$aconferenceObject
001034059 3367_ $$2ORCID$$aLECTURE_SPEECH
001034059 3367_ $$0PUB:(DE-HGF)6$$2PUB:(DE-HGF)$$aConference Presentation$$bconf$$mconf$$s1736416151_16756$$xPlenary/Keynote
001034059 520__ $$aLarge Language Models (LLMs) have revolutionized the field of artificial intelligence, enabling advanced text generation and understanding. This talk provides a concise overview of LLMs, focusing on their development, architecture, and implementation. We explain key concepts and give details on the backbone of modern LLMs: the transformer architecture and its innovative attention mechanism. Training these models on supercomputers requires advanced parallelization techniques. We also identify recent advancements and promising trends. Through the lens of the OpenGPT-X project, this presentation highlights the collaborative efforts in developing multilingual, open-source LLMs.
001034059 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001034059 536__ $$0G:(DE-HGF)POF4-5122$$a5122 - Future Computing & Big Data Systems (POF4-512)$$cPOF4-512$$fPOF IV$$x1
001034059 536__ $$0G:(DE-Juel-1)68GX21007F$$aOpenGPT-X - Aufbau eines Gaia-X Knotens für große KI-Sprachmodelle und innovative Sprachapplikations-Services; Teilvorhaben: Optimierung und Skalierung auf großen HPC-Systemen (68GX21007F)$$c68GX21007F$$x2
001034059 536__ $$0G:(DE-Juel-1)JuWinHPC$$aJuWinHPC - Jülich Women in HPC (JuWinHPC)$$cJuWinHPC$$x3
001034059 8564_ $$uhttps://juser.fz-juelich.de/record/1034059/files/IntroToLLMS_WiDS.pdf$$yRestricted
001034059 909CO $$ooai:juser.fz-juelich.de:1034059$$pVDB
001034059 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)192254$$aForschungszentrum Jülich$$b0$$kFZJ
001034059 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001034059 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5122$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x1
001034059 9141_ $$y2024
001034059 920__ $$lyes
001034059 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001034059 980__ $$aconf
001034059 980__ $$aVDB
001034059 980__ $$aI:(DE-Juel1)JSC-20090406
001034059 980__ $$aUNRESTRICTED