Conference Presentation (Plenary/Keynote) FZJ-2024-06880

An Introduction to Large Language Models



2024

Women in Data Science Conference Chemnitz, Chemnitz, Germany, 6 Jun 2024 - 7 Jun 2024

Abstract: Large Language Models (LLMs) have revolutionized the field of artificial intelligence, enabling advanced text generation and understanding. This talk provides a concise overview of LLMs, focusing on their development, architecture, and implementation. We explain key concepts and detail the backbone of modern LLMs: the transformer architecture and its attention mechanism. Training these models on supercomputers requires advanced parallelization techniques. We also identify recent advancements and promising trends. Through the lens of the OpenGPT-X project, the presentation highlights the collaborative efforts in developing multilingual, open-source LLMs.


Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
  2. 5122 - Future Computing & Big Data Systems (POF4-512)
  3. OpenGPT-X - Building a Gaia-X node for large AI language models and innovative language application services; subproject: optimization and scaling on large HPC systems (68GX21007F)
  4. JuWinHPC - Jülich Women in HPC (JuWinHPC)

Appears in the scientific report 2024

The record appears in these collections:
Document types > Presentations > Conference talks
Workflow collections > Public records
Institute collections > JSC
Publications database

Record created 2024-12-11, last modified 2025-01-09

