001 | 1019998 | ||
005 | 20240105202147.0 | ||
037 | _ | _ | |a FZJ-2023-05813 |
100 | 1 | _ | |a Hoppe, Fabian |0 P:(DE-HGF)0 |b 0 |e Corresponding author |
111 | 2 | _ | |a Helmholtz AI Conference |c Hamburg |d 2023-06-12 - 2023-06-14 |w Germany |
245 | _ | _ | |a Scaling data-intensive analytics with Heat: a Python library for massively-parallel array computing and machine learning |
260 | _ | _ | |c 2023 |
336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
336 | 7 | _ | |a Other |2 DataCite |
336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
336 | 7 | _ | |a conferenceObject |2 DRIVER |
336 | 7 | _ | |a LECTURE_SPEECH |2 ORCID |
336 | 7 | _ | |a Conference Presentation |b conf |m conf |0 PUB:(DE-HGF)6 |s 1704436025_29017 |2 PUB:(DE-HGF) |x After Call |
520 | _ | _ | |a Manipulating and processing massive data sets is challenging. For the vast majority of research communities, namely those without a background in high-performance computing, the standard approach is to break the data up and analyze it in smaller chunks, an inefficient and error-prone process. The Helmholtz Analytics Toolkit (Heat) library offers a solution to this problem by providing memory-distributed and hardware-accelerated array manipulation, data analytics, and machine learning algorithms in Python. Developed jointly by three institutions of the Helmholtz Association (KIT, FZJ, DLR), Heat: (1) enables memory distribution of n-dimensional arrays; (2) adopts PyTorch as its process-local compute engine (hence supporting GPU acceleration); (3) provides memory-distributed (i.e., multi-node, multi-GPU) array operations and algorithms, optimizing asynchronous MPI communication under the hood; and (4) wraps its functionality in a NumPy- or scikit-learn-like API so that existing applications can be ported with minimal changes. In this presentation, we will provide an overview of the Heat library's features and capabilities and discuss its role in the ecosystem of distributed array computing and machine learning in Python. Additionally, we will highlight Heat's role as a platform for cross-discipline collaboration in data-intensive research, and address technical and operational challenges in Heat development. |
536 | _ | _ | |a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511) |0 G:(DE-HGF)POF4-5111 |c POF4-511 |f POF IV |x 0 |
536 | _ | _ | |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511) |0 G:(DE-HGF)POF4-5112 |c POF4-511 |f POF IV |x 1 |
536 | _ | _ | |a SLNS - SimLab Neuroscience (Helmholtz-SLNS) |0 G:(DE-Juel1)Helmholtz-SLNS |c Helmholtz-SLNS |x 2 |
700 | 1 | _ | |a Comito, Claudia |0 P:(DE-Juel1)174573 |b 1 |
700 | 1 | _ | |a Gutiérrez Hermosillo Muriedas, Juan Pedro |0 P:(DE-HGF)0 |b 2 |
700 | 1 | _ | |a Götz, Markus |0 P:(DE-HGF)0 |b 3 |
700 | 1 | _ | |a Hagemeier, Björn |0 P:(DE-Juel1)132123 |b 4 |
700 | 1 | _ | |a Knechtges, Philipp |0 P:(DE-HGF)0 |b 5 |
700 | 1 | _ | |a Krajsek, Kai |0 P:(DE-Juel1)129347 |b 6 |
700 | 1 | _ | |a Rüttgers, Alexander |0 P:(DE-HGF)0 |b 7 |
700 | 1 | _ | |a Streit, Achim |0 P:(DE-HGF)0 |b 8 |
700 | 1 | _ | |a Tarnawa, Michael |0 P:(DE-Juel1)178977 |b 9 |
856 | 4 | _ | |u https://helmholtzai-conference2023.de/program/ |
909 | C | O | |o oai:juser.fz-juelich.de:1019998 |p VDB |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)174573 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 4 |6 P:(DE-Juel1)132123 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 6 |6 P:(DE-Juel1)129347 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 9 |6 P:(DE-Juel1)178977 |
913 | 1 | _ | |a DE-HGF |b Key Technologies |l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action |1 G:(DE-HGF)POF4-510 |0 G:(DE-HGF)POF4-511 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Enabling Computational- & Data-Intensive Science and Engineering |9 G:(DE-HGF)POF4-5111 |x 0 |
913 | 1 | _ | |a DE-HGF |b Key Technologies |l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action |1 G:(DE-HGF)POF4-510 |0 G:(DE-HGF)POF4-511 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Enabling Computational- & Data-Intensive Science and Engineering |9 G:(DE-HGF)POF4-5112 |x 1 |
914 | 1 | _ | |y 2023 |
920 | _ | _ | |l yes |
920 | 1 | _ | |0 I:(DE-Juel1)JSC-20090406 |k JSC |l Jülich Supercomputing Center |x 0 |
980 | _ | _ | |a conf |
980 | _ | _ | |a VDB |
980 | _ | _ | |a I:(DE-Juel1)JSC-20090406 |
980 | _ | _ | |a UNRESTRICTED |