001019996 001__ 1019996
001019996 005__ 20240226075235.0
001019996 037__ $$aFZJ-2023-05811
001019996 1001_ $$0P:(DE-Juel1)174573$$aComito, Claudia$$b0$$ufzj
001019996 1112_ $$aCS & Physics Meet-Up by Lamarr & B3D$$cTU Dortmund$$d2023-11-29 - 2023-12-01$$wGermany
001019996 245__ $$aHeat: accelerating massive data processing in Python
001019996 260__ $$c2023
001019996 3367_ $$033$$2EndNote$$aConference Paper
001019996 3367_ $$2BibTeX$$aINPROCEEDINGS
001019996 3367_ $$2DRIVER$$aconferenceObject
001019996 3367_ $$2ORCID$$aCONFERENCE_POSTER
001019996 3367_ $$2DataCite$$aOutput Types/Conference Poster
001019996 3367_ $$0PUB:(DE-HGF)24$$2PUB:(DE-HGF)$$aPoster$$bposter$$mposter$$s1704433621_29017$$xOutreach
001019996 520__ $$aManipulating and processing massive data sets is challenging. In astrophysics, as in the vast majority of research communities, the standard approach involves breaking up and analyzing data in smaller chunks, a process that is both inefficient and prone to errors. The problem is exacerbated on GPUs because of the smaller available memory. Popular solutions for distributing NumPy/SciPy computations are based on task parallelism, introducing significant runtime overhead, complicating implementation, and often limiting GPU support to a single vendor. This poster illustrates an alternative based on data parallelism instead. The open-source library Heat [1, 2] builds on PyTorch and mpi4py to simplify porting of NumPy/SciPy-based code to GPUs (CUDA, ROCm), including multi-GPU, multi-node clusters. Under the hood, Heat distributes massive, memory-intensive operations over multi-node resources via MPI communication. From a user's perspective, Heat can be used seamlessly within the Python array ecosystem. Supported features: distributed (multi-GPU) I/O from shared memory; easy distribution of memory-intensive operations in existing code (e.g. matrix multiplication); interoperability within the Python array ecosystem, with Heat as a backend for massive array manipulations, statistics, signal processing, machine learning, and more; transparent parallelism, i.e. prototype on your laptop and run the same code on an HPC cluster. I will also touch upon Heat's current implementation roadmap and possible paths to collaboration. [1] https://github.com/helmholtz-analytics/heat [2] M. Götz et al., "HeAT – a Distributed and GPU-accelerated Tensor Framework for Data Analytics," 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 2020, pp. 276-287, doi: 10.1109/BigData50022.2020.9378050.
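A minimal usage sketch of the data-parallel, NumPy-like workflow the abstract describes: the array sizes, the split axis, the process count, and the script name below are illustrative assumptions, not taken from the poster, and the snippet assumes a working MPI plus Heat installation.

    # Minimal sketch of Heat's NumPy-like, data-parallel usage (illustrative assumptions,
    # not taken from the poster). Launch under MPI, e.g.:  mpirun -n 4 python heat_sketch.py
    import heat as ht

    # Large operand distributed row-wise (split=0) across all MPI processes;
    # each process only holds its local chunk in memory.
    a = ht.random.randn(40_000, 2_000, split=0)
    # Small operand kept replicated on every process (split=None).
    b = ht.random.randn(2_000, 2_000, split=None)

    # NumPy-style expressions; Heat performs the required MPI communication under the hood.
    c = a @ b            # distributed matrix multiplication
    m = c.mean(axis=0)   # distributed reduction

    if a.comm.rank == 0:  # report once, from the root process
        print(m.shape)

The same script runs unchanged on a laptop with one process or on a multi-node HPC allocation; only the MPI launch configuration differs.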
001019996 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001019996 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x1
001019996 536__ $$0G:(DE-Juel1)Helmholtz-SLNS$$aSLNS - SimLab Neuroscience (Helmholtz-SLNS)$$cHelmholtz-SLNS$$x2
001019996 7001_ $$0P:(DE-HGF)0$$aHoppe, Fabian$$b1
001019996 7001_ $$0P:(DE-HGF)0$$aGötz, Markus$$b2
001019996 7001_ $$0P:(DE-HGF)0$$aGutiérrez Hermosillo Muriedas, Juan Pedro$$b3
001019996 7001_ $$0P:(DE-Juel1)132123$$aHagemeier, Björn$$b4$$ufzj
001019996 7001_ $$0P:(DE-HGF)0$$aKnechtges, Philipp$$b5
001019996 7001_ $$0P:(DE-Juel1)129347$$aKrajsek, Kai$$b6$$ufzj
001019996 7001_ $$0P:(DE-HGF)0$$aRüttgers, Alexander$$b7
001019996 7001_ $$0P:(DE-HGF)0$$aStreit, Achim$$b8
001019996 7001_ $$0P:(DE-Juel1)178977$$aTarnawa, Michael$$b9$$ufzj
001019996 909CO $$ooai:juser.fz-juelich.de:1019996$$pVDB
001019996 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)174573$$aForschungszentrum Jülich$$b0$$kFZJ
001019996 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132123$$aForschungszentrum Jülich$$b4$$kFZJ
001019996 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)129347$$aForschungszentrum Jülich$$b6$$kFZJ
001019996 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)178977$$aForschungszentrum Jülich$$b9$$kFZJ
001019996 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001019996 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x1
001019996 9141_ $$y2023
001019996 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001019996 980__ $$aposter
001019996 980__ $$aVDB
001019996 980__ $$aI:(DE-Juel1)JSC-20090406
001019996 980__ $$aUNRESTRICTED