001     861601
005     20250314084118.0
020 _ _ |a 978-3-030-11987-4
024 7 _ |a 10.1007/978-3-030-11987-4_6
|2 doi
024 7 _ |a 2128/21896
|2 Handle
037 _ _ |a FZJ-2019-02051
100 1 _ |a Schlütter, Marc
|0 P:(DE-Juel1)142180
|b 0
|e Corresponding author
111 2 _ |a 11th International Workshop on Parallel Tools for High Performance Computing
|c Dresden
|d 2017-09-11 - 2017-09-12
|w Germany
245 _ _ |a SCIPHI Score-P and Cube Extensions for Intel Phi
260 _ _ |a Cham
|c 2019
|b Springer International Publishing
295 1 0 |a Tools for High Performance Computing 2017
300 _ _ |a 85-104
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1553601793_29279
|2 PUB:(DE-HGF)
336 7 _ |a Contribution to a book
|0 PUB:(DE-HGF)7
|2 PUB:(DE-HGF)
|m contb
520 _ _ |a The Knights Landing processor offers unique features with regard to memory hierarchy and vectorization capabilities. To improve tool support in these two areas, we present extensions to the Score-P measurement infrastructure and the Cube report explorer. With the Knights Landing edition, Intel introduced a new memory architecture that uses two types of memory, MCDRAM and DDR4 SDRAM. To assist the user in deciding where to place data structures, we introduce an MCDRAM candidate metric to the Cube report explorer. In addition, we track all MCDRAM allocations through the hbwmalloc interface, providing memory metrics such as leaked memory or the high-water mark on a per-region basis, as already known for the ubiquitous malloc/free. A Score-P metric plugin that records memory statistics via numastat on a per-process level enables a timeline analysis using the Vampir toolset. To get the best performance out of KNL, the large vector processing units need to be utilized effectively. The ratio between computation and data access and the vector processing unit (VPU) intensity are introduced as metrics to identify vectorization candidates on a per-region basis. The Portable Hardware Locality (hwloc) library (Broquedis et al., hwloc: a generic framework for managing hardware affinities in HPC applications, 2010 [2]) allows us to visualize the distribution of the KNL-specific performance metrics within the Cube report explorer, taking the hardware topology consisting of processor tiles and cores into account.
536 _ _ |a 511 - Computational Science and Mathematical Methods (POF3-511)
|0 G:(DE-HGF)POF3-511
|c POF3-511
|f POF III
|x 0
536 _ _ |0 G:(DE-Juel-1)ATMLPP
|a ATMLPP - ATML Parallel Performance (ATMLPP)
|c ATMLPP
|x 1
588 _ _ |a Dataset connected to CrossRef Book
700 1 _ |a Feld, Christian
|0 P:(DE-Juel1)132244
|b 1
700 1 _ |a Saviankou, Pavel
|0 P:(DE-Juel1)132249
|b 2
700 1 _ |a Knobloch, Michael
|0 P:(DE-Juel1)132163
|b 3
700 1 _ |a Hermanns, Marc-André
|0 P:(DE-Juel1)168253
|b 4
700 1 _ |a Mohr, Bernd
|0 P:(DE-Juel1)132199
|b 5
773 _ _ |a 10.1007/978-3-030-11987-4_6
856 4 _ |u https://link.springer.com/chapter/10.1007%2F978-3-030-11987-4_6
856 4 _ |u https://juser.fz-juelich.de/record/861601/files/2019_Book_ToolsForHighPerformanceComputi.pdf
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/861601/files/2019_Book_ToolsForHighPerformanceComputi.pdf?subformat=pdfa
|x pdfa
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:861601
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)142180
910 1 _ |a JSC
|0 I:(DE-HGF)0
|b 0
|6 P:(DE-Juel1)142180
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)132244
910 1 _ |a JSC
|0 I:(DE-HGF)0
|b 1
|6 P:(DE-Juel1)132244
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)132249
910 1 _ |a JSC
|0 I:(DE-HGF)0
|b 2
|6 P:(DE-Juel1)132249
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 3
|6 P:(DE-Juel1)132163
910 1 _ |a JSC
|0 I:(DE-HGF)0
|b 3
|6 P:(DE-Juel1)132163
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 4
|6 P:(DE-Juel1)168253
910 1 _ |a JSC
|0 I:(DE-HGF)0
|b 4
|6 P:(DE-Juel1)168253
910 1 _ |a JARA-HPC
|0 I:(DE-HGF)0
|b 4
|6 P:(DE-Juel1)168253
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 5
|6 P:(DE-Juel1)132199
910 1 _ |a JSC
|0 I:(DE-HGF)0
|b 5
|6 P:(DE-Juel1)132199
910 1 _ |a JARA-HPC
|0 I:(DE-HGF)0
|b 5
|6 P:(DE-Juel1)132199
913 1 _ |a DE-HGF
|b Key Technologies
|1 G:(DE-HGF)POF3-510
|0 G:(DE-HGF)POF3-511
|2 G:(DE-HGF)POF3-500
|v Computational Science and Mathematical Methods
|x 0
|4 G:(DE-HGF)POF
|3 G:(DE-HGF)POF3
|l Supercomputing & Big Data
914 1 _ |y 2019
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
920 1 _ |0 I:(DE-82)080012_20140620
|k JARA-HPC
|l JARA - HPC
|x 1
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a contb
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a I:(DE-82)080012_20140620
980 1 _ |a FullTexts

