Assessing the State of Autovectorization Support based on SVE

Brank, Bine; Pleiter, Dirk

doi:10.1109/CLUSTER51413.2022.00073

Marc 21

001			911147
005			20230224084249.0
024	7	_	\|a 10.1109/CLUSTER51413.2022.00073 \|2 doi
024	7	_	\|a 2128/32799 \|2 Handle
024	7	_	\|a WOS:000920273100058 \|2 WOS
037	_	_	\|a FZJ-2022-04463
100	1	_	\|a Brank, Bine \|0 P:(DE-Juel1)174207 \|b 0 \|e Corresponding author
111	2	_	\|a 2022 IEEE International Conference on Cluster Computing (CLUSTER) \|c Heidelberg \|d 2022-09-05 - 2022-09-08 \|w Germany
245	_	_	\|a Assessing the State of Autovectorization Support based on SVE
260	_	_	\|c 2022 \|b IEEE
300	_	_	\|a 556–562
336	7	_	\|a CONFERENCE_PAPER \|2 ORCID
336	7	_	\|a Conference Paper \|0 33 \|2 EndNote
336	7	_	\|a INPROCEEDINGS \|2 BibTeX
336	7	_	\|a conferenceObject \|2 DRIVER
336	7	_	\|a Output Types/Conference Paper \|2 DataCite
336	7	_	\|a Contribution to a conference proceedings \|b contrib \|m contrib \|0 PUB:(DE-HGF)8 \|s 1669378185_20905 \|2 PUB:(DE-HGF)
520	_	_	\|a SIMD instructions, which operate on a tuple of data elements in each clock cycle, have become widespread in modern processor architectures. In particular, processors for high-performance computing (HPC) systems rely on this additional level of parallelism to reach a high throughput of arithmetic operations. Leveraging these SIMD instructions can still be challenging for application software developers. A compiler technique called auto-vectorization has made this challenge simpler. In this paper, we explore the current auto-vectorization capabilities of state-of-the-art compilers targeting SVE, a recent extension of the Arm instruction set architecture. We measure the performance gains on a recent processor architecture supporting SVE, namely the Fujitsu A64FX processor.
536	_	_	\|a 5122 - Future Computing & Big Data Systems (POF4-512) \|0 G:(DE-HGF)POF4-5122 \|c POF4-512 \|f POF IV \|x 0
536	_	_	\|a Mont-Blanc 2020 - Mont-Blanc 2020, European scalable, modular and power efficient HPC processor (779877) \|0 G:(EU-Grant)779877 \|c 779877 \|f H2020-ICT-2017-1 \|x 1
588	_	_	\|a Dataset connected to CrossRef Conference
700	1	_	\|a Pleiter, Dirk \|0 P:(DE-HGF)0 \|b 1
773	_	_	\|a 10.1109/CLUSTER51413.2022.00073
856	4	_	\|u https://juser.fz-juelich.de/record/911147/files/EAHPC_2022_SVE_Vectorisation.pdf \|y OpenAccess
909	C	O	\|o oai:juser.fz-juelich.de:911147 \|p openaire \|p open_access \|p driver \|p VDB \|p ec_fundedresources \|p dnbdelivery
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 0 \|6 P:(DE-Juel1)174207
913	1	_	\|a DE-HGF \|b Key Technologies \|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action \|1 G:(DE-HGF)POF4-510 \|0 G:(DE-HGF)POF4-512 \|3 G:(DE-HGF)POF4 \|2 G:(DE-HGF)POF4-500 \|4 G:(DE-HGF)POF \|v Supercomputing & Big Data Infrastructures \|9 G:(DE-HGF)POF4-5122 \|x 0
914	1	_	\|y 2022
915	_	_	\|a OpenAccess \|0 StatID:(DE-HGF)0510 \|2 StatID
920	1	_	\|0 I:(DE-Juel1)JSC-20090406 \|k JSC \|l Jülich Supercomputing Center \|x 0
980	_	_	\|a contrib
980	_	_	\|a VDB
980	_	_	\|a UNRESTRICTED
980	_	_	\|a I:(DE-Juel1)JSC-20090406
980	1	_	\|a FullTexts
