001     1028952
005     20240722202104.0
024 7 _ |a 10.34734/FZJ-2024-04892
|2 datacite_doi
037 _ _ |a FZJ-2024-04892
041 _ _ |a English
100 1 _ |a Saglam, Berk
|0 P:(DE-Juel1)200201
|b 0
|e Corresponding author
|u fzj
245 _ _ |a Heterogeneous Memory Aware Prefetching on High Performance Arm Processors
|f - 2024-04-18
260 _ _ |c 2024
300 _ _ |a 142
336 7 _ |a Output Types/Supervised Student Publication
|2 DataCite
336 7 _ |a Thesis
|0 2
|2 EndNote
336 7 _ |a MASTERSTHESIS
|2 BibTeX
336 7 _ |a masterThesis
|2 DRIVER
336 7 _ |a Master Thesis
|b master
|m master
|0 PUB:(DE-HGF)19
|s 1721622217_29528
|2 PUB:(DE-HGF)
336 7 _ |a SUPERVISED_STUDENT_PUBLICATION
|2 ORCID
502 _ _ |a Masterarbeit, Rheinische Friedrich-Wilhelms-Universität Bonn, 2024
|c Rheinische Friedrich-Wilhelms-Universität Bonn
|b Masterarbeit
|d 2024
|o 2024-04-18
520 _ _ |a Modern computing often sees up to 80% of computation time spent on data retrieval, emphasizing the importance of prefetching for enhancing CPU data delivery speeds by moving data from slower storage to faster caches. Balancing timeliness and aggressiveness is crucial for reducing access times. The heterogeneous memory types considered here, HBM2 and DDR5, serve different roles due to their bandwidth and capacity trade-offs, underscoring the need for balanced memory management and memory awareness while prefetching. This work focuses on developing prefetching strategies for heterogeneous memory configurations in high-performance Arm processors, targeting a system architecture comprising 20 cores, with 16 cores dedicated to HBM2 and 4 cores dedicated to DDR5 memory. The primary objective is to reduce latency and improve system performance by introducing two optimization strategies for prefetching. These strategies balance timeliness and aggressiveness by adaptively tuning the prefetch degree and distance. They adapt dynamically to the specific memory type and available bandwidth, taking prefetch accuracy into account. The prefetchers are integrated with the L2 cache, and their performance is assessed through Gem5 simulations. These evaluations compare the effectiveness of the adaptive optimization strategies for both Stream and PC-based Stride Prefetchers, using the Arm Neoverse V1 as the computational model. Findings reveal that adaptive prefetching boosts system performance, notably with HBM2 and DDR5 memory, while facing memory contention on DDR5. This research advances prefetching strategies through an understanding of heterogeneous memory and advocates further exploration to enhance the efficiency and performance of high-performance computing.
536 _ _ |a 5122 - Future Computing & Big Data Systems (POF4-512)
|0 G:(DE-HGF)POF4-5122
|c POF4-512
|f POF IV
|x 0
856 4 _ |y OpenAccess
|u https://juser.fz-juelich.de/record/1028952/files/Saglam_MasterThesis_Bonn_2024.pdf
856 4 _ |y OpenAccess
|x icon
|u https://juser.fz-juelich.de/record/1028952/files/Saglam_MasterThesis_Bonn_2024.gif?subformat=icon
856 4 _ |y OpenAccess
|x icon-1440
|u https://juser.fz-juelich.de/record/1028952/files/Saglam_MasterThesis_Bonn_2024.jpg?subformat=icon-1440
856 4 _ |y OpenAccess
|x icon-180
|u https://juser.fz-juelich.de/record/1028952/files/Saglam_MasterThesis_Bonn_2024.jpg?subformat=icon-180
856 4 _ |y OpenAccess
|x icon-640
|u https://juser.fz-juelich.de/record/1028952/files/Saglam_MasterThesis_Bonn_2024.jpg?subformat=icon-640
909 C O |o oai:juser.fz-juelich.de:1028952
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)200201
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-512
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Supercomputing & Big Data Infrastructures
|9 G:(DE-HGF)POF4-5122
|x 0
914 1 _ |y 2024
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a master
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

