Determining parallel application execution efficiency & scaling using the POP methodology

Garcia-Gasulla, Marta; Wylie, Brian J. N.; Mendez, Sandra; Visser, Anke
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@MISC{GarciaGasulla:1028928,
      author       = {Garcia-Gasulla, Marta and Mendez, Sandra and Visser, Anke
                      and Wylie, Brian J. N.},
      title        = {{D}etermining parallel application execution efficiency
                      \& scaling using the {POP} methodology},
      reportid     = {FZJ-2024-04872},
      year         = {2024},
      abstract     = {HPC application developers encounter significant challenges
                      getting their codes to run correctly on leadership computer
                      systems consisting of large numbers of interconnected
                      multi-socket multicore processor nodes often with attached
                      accelerator devices. They also need effective tools and
                      methods to track and assess their codes’ execution
                      performance as they aim to get ready for production on
                      current or prospective exascale computer systems. This
                      tutorial presents the methodology developed and applied over
                      several years within the EU HPC Centre of Excellence
                      Performance Optimisation and Productivity (POP). Its focus
                      is the hierarchy of execution efficiency and scaling metrics
                      that identify the most critical issues and quantify
                      potential benefits of remedies. The metrics can be readily
                      compared and determined by a variety of tools for
                      applications in any language employing standard MPI,
                      OpenMP/OpenACC and other multi-threading and offload
                      paradigms. Using their own notebook computers, tutorial
                      participants will follow exercises with widely-deployed
                      open-source tools and provided performance measurements of
                      actual HPC application executions (ranging from CFD to
                      neuroscience), preparing them to locate and diagnose
                      efficiency and scalability issues in their own parallel
                      application codes.},
      month        = {May},
      date         = {2024-05-12},
      organization = {ISC High Performance, Hamburg
                      (Germany), 12 May 2024 - 12 May 2024},
      subtyp       = {After Call},
      cin          = {JSC},
      cid          = {I:(DE-Juel1)JSC-20090406},
      pnm          = {5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs)
                      and Research Groups (POF4-511) / POP3 - Performance
                      Optimisation and Productivity 3 (101143931) / ATMLPP - ATML
                      Parallel Performance (ATMLPP)},
      pid          = {G:(DE-HGF)POF4-5112 / G:(EU-Grant)101143931 /
                      G:(DE-Juel-1)ATMLPP},
      typ          = {PUB:(DE-HGF)17},
      doi          = {10.34734/FZJ-2024-04872},
      url          = {https://juser.fz-juelich.de/record/1028928},
}