Scalasca support for MPI+OpenMP parallel applications on large-scale HPC systems based on Intel Xeon Phi

Wylie, Brian J. N.; Frings, Wolfgang
doi:10.1145/2484762.2484777
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@INPROCEEDINGS{Wylie:136584,
      author       = {Wylie, Brian J. N. and Frings, Wolfgang},
      title        = {{S}calasca support for {MPI}+{O}pen{MP} parallel
                      applications on large-scale {HPC} systems based on {I}ntel
                      {X}eon {P}hi},
      address      = {New York, New York, USA},
      publisher    = {ACM Press},
      reportid     = {FZJ-2013-03370},
      pages        = {8},
      year         = {2013},
      comment      = {XSEDE '13 Proceedings of the Conference on Extreme Science
                      and Engineering Discovery Environment: Gateway to Discovery},
      booktitle     = {XSEDE '13 Proceedings of the
                       Conference on Extreme Science and
                       Engineering Discovery Environment:
                       Gateway to Discovery},
      abstract     = {Intel Xeon Phi coprocessors based on the Many Integrated
                      Core (MIC) architecture are starting to appear in HPC
                      systems, with Stampede being a prominent example available
                      within the XSEDE cyber-infrastructure. Porting MPI and
                      OpenMP applications to such systems is often no more than
                      simple recompilation; however, execution performance needs
                      to be carefully analyzed and tuned to effectively exploit
                      their unique capabilities. For performance measurement and
                      analysis tools, the variety of execution modes needs to be
                      supported in a consistent and convenient manner, and
                      especially execution configurations involving large numbers
                      of compute nodes each with several multicore host processors
                      and many-core coprocessors. Early experience using the
                      open-source Scalasca toolset for runtime summarization and
                      automatic trace analysis with the NPB BT-MZ MPI+OpenMP
                      parallel application on Stampede is reported, along with
                      discussion of ongoing and future work.},
      month         = {Jul},
      date          = {2013-07-22},
      organization  = {Conference on Extreme Science and
                       Engineering Discovery Environment:
                       Gateway to Discovery, San Diego,
                       California (USA), 22 Jul 2013 - 25 Jul
                       2013},
      cin          = {JSC},
      cid          = {I:(DE-Juel1)JSC-20090406},
      pnm          = {411 - Computational Science and Mathematical Methods
                      (POF2-411) / DEEP - Dynamical Exascale Entry Platform
                      (287530) / ATMLPP - ATML Parallel Performance (ATMLPP)},
      pid          = {G:(DE-HGF)POF2-411 / G:(EU-Grant)287530 /
                      G:(DE-Juel-1)ATMLPP},
      typ          = {PUB:(DE-HGF)8},
      doi          = {10.1145/2484762.2484777},
      url          = {https://juser.fz-juelich.de/record/136584},
}