Contribution to a conference proceedings/Contribution to a book FZJ-2016-04722

Extreme-scaling applications en route to exascale


2016
ACM Press New York

Proceedings of the Exascale Applications and Software Conference 2016
Exascale Applications and Software Conference 2016, EASC'16, Stockholm, Sweden, 26 Apr 2016 - 29 Apr 2016
New York : ACM Press, 10 pp. [10.1145/2938615.2938616]

Please use a persistent id in citations: doi:10.1145/2938615.2938616

Abstract: Feedback from the previous year's very successful workshop motivated the organisation of a three-day workshop from 1 to 3 February 2016, during which the 28-rack JUQUEEN BlueGene/Q system with 458 752 cores was reserved for over 50 hours. Eight international code teams were selected to use this opportunity to investigate and improve their application scalability, assisted by staff from the JSC Simulation Laboratories and Cross-Sectional Teams. Ultimately, seven teams had codes run successfully on the full JUQUEEN system. The strong scalability demonstrated by Code Saturne and Seven-League Hydro, both using 4 OpenMP threads per MPI process with 16 MPI processes on each compute node for a total of 1 835 008 threads, qualified them for High-Q Club membership. Existing members CIAO and iFETI were able to show that they had additional solvers which also scaled acceptably. Furthermore, large-scale in-situ interactive visualisation was demonstrated with a CIAO simulation using 458 752 MPI processes running on 28 racks, coupled via JUSITU to VisIt. The two adaptive mesh refinement utilities, ICI and p4est, showed that they could scale to run with 458 752 and 917 504 MPI ranks respectively, but both encountered problems loading large meshes. Parallel file I/O issues also hindered large-scale executions of PFLOTRAN. Poor performance of a NEST import module, which loaded and connected 1.9 TiB of neuron and synapse data, was tracked down to an internal data-structure mismatch with the HDF5 file objects that prevented use of MPI collective file reading; once rectified, this is expected to enable large-scale neuronal network simulations.

Comparative analysis is provided against the 25 codes in the High-Q Club at the start of 2016, which include five codes that qualified from the previous workshop. Despite more mixed results, we learnt more about application file I/O limitations and inefficiencies, which continue to be the primary inhibitor of large-scale simulations.
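For context on the figures quoted above: JUQUEEN's 28 racks comprise 28 672 compute nodes with 16 cores each (458 752 cores in total), so 16 MPI processes per node with 4 OpenMP threads each amounts to 28 672 × 64 = 1 835 008 threads, while two MPI ranks per core gives 917 504 ranks. The NEST fix concerns collective MPI-IO reads through HDF5; the following C sketch illustrates only the general pattern of collective HDF5 reading and is not taken from the NEST import module (file name, dataset name and the 1-D decomposition are hypothetical):

    #include <stdlib.h>
    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* File access property list: open the file with the MPI-IO driver. */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        hid_t file = H5Fopen("neurons.h5", H5F_ACC_RDONLY, fapl);

        hid_t dset   = H5Dopen2(file, "/positions", H5P_DEFAULT);
        hid_t fspace = H5Dget_space(dset);
        hsize_t total;
        H5Sget_simple_extent_dims(fspace, &total, NULL);

        /* Each rank selects a disjoint, contiguous slab of the 1-D dataset. */
        hsize_t count = total / size;
        hsize_t start = (hsize_t)rank * count;
        if (rank == size - 1)
            count = total - start;          /* last rank takes the remainder */
        H5Sselect_hyperslab(fspace, H5S_SELECT_SET, &start, NULL, &count, NULL);
        hid_t mspace = H5Screate_simple(1, &count, NULL);

        /* Transfer property list: request collective reads, the mode that the
           data-structure mismatch described in the abstract prevented. */
        hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

        double *buf = malloc(count * sizeof(double));
        H5Dread(dset, H5T_NATIVE_DOUBLE, mspace, fspace, dxpl, buf);

        free(buf);
        H5Pclose(dxpl);
        H5Sclose(mspace);
        H5Sclose(fspace);
        H5Dclose(dset);
        H5Fclose(file);
        H5Pclose(fapl);
        MPI_Finalize();
        return 0;
    }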


Contributing Institute(s):
  1. Jülich Supercomputing Centre (JSC)
Research Program(s):
  1. 511 - Computational Science and Mathematical Methods (POF3-511)
  2. ATMLPP - ATML Parallel Performance (ATMLPP)
  3. ATMLAO - ATML Application Optimization and User Service Tools (ATMLAO)

Appears in the scientific report 2016
Database coverage:
OpenAccess

The record appears in these collections:
Document types > Events > Contributions to a conference proceedings
Document types > Books > Contribution to a book
Workflow collections > Public records
Institute Collections > JSC
Publications database
Open Access

 Record created 2016-09-15, last modified 2025-03-17