000811621 001__ 811621
000811621 005__ 20250317091730.0
000811621 0247_ $$2URN$$aurn:nbn:de:0001-2016062000
000811621 0247_ $$2Handle$$a2128/11967
000811621 0247_ $$2ISSN$$a1868-8489
000811621 020__ $$a978-3-95806-152-1
000811621 037__ $$aFZJ-2016-04033
000811621 041__ $$aEnglish
000811621 1001_ $$0P:(DE-Juel1)132108$$aFrings, Wolfgang$$b0$$eCorresponding author$$gmale$$ufzj
000811621 245__ $$aEfficient Task-Local I/O Operations of Massively Parallel Applications$$f- 2016-04-26
000811621 260__ $$aJülich$$bForschungszentrum Jülich GmbH Zentralbibliothek, Verlag$$c2016
000811621 300__ $$axiv, 140 S.
000811621 3367_ $$2DataCite$$aOutput Types/Dissertation
000811621 3367_ $$2ORCID$$aDISSERTATION
000811621 3367_ $$2BibTeX$$aPHDTHESIS
000811621 3367_ $$02$$2EndNote$$aThesis
000811621 3367_ $$0PUB:(DE-HGF)11$$2PUB:(DE-HGF)$$aDissertation / PhD Thesis$$bphd$$mphd$$s1469605055_20814
000811621 3367_ $$2DRIVER$$adoctoralThesis
000811621 4900_ $$aSchriften des Forschungszentrums Jülich. IAS Series$$v30
000811621 502__ $$aRWTH Aachen, Diss., 2016$$bDr.$$cRWTH Aachen$$d2016$$o2016-04-26
000811621 520__ $$aApplications on current large-scale HPC systems use enormous numbers of processing elements for their computation and have access to large amounts of main memory for their data. Nevertheless, they still need file-system access to maintain program and application data persistently. Characteristic I/O patterns that produce a high load on the file system often occur during access to checkpoint and restart files, which have to be stored frequently to allow the application to be restarted after program termination or system failure. On large-scale HPC systems with distributed memory, each application task will often perform such I/O individually by creating task-local file objects on the file system. At large scale, these I/O patterns impose substantial stress on the metadata-management components of the I/O subsystem. For example, the simultaneous creation of thousands of task-local files in the same directory can cause delays of several minutes. Similarly, at the startup of dynamically linked applications, metadata contention occurs while searching for library files and induces a comparably high metadata load on the file system. Even mid-scale applications can, in such load scenarios, suffer startup delays of ten minutes or more. Therefore, dynamic linking and loading is currently not used on large HPC systems, although dynamic linking has many advantages for managing large code bases. The reason for these limitations is that POSIX I/O and the dynamic loader are implemented as serial components of the operating system and do not take advantage of the parallel nature of the I/O operations. To avoid the above bottlenecks, this work describes two novel approaches for integrating locality awareness (e.g., through aggregation or caching) into the serial I/O operations of parallel applications.
The underlying methods are implemented in two tools, $\textit{SIONlib}$ and $\textit{Spindle}$, which exploit knowledge of application parallelism to coordinate access to file-system objects. In addition, the applied methods also use knowledge of the underlying I/O-subsystem structure, the parallel file-system configuration, and the network between the HPC system and the I/O system to optimize application I/O. Both tools add layers between the parallel application and the POSIX-based standard interfaces of the operating system for I/O and dynamic loading, eliminating the need to modify the underlying system software. SIONlib is already used in several applications, including PEPC, muphi, and MP2C, to implement efficient checkpointing. In addition, SIONlib is integrated into the performance-analysis tools Scalasca and Score-P to store and read trace data efficiently. Recent benchmarks on the Blue Gene/Q in Jülich demonstrate that SIONlib solves the metadata problem at large scale, running efficiently on up to 1.8 million tasks while maintaining high I/O bandwidths of 60-80% of the file-system peak with negligible file-creation time. The scalability of Spindle was demonstrated by running the Pynamic benchmark, a proxy benchmark for a real application, at large scale on a cluster at Lawrence Livermore National Laboratory. The results show that the startup of dynamically linked applications is now feasible on more than 15,000 tasks, while the overhead of Spindle remains nearly constant and low. With SIONlib and Spindle, this work demonstrates how the scalability of operating-system components can be improved without modifying them and without changing the I/O patterns of applications. In this way, SIONlib and Spindle represent prototype implementations of functionality needed by next-generation runtime systems.
000811621 536__ $$0G:(DE-HGF)POF3-511$$a511 - Computational Science and Mathematical Methods (POF3-511)$$cPOF3-511$$fPOF III$$x0
000811621 536__ $$0G:(DE-Juel-1)ATMLAO$$aATMLAO - ATML Application Optimization and User Service Tools (ATMLAO)$$cATMLAO$$x1
000811621 650_7 $$xDiss.
000811621 8564_ $$uhttps://juser.fz-juelich.de/record/811621/files/IAS_Series_30.pdf$$yOpenAccess
000811621 8564_ $$uhttps://juser.fz-juelich.de/record/811621/files/IAS_Series_30.gif?subformat=icon$$xicon$$yOpenAccess
000811621 8564_ $$uhttps://juser.fz-juelich.de/record/811621/files/IAS_Series_30.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
000811621 8564_ $$uhttps://juser.fz-juelich.de/record/811621/files/IAS_Series_30.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
000811621 8564_ $$uhttps://juser.fz-juelich.de/record/811621/files/IAS_Series_30.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
000811621 8564_ $$uhttps://juser.fz-juelich.de/record/811621/files/IAS_Series_30.pdf?subformat=pdfa$$xpdfa$$yOpenAccess
000811621 909CO $$ooai:juser.fz-juelich.de:811621$$pdnbdelivery$$pVDB$$pdriver$$purn$$popen_access$$popenaire
000811621 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000811621 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0
000811621 9141_ $$y2016
000811621 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132108$$aForschungszentrum Jülich$$b0$$kFZJ
000811621 9131_ $$0G:(DE-HGF)POF3-511$$1G:(DE-HGF)POF3-510$$2G:(DE-HGF)POF3-500$$3G:(DE-HGF)POF3$$4G:(DE-HGF)POF$$aDE-HGF$$bKey Technologies$$lSupercomputing & Big Data$$vComputational Science and Mathematical Methods$$x0
000811621 920__ $$lyes
000811621 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000811621 980__ $$aphd
000811621 980__ $$aVDB
000811621 980__ $$aUNRESTRICTED
000811621 980__ $$aI:(DE-Juel1)JSC-20090406
000811621 9801_ $$aFullTexts