001     911978
005     20230310131330.0
024 7 _ |a 2128/32852
|2 Handle
037 _ _ |a FZJ-2022-05208
041 _ _ |a English
100 1 _ |a Baumann, Thomas
|0 P:(DE-Juel1)190575
|b 0
|e Corresponding author
|u fzj
111 2 _ |a 11th Parallel-in-Time Workshop
|c Marseilles
|d 2022-07-11 - 2022-07-15
|w France
245 _ _ |a Resilience in Spectral Deferred Corrections
260 _ _ |c 2022
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a CONFERENCE_POSTER
|2 ORCID
336 7 _ |a Output Types/Conference Poster
|2 DataCite
336 7 _ |a Poster
|b poster
|m poster
|0 PUB:(DE-HGF)24
|s 1669702263_13478
|2 PUB:(DE-HGF)
|x After Call
520 _ _ |a Gains in computational speed now come from using more processing units rather than faster ones. Faults in processing units, caused by numerous sources including radiation and aging, have been neglected in the past. However, the increasing size of HPC machines makes them more susceptible to such faults, and it is important to develop a resilience strategy to avoid losing millions of CPU hours. Parallel-in-time methods target the very largest computers and are hence required to come with algorithm-based fault tolerance. Here we look at spectral deferred corrections (SDC), a time-marching scheme at the heart of parallel-in-time methods such as PFASST. Due to its iterative nature, there is ample opportunity to plug in computationally inexpensive fault tolerance schemes, many of which are also easy to implement. We experimentally examine the capability of various strategies to recover from single bit flips, both for serial SDC and for a small-scale parallel-in-time version with diagonal preconditioners.
536 _ _ |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5112
|c POF4-511
|x 0
|f POF IV
536 _ _ |a DFG project 450829162 - Raum-Zeit-parallele Simulation multimodale Energiesystemen (450829162)
|0 G:(GEPRIS)450829162
|c 450829162
|x 1
536 _ _ |a TIME-X - TIME parallelisation: for eXascale computing and beyond (955701)
|0 G:(EU-Grant)955701
|c 955701
|x 2
|f H2020-JTI-EuroHPC-2019-1
700 1 _ |a Götschel, Sebastian
|0 P:(DE-HGF)0
|b 1
700 1 _ |a Lunet, Thibaut
|0 P:(DE-HGF)0
|b 2
700 1 _ |a Ruprecht, Daniel
|0 P:(DE-HGF)0
|b 3
700 1 _ |a Schöbel, Ruth
|0 P:(DE-Juel1)169281
|b 4
|u fzj
700 1 _ |a Speck, Robert
|0 P:(DE-Juel1)132268
|b 5
|u fzj
856 4 _ |u https://juser.fz-juelich.de/record/911978/files/Poster.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:911978
|p openaire
|p open_access
|p VDB
|p driver
|p ec_fundedresources
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)190575
910 1 _ |a TUHH
|0 I:(DE-HGF)0
|b 1
|6 P:(DE-HGF)0
910 1 _ |a TUHH
|0 I:(DE-HGF)0
|b 2
|6 P:(DE-HGF)0
910 1 _ |a TUHH
|0 I:(DE-HGF)0
|b 3
|6 P:(DE-HGF)0
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 4
|6 P:(DE-Juel1)169281
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 5
|6 P:(DE-Juel1)132268
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5112
|x 0
914 1 _ |y 2022
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Centre
|x 0
980 _ _ |a poster
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts