Scalable timestamp synchronization for event traces of message-passing applications

Becker, D.; Wolf, F.; Rabenseifner, R.; Linford, J.

doi:10.1016/j.parco.2008.12.012

Marc 21

001			6608
005			20250314084056.0
024	7	_	\|2 DOI \|a 10.1016/j.parco.2008.12.012
024	7	_	\|2 WOS \|a WOS:000272962600004
037	_	_	\|a PreJuSER-6608
041	_	_	\|a eng
082	_	_	\|a 004
084	_	_	\|2 WoS \|a Computer Science, Theory & Methods
100	1	_	\|a Becker, D. \|b 0 \|u FZJ \|0 P:(DE-Juel1)VDB62975
245	_	_	\|a Scalable timestamp synchronization for event traces of message-passing applications
260	_	_	\|a Amsterdam [u.a.] \|b North-Holland, Elsevier Science \|c 2009
300	_	_	\|a 595 - 607
336	7	_	\|a Journal Article \|0 PUB:(DE-HGF)16 \|2 PUB:(DE-HGF)
336	7	_	\|a Output Types/Journal article \|2 DataCite
336	7	_	\|a Journal Article \|0 0 \|2 EndNote
336	7	_	\|a ARTICLE \|2 BibTeX
336	7	_	\|a JOURNAL_ARTICLE \|2 ORCID
336	7	_	\|a article \|2 DRIVER
440	_	0	\|a Parallel Computing \|x 0167-8191 \|0 12681 \|y 12 \|v 35
500	_	_	\|a Record converted from VDB: 12.11.2012
520	_	_	\|a Event traces are helpful in understanding the performance behavior of message-passing applications since they allow the in-depth analysis of communication and synchronization patterns. However, the absence of synchronized clocks may render the analysis ineffective because inaccurate relative event timings may misrepresent the logical event order and lead to errors when quantifying the impact of certain behaviors. Although linear offset interpolation can restore consistency to some degree, time-dependent drifts and other inaccuracies may still disarrange the original succession of events, especially during longer runs. The controlled logical clock algorithm accounts for such violations in point-to-point communication by shifting message events in time as much as needed while trying to preserve the length of local intervals. In this article, we describe how the controlled logical clock is extended to collective communication to enable the correction of realistic message-passing traces. We present a parallel version of the algorithm that scales to more than a thousand processes and evaluate its accuracy by showing that it eliminates inconsistent inter-process timings while preserving the length of local intervals. (C) 2009 Elsevier B.V. All rights reserved.
536	_	_	\|a Scientific Computing \|c P41 \|2 G:(DE-HGF) \|0 G:(DE-Juel1)FUEK411 \|x 0
536	_	_	\|0 G:(DE-Juel-1)ATMLPP \|a ATMLPP - ATML Parallel Performance (ATMLPP) \|c ATMLPP \|x 1
588	_	_	\|a Dataset connected to Web of Science
650	_	7	\|a J \|2 WoSType
653	2	0	\|2 Author \|a Performance analysis
653	2	0	\|2 Author \|a Event tracing
653	2	0	\|2 Author \|a Clock synchronization
700	1	_	\|a Rabenseifner, R. \|b 1 \|0 P:(DE-HGF)0
700	1	_	\|a Wolf, F. \|b 2 \|u FZJ \|0 P:(DE-Juel1)VDB1927
700	1	_	\|a Linford, J. \|b 3 \|0 P:(DE-HGF)0
773	_	_	\|a 10.1016/j.parco.2008.12.012 \|g Vol. 35, p. 595 - 607 \|p 595 - 607 \|q 35<595 - 607 \|0 PERI:(DE-600)1466340-5 \|t Parallel computing \|v 35 \|y 2009 \|x 0167-8191
856	7	_	\|u http://dx.doi.org/10.1016/j.parco.2008.12.012
909	C	O	\|o oai:juser.fz-juelich.de:6608 \|p VDB
913	1	_	\|k P41 \|v Scientific Computing \|l Supercomputing \|b Schlüsseltechnologien \|0 G:(DE-Juel1)FUEK411 \|x 0
914	1	_	\|y 2009
915	_	_	\|0 StatID:(DE-HGF)0010 \|a JCR/ISI refereed
920	1	_	\|0 I:(DE-Juel1)JSC-20090406 \|k JSC \|l Jülich Supercomputing Centre \|g JSC \|x 0
920	1	_	\|0 I:(DE-82)080012_20140620 \|k JARA-HPC \|l Jülich Aachen Research Alliance - High-Performance Computing \|g JARA \|x 1
970	_	_	\|a VDB:(DE-Juel1)114965
980	_	_	\|a VDB
980	_	_	\|a ConvertedRecord
980	_	_	\|a journal
980	_	_	\|a I:(DE-Juel1)JSC-20090406
980	_	_	\|a I:(DE-82)080012_20140620
980	_	_	\|a UNRESTRICTED
981	_	_	\|a I:(DE-Juel1)VDB1346
