001     1015018
005     20240116084320.0
024 7 _ |a 10.1109/TNNLS.2023.3309735
|2 doi
024 7 _ |a 2162-237X
|2 ISSN
024 7 _ |a 2162-2388
|2 ISSN
024 7 _ |a 10.34734/FZJ-2023-03545
|2 datacite_doi
024 7 _ |a 37721884
|2 pmid
024 7 _ |a WOS:001071988900001
|2 WOS
037 _ _ |a FZJ-2023-03545
082 _ _ |a 004
100 1 _ |a Moreno-Álvarez, Sergio
|0 0000-0002-1858-9920
|b 0
245 _ _ |a Enhancing Distributed Neural Network Training Through Node-Based Communications
260 _ _ |a [New York, NY]
|c 2023
|b IEEE
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1701755190_15037
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a The amount of data needed to effectively train modern deep neural architectures has grown significantly, leading to increased computational requirements. These intensive computations are tackled by the combination of last generation computing resources, such as accelerators, or classic processing units. Nevertheless, gradient communication remains as the major bottleneck, hindering the efficiency notwithstanding the improvements in runtimes obtained through data parallelism strategies. Data parallelism involves all processes in a global exchange of potentially high amount of data, which may impede the achievement of the desired speedup and the elimination of noticeable delays or bottlenecks. As a result, communication latency issues pose a significant challenge that profoundly impacts the performance on distributed platforms. This research presents node-based optimization steps to significantly reduce the gradient exchange between model replicas whilst ensuring model convergence. The proposal serves as a versatile communication scheme, suitable for integration into a wide range of general-purpose deep neural network (DNN) algorithms. The optimization takes into consideration the specific location of each replica within the platform. To demonstrate the effectiveness, different neural network approaches and datasets with disjoint properties are used. In addition, multiple types of applications are considered to demonstrate the robustness and versatility of our proposal. The experimental results show a global training time reduction whilst slightly improving accuracy.
536 _ _ |a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5111
|c POF4-511
|f POF IV
|x 0
588 _ _ |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de
700 1 _ |a Paoletti, Mercedes E.
|0 0000-0003-1030-3729
|b 1
700 1 _ |a Cavallaro, Gabriele
|0 P:(DE-Juel1)171343
|b 2
700 1 _ |a Haut, Juan M.
|0 0000-0001-6701-961X
|b 3
773 _ _ |a 10.1109/TNNLS.2023.3309735
|g p. 1 - 15
|0 PERI:(DE-600)2644189-5
|p 1 - 15
|t IEEE transactions on neural networks and learning systems
|v 35
|y 2023
|x 2162-237X
856 4 _ |u https://juser.fz-juelich.de/record/1015018/files/Sergio_Moreno_Alvarez_IEEE_TNNLS_2023.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1015018
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)171343
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5111
|x 0
914 1 _ |y 2023
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0160
|2 StatID
|b Essential Science Indicators
|d 2022-11-19
915 _ _ |a WoS
|0 StatID:(DE-HGF)0113
|2 StatID
|b Science Citation Index Expanded
|d 2022-11-19
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Clarivate Analytics Master Journal List
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1160
|2 StatID
|b Current Contents - Engineering, Computing and Technology
|d 2023-10-26
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b IEEE T NEUR NET LEAR : 2022
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0600
|2 StatID
|b Ebsco Academic Search
|d 2023-10-26
915 _ _ |a No Peer Review
|0 StatID:(DE-HGF)0020
|2 StatID
|b ASC
|d 2023-10-26
915 _ _ |a IF >= 10
|0 StatID:(DE-HGF)9910
|2 StatID
|b IEEE T NEUR NET LEAR : 2022
|d 2023-10-26
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

