001015018 001__ 1015018
001015018 005__ 20240116084320.0
001015018 0247_ $$2doi$$a10.1109/TNNLS.2023.3309735
001015018 0247_ $$2ISSN$$a2162-237X
001015018 0247_ $$2ISSN$$a2162-2388
001015018 0247_ $$2datacite_doi$$a10.34734/FZJ-2023-03545
001015018 0247_ $$2pmid$$a37721884
001015018 0247_ $$2WOS$$aWOS:001071988900001
001015018 037__ $$aFZJ-2023-03545
001015018 082__ $$a004
001015018 1001_ $$00000-0002-1858-9920$$aMoreno-Álvarez, Sergio$$b0
001015018 245__ $$aEnhancing Distributed Neural Network Training Through Node-Based Communications
001015018 260__ $$a[New York, NY]$$bIEEE$$c2023
001015018 3367_ $$2DRIVER$$aarticle
001015018 3367_ $$2DataCite$$aOutput Types/Journal article
001015018 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1701755190_15037
001015018 3367_ $$2BibTeX$$aARTICLE
001015018 3367_ $$2ORCID$$aJOURNAL_ARTICLE
001015018 3367_ $$00$$2EndNote$$aJournal Article
001015018 520__ $$aThe amount of data needed to effectively train modern deep neural architectures has grown significantly, leading to increased computational requirements. These intensive computations are tackled by the combination of latest-generation computing resources, such as accelerators, or classic processing units. Nevertheless, gradient communication remains the major bottleneck, hindering efficiency despite the runtime improvements obtained through data parallelism strategies. Data parallelism involves all processes in a global exchange of a potentially large amount of data, which may prevent the desired speedup from being achieved and introduce noticeable delays or bottlenecks. As a result, communication latency issues pose a significant challenge that profoundly impacts performance on distributed platforms. This research presents node-based optimization steps to significantly reduce the gradient exchange between model replicas whilst ensuring model convergence. The proposal serves as a versatile communication scheme, suitable for integration into a wide range of general-purpose deep neural network (DNN) algorithms. The optimization takes into consideration the specific location of each replica within the platform. To demonstrate its effectiveness, different neural network approaches and datasets with disjoint properties are used. In addition, multiple types of applications are considered to demonstrate the robustness and versatility of our proposal. The experimental results show a global training time reduction whilst slightly improving accuracy.
001015018 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001015018 588__ $$aDataset connected to CrossRef, Journals: juser.fz-juelich.de
001015018 7001_ $$00000-0003-1030-3729$$aPaoletti, Mercedes E.$$b1
001015018 7001_ $$0P:(DE-Juel1)171343$$aCavallaro, Gabriele$$b2
001015018 7001_ $$00000-0001-6701-961X$$aHaut, Juan M.$$b3
001015018 773__ $$0PERI:(DE-600)2644189-5$$a10.1109/TNNLS.2023.3309735$$gp. 1 - 15$$p1 - 15$$tIEEE transactions on neural networks and learning systems$$v35$$x2162-237X$$y2023
001015018 8564_ $$uhttps://juser.fz-juelich.de/record/1015018/files/Sergio_Moreno_Alvarez_IEEE_TNNLS_2023.pdf$$yOpenAccess
001015018 909CO $$ooai:juser.fz-juelich.de:1015018$$pdnbdelivery$$pdriver$$pVDB$$popen_access$$popenaire
001015018 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)171343$$aForschungszentrum Jülich$$b2$$kFZJ
001015018 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001015018 9141_ $$y2023
001015018 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2022-11-19
001015018 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2022-11-19
001015018 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001015018 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2023-10-26
001015018 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2023-10-26
001015018 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2023-10-26
001015018 915__ $$0StatID:(DE-HGF)1160$$2StatID$$aDBCoverage$$bCurrent Contents - Engineering, Computing and Technology$$d2023-10-26
001015018 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bIEEE T NEUR NET LEAR : 2022$$d2023-10-26
001015018 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2023-10-26
001015018 915__ $$0StatID:(DE-HGF)0600$$2StatID$$aDBCoverage$$bEbsco Academic Search$$d2023-10-26
001015018 915__ $$0StatID:(DE-HGF)0020$$2StatID$$aNo Peer Review$$bASC$$d2023-10-26
001015018 915__ $$0StatID:(DE-HGF)9910$$2StatID$$aIF >= 10$$bIEEE T NEUR NET LEAR : 2022$$d2023-10-26
001015018 920__ $$lyes
001015018 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001015018 980__ $$ajournal
001015018 980__ $$aVDB
001015018 980__ $$aUNRESTRICTED
001015018 980__ $$aI:(DE-Juel1)JSC-20090406
001015018 9801_ $$aFullTexts