001048916 001__ 1048916
001048916 005__ 20251219202230.0
001048916 020__ $$a978-3-032-07611-3 (print)
001048916 020__ $$a978-3-032-07612-0 (electronic)
001048916 0247_ $$2doi$$a10.1007/978-3-032-07612-0_12
001048916 0247_ $$2ISSN$$a0302-9743
001048916 0247_ $$2ISSN$$a1611-3349
001048916 0247_ $$2datacite_doi$$a10.34734/FZJ-2025-05015
001048916 037__ $$aFZJ-2025-05015
001048916 041__ $$aEnglish
001048916 1001_ $$0P:(DE-Juel1)180916$$aAach, Marcel$$b0$$eCorresponding author$$ufzj
001048916 1112_ $$aISC High Performance 2025$$cHamburg$$d2025-06-10 - 2025-06-13$$gISC 2025$$wGermany
001048916 245__ $$aOptimizing Edge AI Models on HPC Systems with the Edge in the Loop
001048916 260__ $$aCham$$bSpringer Nature Switzerland$$c2026
001048916 29510 $$aHigh Performance Computing
001048916 300__ $$a148 - 161
001048916 3367_ $$2ORCID$$aCONFERENCE_PAPER
001048916 3367_ $$033$$2EndNote$$aConference Paper
001048916 3367_ $$2BibTeX$$aINPROCEEDINGS
001048916 3367_ $$2DRIVER$$aconferenceObject
001048916 3367_ $$2DataCite$$aOutput Types/Conference Paper
001048916 3367_ $$0PUB:(DE-HGF)8$$2PUB:(DE-HGF)$$aContribution to a conference proceedings$$bcontrib$$mcontrib$$s1766155760_16742
001048916 3367_ $$0PUB:(DE-HGF)7$$2PUB:(DE-HGF)$$aContribution to a book$$mcontb
001048916 4900_ $$aLecture Notes in Computer Science$$v16091
001048916 520__ $$aArtificial Intelligence (AI) and Machine Learning (ML) models deployed on edge devices, e.g., for quality control in Additive Manufacturing (AM), are frequently small in size. Such models usually have to deliver highly accurate results within a short time frame. Methods that are commonly employed in literature start out with larger trained models and try to reduce their memory and latency footprint by structural pruning, knowledge distillation, or quantization. It is, however, also possible to leverage hardware-aware Neural Architecture Search (NAS), an approach that seeks to systematically explore the architecture space to find optimized configurations. In this study, a hardware-aware NAS workflow is introduced that couples an edge device located in Belgium with a powerful High-Performance Computing (HPC) system in Germany, to train possible architecture candidates as fast as possible while performing real-time latency measurements on the target hardware. The approach is verified on a use case in the AM domain, based on the open RAISE-LPBF dataset, achieving ≈ 8.8 times faster inference speed while simultaneously enhancing model quality by a factor of ≈ 1.35, compared to a human-designed baseline.
001048916 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001048916 536__ $$0G:(EU-Grant)951733$$aRAISE - Research on AI- and Simulation-Based Engineering at Exascale (951733)$$c951733$$fH2020-INFRAEDI-2019-1$$x1
001048916 588__ $$aDataset connected to CrossRef Book Series, Journals: juser.fz-juelich.de
001048916 7001_ $$00000-0003-3271-2398$$aBlanc, Cyril$$b1
001048916 7001_ $$0P:(DE-Juel1)165948$$aLintermann, Andreas$$b2
001048916 7001_ $$00000-0001-9116-6986$$aDe Grave, Kurt$$b3
001048916 770__ $$z978-3-032-07611-3
001048916 773__ $$a10.1007/978-3-032-07612-0_12
001048916 8564_ $$uhttps://juser.fz-juelich.de/record/1048916/files/2505.19995v1.pdf$$yOpenAccess
001048916 909CO $$ooai:juser.fz-juelich.de:1048916$$pec_fundedresources$$pVDB$$pdriver$$pdnbdelivery$$popen_access$$popenaire
001048916 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180916$$aForschungszentrum Jülich$$b0$$kFZJ
001048916 9101_ $$0I:(DE-HGF)0$$60000-0003-3271-2398$$aExternal Institute$$b1$$kExtern
001048916 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)165948$$aForschungszentrum Jülich$$b2$$kFZJ
001048916 9101_ $$0I:(DE-HGF)0$$60000-0001-9116-6986$$aExternal Institute$$b3$$kExtern
001048916 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001048916 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2024-12-28
001048916 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001048916 915__ $$0StatID:(DE-HGF)0420$$2StatID$$aNationallizenz$$d2024-12-28$$wger
001048916 920__ $$lyes
001048916 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001048916 980__ $$acontrib
001048916 980__ $$aVDB
001048916 980__ $$aUNRESTRICTED
001048916 980__ $$acontb
001048916 980__ $$aI:(DE-Juel1)JSC-20090406
001048916 9801_ $$aFullTexts