001 | 1021653 | ||
005 | 20240712112903.0 | ||
024 | 7 | _ | |a 10.48550/ARXIV.2308.01674 |2 doi |
024 | 7 | _ | |a 10.34734/FZJ-2024-00909 |2 datacite_doi |
037 | _ | _ | |a FZJ-2024-00909 |
100 | 1 | _ | |a Mayfrank, Daniel |0 P:(DE-Juel1)192151 |b 0 |u fzj |
245 | _ | _ | |a End-to-End Reinforcement Learning of Koopman Models for Economic Nonlinear Model Predictive Control |
260 | _ | _ | |c 2023 |b arXiv |
336 | 7 | _ | |a Preprint |b preprint |m preprint |0 PUB:(DE-HGF)25 |s 1706103588_7712 |2 PUB:(DE-HGF) |
336 | 7 | _ | |a WORKING_PAPER |2 ORCID |
336 | 7 | _ | |a Electronic Article |0 28 |2 EndNote |
336 | 7 | _ | |a preprint |2 DRIVER |
336 | 7 | _ | |a ARTICLE |2 BibTeX |
336 | 7 | _ | |a Output Types/Working Paper |2 DataCite |
520 | _ | _ | |a (Economic) nonlinear model predictive control ((e)NMPC) requires dynamic system models that are sufficiently accurate in all relevant state-space regions. These models must also be computationally cheap enough to ensure real-time tractability. Data-driven surrogate models for mechanistic models can be used to reduce the computational burden of (e)NMPC; however, such models are typically trained by system identification for maximum average prediction accuracy on simulation samples and perform suboptimally as part of actual (e)NMPC. We present a method for end-to-end reinforcement learning of dynamic surrogate models for optimal performance in (e)NMPC applications, resulting in predictive controllers that strike a favorable balance between control performance and computational demand. We validate our method on two applications derived from an established nonlinear continuous stirred-tank reactor model. We compare the controller performance to that of MPCs utilizing models trained by the prevailing maximum prediction accuracy paradigm, and model-free neural network controllers trained using reinforcement learning. We show that our method matches the performance of the model-free neural network controllers while consistently outperforming models derived from system identification. Additionally, we show that the MPC policies can react to changes in the control setting without retraining. |
536 | _ | _ | |a 1121 - Digitalization and Systems Technology for Flexibility Solutions (POF4-112) |0 G:(DE-HGF)POF4-1121 |c POF4-112 |f POF IV |x 0 |
536 | _ | _ | |a HDS LEE - Helmholtz School for Data Science in Life, Earth and Energy (HDS LEE) (HDS-LEE-20190612) |0 G:(DE-Juel1)HDS-LEE-20190612 |c HDS-LEE-20190612 |x 1 |
588 | _ | _ | |a Dataset connected to DataCite |
650 | _ | 7 | |a Machine Learning (cs.LG) |2 Other |
650 | _ | 7 | |a Systems and Control (eess.SY) |2 Other |
650 | _ | 7 | |a FOS: Computer and information sciences |2 Other |
650 | _ | 7 | |a FOS: Electrical engineering, electronic engineering, information engineering |2 Other |
700 | 1 | _ | |a Mitsos, Alexander |0 P:(DE-Juel1)172025 |b 1 |u fzj |
700 | 1 | _ | |a Dahmen, Manuel |0 P:(DE-Juel1)172097 |b 2 |e Corresponding author |u fzj |
773 | _ | _ | |a 10.48550/ARXIV.2308.01674 |
856 | 4 | _ | |y OpenAccess |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.pdf |
856 | 4 | _ | |y OpenAccess |x icon |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.gif?subformat=icon |
856 | 4 | _ | |y OpenAccess |x icon-1440 |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.jpg?subformat=icon-1440 |
856 | 4 | _ | |y OpenAccess |x icon-180 |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.jpg?subformat=icon-180 |
856 | 4 | _ | |y OpenAccess |x icon-640 |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.jpg?subformat=icon-640 |
909 | C | O | |o oai:juser.fz-juelich.de:1021653 |p openaire |p open_access |p VDB |p driver |p dnbdelivery |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)192151 |
910 | 1 | _ | |a RWTH Aachen |0 I:(DE-588b)36225-6 |k RWTH |b 0 |6 P:(DE-Juel1)192151 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)172025 |
910 | 1 | _ | |a RWTH Aachen |0 I:(DE-588b)36225-6 |k RWTH |b 1 |6 P:(DE-Juel1)172025 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 2 |6 P:(DE-Juel1)172097 |
913 | 1 | _ | |a DE-HGF |b Forschungsbereich Energie |l Energiesystemdesign (ESD) |1 G:(DE-HGF)POF4-110 |0 G:(DE-HGF)POF4-112 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-100 |4 G:(DE-HGF)POF |v Digitalisierung und Systemtechnik |9 G:(DE-HGF)POF4-1121 |x 0 |
914 | 1 | _ | |y 2023 |
915 | _ | _ | |a OpenAccess |0 StatID:(DE-HGF)0510 |2 StatID |
920 | _ | _ | |l yes |
920 | 1 | _ | |0 I:(DE-Juel1)IEK-10-20170217 |k IEK-10 |l Modellierung von Energiesystemen |x 0 |
980 | 1 | _ | |a FullTexts |
980 | _ | _ | |a preprint |
980 | _ | _ | |a VDB |
980 | _ | _ | |a UNRESTRICTED |
980 | _ | _ | |a I:(DE-Juel1)IEK-10-20170217 |
981 | _ | _ | |a I:(DE-Juel1)ICE-1-20170217 |