End-to-End Reinforcement Learning of Koopman Models for Economic Nonlinear Model Predictive Control

Mayfrank, Daniel; Dahmen, Manuel; Mitsos, Alexander

doi:10.48550/ARXIV.2308.01674

Marc 21

001			1021653
005			20240712112903.0
024	7	_	|a 10.48550/ARXIV.2308.01674 |2 doi
024	7	_	|a 10.34734/FZJ-2024-00909 |2 datacite_doi
037	_	_	|a FZJ-2024-00909
100	1	_	|a Mayfrank, Daniel |0 P:(DE-Juel1)192151 |b 0 |u fzj
245	_	_	|a End-to-End Reinforcement Learning of Koopman Models for Economic Nonlinear Model Predictive Control
260	_	_	|c 2023 |b arXiv
336	7	_	|a Preprint |b preprint |m preprint |0 PUB:(DE-HGF)25 |s 1706103588_7712 |2 PUB:(DE-HGF)
336	7	_	|a WORKING_PAPER |2 ORCID
336	7	_	|a Electronic Article |0 28 |2 EndNote
336	7	_	|a preprint |2 DRIVER
336	7	_	|a ARTICLE |2 BibTeX
336	7	_	|a Output Types/Working Paper |2 DataCite
520	_	_	|a (Economic) nonlinear model predictive control ((e)NMPC) requires dynamic system models that are sufficiently accurate in all relevant state-space regions. These models must also be computationally cheap enough to ensure real-time tractability. Data-driven surrogate models for mechanistic models can be used to reduce the computational burden of (e)NMPC; however, such models are typically trained by system identification for maximum average prediction accuracy on simulation samples and perform suboptimally as part of actual (e)NMPC. We present a method for end-to-end reinforcement learning of dynamic surrogate models for optimal performance in (e)NMPC applications, resulting in predictive controllers that strike a favorable balance between control performance and computational demand. We validate our method on two applications derived from an established nonlinear continuous stirred-tank reactor model. We compare the controller performance to that of MPCs utilizing models trained by the prevailing maximum prediction accuracy paradigm, and model-free neural network controllers trained using reinforcement learning. We show that our method matches the performance of the model-free neural network controllers while consistently outperforming models derived from system identification. Additionally, we show that the MPC policies can react to changes in the control setting without retraining.
536	_	_	|a 1121 - Digitalization and Systems Technology for Flexibility Solutions (POF4-112) |0 G:(DE-HGF)POF4-1121 |c POF4-112 |f POF IV |x 0
536	_	_	|a HDS LEE - Helmholtz School for Data Science in Life, Earth and Energy (HDS LEE) (HDS-LEE-20190612) |0 G:(DE-Juel1)HDS-LEE-20190612 |c HDS-LEE-20190612 |x 1
588	_	_	|a Dataset connected to DataCite
650	_	7	|a Machine Learning (cs.LG) |2 Other
650	_	7	|a Systems and Control (eess.SY) |2 Other
650	_	7	|a FOS: Computer and information sciences |2 Other
650	_	7	|a FOS: Electrical engineering, electronic engineering, information engineering |2 Other
700	1	_	|a Mitsos, Alexander |0 P:(DE-Juel1)172025 |b 1 |u fzj
700	1	_	|a Dahmen, Manuel |0 P:(DE-Juel1)172097 |b 2 |e Corresponding author |u fzj
773	_	_	|a 10.48550/ARXIV.2308.01674
856	4	_	|y OpenAccess |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.pdf
856	4	_	|y OpenAccess |x icon |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.gif?subformat=icon
856	4	_	|y OpenAccess |x icon-1440 |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.jpg?subformat=icon-1440
856	4	_	|y OpenAccess |x icon-180 |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.jpg?subformat=icon-180
856	4	_	|y OpenAccess |x icon-640 |u https://juser.fz-juelich.de/record/1021653/files/2308.01674.jpg?subformat=icon-640
909	C	O	|o oai:juser.fz-juelich.de:1021653 |p openaire |p open_access |p VDB |p driver |p dnbdelivery
910	1	_	|a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)192151
910	1	_	|a RWTH Aachen |0 I:(DE-588b)36225-6 |k RWTH |b 0 |6 P:(DE-Juel1)192151
910	1	_	|a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)172025
910	1	_	|a RWTH Aachen |0 I:(DE-588b)36225-6 |k RWTH |b 1 |6 P:(DE-Juel1)172025
910	1	_	|a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 2 |6 P:(DE-Juel1)172097
913	1	_	|a DE-HGF |b Forschungsbereich Energie |l Energiesystemdesign (ESD) |1 G:(DE-HGF)POF4-110 |0 G:(DE-HGF)POF4-112 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-100 |4 G:(DE-HGF)POF |v Digitalisierung und Systemtechnik |9 G:(DE-HGF)POF4-1121 |x 0
914	1	_	|y 2023
915	_	_	|a OpenAccess |0 StatID:(DE-HGF)0510 |2 StatID
920	_	_	|l yes
920	1	_	|0 I:(DE-Juel1)IEK-10-20170217 |k IEK-10 |l Modellierung von Energiesystemen |x 0
980	1	_	|a FullTexts
980	_	_	|a preprint
980	_	_	|a VDB
980	_	_	|a UNRESTRICTED
980	_	_	|a I:(DE-Juel1)IEK-10-20170217
981	_	_	|a I:(DE-Juel1)ICE-1-20170217
