Mode-constrained Model-based Reinforcement Learning via Gaussian Processes
Access rights
openAccess
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
Date
2023
Language
en
Pages
16
3299-3314
Series
Proceedings of Machine Learning Research, Volume 206
Abstract
Model-based reinforcement learning (RL) algorithms do not typically consider environments with multiple dynamic modes, where it is beneficial to avoid inoperable or undesirable modes. We present a model-based RL algorithm that constrains training to a single dynamic mode with high probability. This is a difficult problem because the mode constraint is a hidden variable associated with the environment's dynamics. As such, 1) it is unknown a priori, and 2) we do not observe its output from the environment, so we cannot learn it with supervised learning. We present a nonparametric dynamic model which learns the mode constraint alongside the dynamic modes. Importantly, it learns latent structure that our planning scheme leverages to 1) enforce the mode constraint with high probability, and 2) escape local optima induced by the mode constraint. We validate our method by showing that it can solve a simulated quadcopter navigation task whilst providing a level of constraint satisfaction both during and after training.
Description
Funding Information: We thank ST John, Martin Trapp, Arno Solin, and Paul Chang for valuable discussions and feedback. This work was conducted whilst Aidan Scannell was a PhD student at the EPSRC Centre for Doctoral Training in Future Autonomous and Robotic Systems (FARSCOPE) at the Bristol Robotics Laboratory. It was finished whilst funded by the Finnish Center for Artificial Intelligence (FCAI).
Publisher Copyright: Copyright © 2023 by the author(s)
Citation
Scannell, A, Ek, C H & Richards, A 2023, 'Mode-constrained Model-based Reinforcement Learning via Gaussian Processes', Proceedings of Machine Learning Research, vol. 206, pp. 3299-3314. <https://proceedings.mlr.press/v206/scannell23a/scannell23a.pdf>