Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning
Access rights
openAccess
publishedVersion
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
Language
en
Pages
15
Series
Proceedings of Machine Learning Research, Volume 242, pp. 1107-1121
Abstract
Offline reinforcement learning (RL) allows learning sequential behavior from fixed datasets. Since offline datasets do not cover all possible situations, many methods collect additional data during online fine-tuning to improve performance. In general, these methods assume that the transition dynamics remain the same during both the offline and online phases of training. However, in many real-world applications, such as outdoor construction and navigation over rough terrain, it is common for the transition dynamics to vary between the offline and online phases. Moreover, the dynamics may vary during online fine-tuning. To address this problem of changing dynamics from offline to online RL, we propose a residual learning approach that infers dynamics changes to correct the outputs of the offline solution. During the online fine-tuning phase, we train a context encoder to learn a representation that is consistent inside the current online learning environment while being able to predict dynamic transitions.
Description
Publisher Copyright: © 2024 M. Nakhaei, A. Scannell & J. Pajarinen.
Citation
Nakhaeinezhadfard, M, Scannell, A & Pajarinen, J 2024, 'Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning', Proceedings of Machine Learning Research, vol. 242, pp. 1107-1121. <https://proceedings.mlr.press/v242/nakhaeinezhadfard24a.html>