Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning

Access rights

openAccess
publishedVersion

A4 Article in conference proceedings

Language

en

Pages

15

Series

Proceedings of Machine Learning Research, Volume 242, pp. 1107-1121

Abstract

Offline reinforcement learning (RL) allows learning sequential behavior from fixed datasets. Because offline datasets do not cover all possible situations, many methods collect additional data during online fine-tuning to improve performance. These methods generally assume that the transition dynamics remain the same during the offline and online phases of training. However, in many real-world applications, such as outdoor construction and navigation over rough terrain, the transition dynamics commonly vary between the offline and online phases, and they may also vary during online fine-tuning itself. To address changing dynamics from offline to online RL, we propose a residual learning approach that infers dynamics changes to correct the outputs of the offline solution. In the online fine-tuning phase, we train a context encoder to learn a representation that is consistent within the current online learning environment while being able to predict the transition dynamics.
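
As a rough illustration of the residual idea described in the abstract, the following minimal sketch adds a context-conditioned correction to the output of a frozen offline policy. It is not the authors' implementation: the function names (`offline_policy`, `encode_context`, `residual`, `act`), the linear residual, and the mean-state-change "encoder" are all hypothetical stand-ins chosen only to keep the example small. In the paper the context encoder is trained, whereas here it is a fixed summary statistic.

```python
import math

def offline_policy(state):
    # Hypothetical frozen offline policy: a fixed linear map with tanh squashing.
    weights = [[0.5, -0.2], [0.1, 0.3]]
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]

def encode_context(transitions):
    # Hypothetical stand-in for a learned context encoder: the mean state
    # change (s' - s) over recent online transitions (s, a, s').
    n = len(transitions)
    dim = len(transitions[0][0])
    return [sum(s_next[i] - s[i] for s, _a, s_next in transitions) / n
            for i in range(dim)]

def residual(state, context, theta):
    # Assumed linear residual over the concatenated [state, context] features.
    features = list(state) + list(context)
    return [sum(t * f for t, f in zip(row, features)) for row in theta]

def act(state, transitions, theta):
    # Final action = offline action + context-dependent correction, clipped to [-1, 1].
    context = encode_context(transitions)
    base = offline_policy(state)
    correction = residual(state, context, theta)
    return [max(-1.0, min(1.0, b + c)) for b, c in zip(base, correction)]
```

With all residual parameters set to zero, `act` returns exactly the offline policy's action, so in this sketch online fine-tuning starts from the offline solution and only learns a correction on top of it.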

Description

Publisher Copyright: © 2024 M. Nakhaei, A. Scannell & J. Pajarinen.

Citation

Nakhaeinezhadfard, M, Scannell, A & Pajarinen, J 2024, 'Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning', Proceedings of Machine Learning Research, vol. 242, pp. 1107-1121. <https://proceedings.mlr.press/v242/nakhaeinezhadfard24a.html>