Simplified Temporal Consistency Reinforcement Learning
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
Date
2023-07
Language
en
Pages
20
42227-42246
Series
Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, Volume 202
Abstract
Reinforcement learning (RL) can solve complex sequential decision-making tasks but is currently limited by sample efficiency and required computation. To improve sample efficiency, recent work focuses on model-based RL, which interleaves model learning with planning. Recent methods further utilize policy learning, value estimation, and self-supervised learning as auxiliary objectives. In this paper, we show that, surprisingly, a simple representation learning approach relying only on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL. This applies when using pure planning with a dynamics model conditioned on the representation, but also when utilizing the representation as policy and value function features in model-free RL. In experiments, our approach learns an accurate dynamics model to solve challenging high-dimensional locomotion tasks with online planners while being 4.1× faster to train compared to ensemble-based methods. With model-free RL without planning, especially on high-dimensional tasks such as the DeepMind Control Suite Humanoid and Dog tasks, our approach outperforms model-free methods by a large margin and matches model-based methods’ sample efficiency while training 2.4× faster.
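The idea of training a latent dynamics model by latent temporal consistency can be illustrated with a minimal sketch. This is not the authors' implementation: the linear encoder and dynamics, the cosine-distance loss, the discount weighting, and the slowly updated target encoder are all assumptions chosen for illustration, and every function name below is hypothetical. The sketch encodes the first observation, rolls the latent state forward with the dynamics model, and compares each predicted latent with the target encoding of the observation that was actually seen.

```python
import math


def linear(weights, x):
    """Apply a bias-free dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def cosine_similarity(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)


def temporal_consistency_loss(enc_w, target_enc_w, dyn_w,
                              observations, actions, discount=0.9):
    """Multi-step latent consistency loss (assumed form, not from the paper).

    observations holds len(actions) + 1 vectors; actions[t] leads from
    observations[t] to observations[t + 1].
    """
    z = linear(enc_w, observations[0])  # online encoding of the first observation
    loss = 0.0
    for t, action in enumerate(actions):
        # Predict the next latent from the current latent and the action.
        z = linear(dyn_w, z + action)
        # Target latent from the (assumed) slowly updated target encoder.
        target = linear(target_enc_w, observations[t + 1])
        loss += (discount ** t) * (1.0 - cosine_similarity(z, target))
    return loss / len(actions)


def ema_update(target_w, online_w, tau=0.01):
    """Move target weights a fraction tau toward the online weights."""
    return [[(1.0 - tau) * t + tau * o for t, o in zip(t_row, o_row)]
            for t_row, o_row in zip(target_w, online_w)]


# Toy usage: 2-D observations, 2-D latents, 1-D actions.
identity = [[1.0, 0.0], [0.0, 1.0]]
keep_latent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # dynamics that ignores the action
obs = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
acts = [[0.5], [0.5]]
example_loss = temporal_consistency_loss(identity, identity, keep_latent, obs, acts)
```

In a real training loop this loss would be minimized with an automatic differentiation library, with no gradient flowing through the target encoder; the sketch only computes the loss value.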
Citation
Zhao, Y, Zhao, W, Boney, R, Kannala, J & Pajarinen, J 2023, Simplified Temporal Consistency Reinforcement Learning. in A Krause, E Brunskill, K Cho, B Engelhardt, S Sabato & J Scarlett (eds), Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, JMLR, pp. 42227-42246, International Conference on Machine Learning, Honolulu, Hawaii, United States, 23/07/2023. <https://proceedings.mlr.press/v202/zhao23k.html>