Generalizing Offline Reinforcement Learning to Unseen Dynamics Parameters with Synthetic Data

School of Electrical Engineering | Master's thesis

Date

2024-12-25

Major/Subject

Cloud and Network Infrastructures

Degree programme

Master's Programme in ICT Innovation

Language

en

Pages

39

Abstract

Reinforcement Learning (RL) has achieved remarkable performance in real-world industrial applications such as robotics and logistics. However, RL often struggles to adapt to diverse and changing contexts due to limited training data and poor generalization capabilities, and collecting sufficient real-world data is both costly and time-consuming, which hampers the development of adaptable RL systems. Context-aware RL algorithms address these issues by incorporating contextual information, but their ability to generalize to out-of-distribution (OOD) scenarios remains limited. Diffusion models, known for their strong generative capabilities, offer a promising way to enhance RL. In this thesis, we propose a method that leverages diffusion models to improve the sample efficiency and generalization ability of RL agents. We collect real data from the training of online RL agents under varying contexts and train diffusion models on this data. We then condition the diffusion models on specific contexts to guide the generation of transitions, and use these synthetic transitions to train offline RL agents, enabling them to perform effectively across diverse and unseen environments. Experimental results demonstrate that our method improves RL performance in OOD contexts while maintaining performance in in-distribution scenarios.
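At a high level, the pipeline described in the abstract amounts to training a context-conditioned diffusion model on transitions gathered under varying dynamics parameters, then sampling new transitions for a target (possibly unseen) context and handing them to an offline RL learner. The following is a minimal sketch of that idea, not the thesis implementation: the MLP denoiser, the DDPM noise schedule, the transition and context dimensions, and all hyperparameters are illustrative assumptions.

# Minimal sketch (assumptions only, not the thesis code): a denoising diffusion
# model conditioned on a context vector (e.g. a dynamics parameter) generates
# synthetic transitions (s, a, r, s') for an offline RL replay buffer.
import torch
import torch.nn as nn

STATE_DIM, ACT_DIM, CTX_DIM = 4, 2, 1            # assumed dimensions
TRANS_DIM = STATE_DIM + ACT_DIM + 1 + STATE_DIM  # flattened (s, a, r, s')
T = 100                                          # diffusion timesteps

betas = torch.linspace(1e-4, 2e-2, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class ContextDenoiser(nn.Module):
    """Predicts the noise added to a flattened transition, given timestep and context."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(TRANS_DIM + 1 + CTX_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, TRANS_DIM),
        )
    def forward(self, x_t, t, ctx):
        t_emb = t.float().unsqueeze(-1) / T      # simple scalar timestep embedding
        return self.net(torch.cat([x_t, t_emb, ctx], dim=-1))

def train_step(model, opt, real_transitions, contexts):
    """One denoising step on real transitions collected under varying contexts."""
    t = torch.randint(0, T, (real_transitions.shape[0],))
    noise = torch.randn_like(real_transitions)
    a_bar = alpha_bars[t].unsqueeze(-1)
    x_t = a_bar.sqrt() * real_transitions + (1 - a_bar).sqrt() * noise
    loss = ((model(x_t, t, contexts) - noise) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

@torch.no_grad()
def generate(model, ctx, n):
    """Ancestral sampling of n synthetic transitions guided by a target (possibly unseen) context."""
    x = torch.randn(n, TRANS_DIM)
    ctx = ctx.expand(n, CTX_DIM)
    for step in reversed(range(T)):
        t = torch.full((n,), step)
        eps = model(x, t, ctx)
        a, a_bar = alphas[step], alpha_bars[step]
        mean = (x - (1 - a) / (1 - a_bar).sqrt() * eps) / a.sqrt()
        x = mean + betas[step].sqrt() * torch.randn_like(x) if step > 0 else mean
    return x  # rows are flattened (s, a, r, s') tuples for the offline RL dataset

In such a sketch, the generated rows would be unflattened back into (state, action, reward, next-state) tuples and appended to the dataset used by whichever offline RL algorithm is being trained for the target context.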

Supervisor

Pajarinen, Joni

Thesis advisor

Scannell, Aidan
Shrestha, Jatan

Keywords

reinforcement learning, diffusion models, synthetic experience replay, sample-efficient learning, generalization, dynamic environments
