Safe Reinforcement Learning for Real Robots

School of Electrical Engineering | Master's thesis

Date

2023-12-11

Major/Subject

Autonomous Systems

Mcode

ELEC3055

Degree programme

Master's Programme in ICT Innovation

Language

en

Pages

69

Abstract

This thesis explores Safe Reinforcement Learning (SRL), a subset of reinforcement learning that emphasizes the safety of the agent during the learning process, and applies it to robotics using the Trust Region Conditional Value at Risk (TRC) algorithm. The primary objectives are to train an SRL model to navigate safely in a complex environment and to bridge the sim-to-real gap, allowing a smooth transfer from computer simulations to real-world environments. The main challenge in SRL is keeping the agent safe throughout the learning process while maintaining good performance despite the uncertainties and dynamic variables present in real-world environments. Simulated training used the SafetyGym simulator, which is built on the MuJoCo physics engine. Real-world tests used the Robot Operating System (ROS) on a TurtleBot 2i, a versatile mobile robot platform equipped with a range of sensors, including the SICK TIM551 LiDAR, which measures distances for perception. Several methods were explored to meet these objectives, and Domain Randomization (DR) emerged as the top choice. DR randomizes the parameters of the simulation environment during training to help the model generalize to the real world. Although the model trained without DR learned three times faster in simulation, it struggled in real-world tests. In the toughest test, it did not succeed even once. In contrast, the model trained with DR passed every time. This model was further refined with real-world training, which significantly improved its performance in challenging situations. Ultimately, this research highlights the value of DR in ensuring that robots can apply what they learn in simulation to the real world, especially in situations where safety is crucial.
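To illustrate the Domain Randomization idea described in the abstract, the following is a minimal sketch of per-episode parameter sampling. It is not the thesis implementation: the `SimParams` fields, the `RANGES` values, and the helper names `sample_params` and `noisy_scan` are all hypothetical and chosen only for illustration. The sketch uses the Python standard library and does not call SafetyGym, MuJoCo, or ROS.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Hypothetical simulator parameters; names are illustrative only."""
    friction: float          # ground friction coefficient
    lidar_noise_std: float   # std. dev. of additive range noise, in metres
    action_delay_steps: int  # control latency, in simulation steps


# Assumed (low, high) sampling ranges; these values are not from the thesis.
RANGES = {
    "friction": (0.5, 1.5),
    "lidar_noise_std": (0.0, 0.05),
    "action_delay_steps": (0, 3),
}


def sample_params(rng: random.Random) -> SimParams:
    """Draw one randomized simulator configuration for a training episode."""
    return SimParams(
        friction=rng.uniform(*RANGES["friction"]),
        lidar_noise_std=rng.uniform(*RANGES["lidar_noise_std"]),
        action_delay_steps=rng.randint(*RANGES["action_delay_steps"]),
    )


def noisy_scan(true_ranges, params: SimParams, rng: random.Random):
    """Add zero-mean Gaussian noise to each LiDAR range and clip at zero."""
    return [
        max(0.0, r + rng.gauss(0.0, params.lidar_noise_std))
        for r in true_ranges
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sampled configurations repeat
    for episode in range(3):
        params = sample_params(rng)  # new physics and sensor settings each episode
        scan = noisy_scan([1.0, 2.0, 3.0], params, rng)
        print(episode, params, scan)
```

In a training loop, a configuration like this would be drawn at the start of every episode and applied to the simulator before the policy interacts with it, so the policy never sees exactly the same dynamics or sensor noise twice.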

Supervisor

Pajarinen, Joni

Thesis advisor

Terra, Ahmad
Hata, Alberto

Keywords

artificial intelligence, robotics, reinforcement learning, learning, domain randomization, autonomous systems
