Safe Reinforcement Learning for Real Robots
School of Electrical Engineering
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Date
2023-12-11
Department
Major/Subject
Autonomous Systems
Mcode
ELEC3055
Degree programme
Master's Programme in ICT Innovation
Language
en
Pages
69
Series
Abstract
This thesis explores Safe Reinforcement Learning (SRL), a subset of reinforcement learning that emphasizes the safety of the agent during the learning process, and focuses on its application in robotics by implementing the Trust Region Conditional Value at Risk (TRC) algorithm. The primary objectives are to train an SRL model to navigate safely in a complex environment and to bridge the sim-to-real gap, allowing a smooth transfer from computer simulation to real-world environments. The main challenge in SRL is ensuring the agent’s safety throughout the learning process while maintaining optimal performance despite the uncertainties and dynamic variables present in real-world environments. For simulated training, the SafetyGym simulator, which is built on the MuJoCo physics engine, was used. Real-world tests were run on the Robot Operating System (ROS) using a TurtleBot 2i, a versatile mobile robot platform equipped with a range of sensors, including the SICK TIM551 LiDAR, which accurately measures distances for perception. Several methods were explored to address the objectives, with Domain Randomization (DR) emerging as the top choice. DR randomizes the parameters of the simulation environment during training to help the model generalize better to the real world. Although the model trained without DR learned three times faster in simulation, it struggled in real-world tests; in the toughest test, it did not succeed even once. In contrast, the model trained with DR passed every time. This model was further refined with real-world training, showing significant improvement in challenging situations. Ultimately, this research highlights the value of DR in ensuring that robots can apply what they learn in simulation to the real world, especially in situations where safety is crucial.
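As an illustration of the domain randomization idea described in the abstract, the following is a minimal sketch in which a fresh set of simulation parameters is drawn for every training episode. The parameter names (`friction`, `lidar_noise_std`, `motor_gain`), their ranges, and the helper functions are hypothetical and chosen only for illustration. The abstract does not state which parameters were randomized in the thesis, and the sketch does not call any SafetyGym or MuJoCo API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Hypothetical simulation parameters (not taken from the thesis)."""
    friction: float
    lidar_noise_std: float
    motor_gain: float


# Assumed (low, high) ranges for each parameter; purely illustrative values.
PARAM_RANGES = {
    "friction": (0.5, 1.5),
    "lidar_noise_std": (0.0, 0.05),
    "motor_gain": (0.8, 1.2),
}


def sample_sim_params(rng: random.Random) -> SimParams:
    """Draw one parameter set uniformly from the assumed ranges."""
    return SimParams(
        **{name: rng.uniform(low, high) for name, (low, high) in PARAM_RANGES.items()}
    )


def randomized_episodes(num_episodes: int, seed: int = 0) -> list:
    """Return one independently sampled parameter set per training episode."""
    rng = random.Random(seed)
    return [sample_sim_params(rng) for _ in range(num_episodes)]


if __name__ == "__main__":
    # Each episode would configure the simulator with its own parameters
    # before the policy is rolled out.
    for episode, params in enumerate(randomized_episodes(3)):
        print(episode, params)
```

In a training loop, each sampled `SimParams` would be applied to the simulator before the episode starts, so the policy never sees exactly the same dynamics or sensor noise twice.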
Supervisor
Pajarinen, Joni
Thesis advisor
Terra, Ahmad
Hata, Alberto
Keywords
artificial intelligence, robotics, reinforcement learning, learning, domain randomization, autonomous systems