Reinforcement Learning for Hydrobatic AUVs
School of Electrical Engineering
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Date
2022-12-12
Major/Subject
Autonomous Systems
Mcode
AUS
Degree programme
Master's Programme in ICT Innovation
Language
en
Pages
60+8
Abstract
This master's thesis focuses on developing a Reinforcement Learning (RL) controller that successfully performs hydrobatic maneuvers on an Autonomous Underwater Vehicle (AUV). It also analyzes the robustness of the RL controller and compares RL algorithms with Proportional Integral Derivative (PID) control. The algorithms are initially trained in a NumPy-based simulation written in Python. We show how to model the Equations of Motion (EOM) of the AUV and how to use them to train the RL controllers. We use the Stable-Baselines3 RL framework and create a training environment with OpenAI Gym. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm offers good performance in the simulation. The following maneuvers are studied: trim control, waypoint following, and an inverted pendulum. We test the maneuvers in both the NumPy simulation and the Stonefish simulator. We also test the robustness of the RL trim controller by simulating noise in the state feedback. Lastly, we run the RL trim controller on the real SAM AUV hardware. We show that the RL algorithm trained in the NumPy simulation can achieve performance similar to that of the PID controller in the Stonefish simulator. We generate a policy that can perform trim control and the inverted pendulum maneuver in the NumPy simulation. We show that a robust policy that executes other types of maneuvers can be generated by providing a parameterized cost function to the RL algorithm. We discuss the results of every maneuver performed with the SAM AUV, as well as the advantages and disadvantages of this control method when applied to underwater robotics. We conclude that RL can be used to create policies that perform hydrobatic maneuvers. This data-driven approach can be applied in the future to more complex problems in underwater robotics.
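To make the ingredients named in the abstract more concrete, the following is a minimal sketch of a Gym-style training environment with noisy state feedback, a parameterized cost, and a PID baseline. Everything in it is an assumption made for illustration: the one-degree-of-freedom pitch model, the coefficient values, the `CostWeights` and `PitchTrimEnv` names, and the PID gains are hypothetical and are not taken from the thesis or from the SAM AUV. The sketch uses only the Python standard library rather than NumPy or Gym, so that it runs without extra dependencies.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class CostWeights:
    """Hypothetical weights that parameterize the cost for a given maneuver."""
    angle: float = 1.0    # penalty on pitch error
    rate: float = 0.1     # penalty on pitch rate
    effort: float = 0.01  # penalty on actuator effort


class PitchTrimEnv:
    """Toy 1-DOF pitch model with a Gym-like reset/step interface (illustrative only)."""

    def __init__(self, target=0.0, weights=None, noise_std=0.0, dt=0.05, seed=0):
        self.target = target
        self.weights = weights or CostWeights()
        self.noise_std = noise_std
        self.dt = dt
        self.rng = random.Random(seed)
        # Assumed coefficients, not the real vehicle parameters.
        self.inertia, self.damping, self.restoring = 1.0, 0.8, 2.0
        self.theta, self.q = 0.0, 0.0

    def _observe(self):
        # Noise corrupts only the state feedback, never the true state.
        return (self.theta + self.rng.gauss(0.0, self.noise_std),
                self.q + self.rng.gauss(0.0, self.noise_std))

    def reset(self, theta0=0.5):
        self.theta, self.q = theta0, 0.0
        return self._observe()

    def step(self, action):
        u = max(-1.0, min(1.0, action))  # actuator saturation
        # Simplified EOM: I * q_dot = u - d * q - k * sin(theta)
        q_dot = (u - self.damping * self.q
                 - self.restoring * math.sin(self.theta)) / self.inertia
        self.q += q_dot * self.dt        # semi-implicit Euler integration
        self.theta += self.q * self.dt
        err = self.theta - self.target
        w = self.weights
        cost = w.angle * err ** 2 + w.rate * self.q ** 2 + w.effort * u ** 2
        return self._observe(), -cost, False, {}  # reward is the negative cost


def run_pid(env, kp=4.0, ki=0.5, kd=1.5, steps=400):
    """Roll out a PID baseline; returns the final true pitch and the summed reward."""
    obs = env.reset()
    integral, total_reward = 0.0, 0.0
    for _ in range(steps):
        theta_meas, q_meas = obs
        err = env.target - theta_meas
        integral += err * env.dt
        action = kp * err + ki * integral - kd * q_meas
        obs, reward, _, _ = env.step(action)
        total_reward += reward
    return env.theta, total_reward


if __name__ == "__main__":
    for std in (0.0, 0.02):
        theta, ret = run_pid(PitchTrimEnv(noise_std=std))
        print(f"noise_std={std}: final pitch {theta:.4f} rad, return {ret:.3f}")
```

An RL agent would replace `run_pid` by choosing `action` from a learned policy. In a setup like the one the abstract describes, the same dynamics would be wrapped in a Gym environment and passed to an off-policy algorithm such as TD3, and changing `CostWeights` or `target` is one way a cost could be parameterized for different maneuvers.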
Supervisor
Ögren, Petter
Thesis advisor
Bhat, Sriharsha
Keywords
deep reinforcement learning, deep learning, optimal control, hydrobatics