Reinforcement Learning for Hydrobatic AUVs

School of Electrical Engineering | Master's thesis

Date

2022-12-12

Major/Subject

Autonomous Systems

Mcode

AUS

Degree programme

Master's Programme in ICT Innovation

Language

en

Pages

60+8

Abstract

This master's thesis develops a Reinforcement Learning (RL) controller that successfully performs hydrobatic maneuvers on an Autonomous Underwater Vehicle (AUV). It also analyzes the robustness of the RL controller and compares RL algorithms with Proportional Integral Derivative (PID) control. The algorithms are initially trained in a NumPy-based simulation written in Python. We show how to model the Equations of Motion (EOM) of the AUV and how to use them to train the RL controllers. We use the Stable-Baselines3 RL framework and create a training environment with OpenAI Gym. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm performs well in the simulation. The following maneuvers are studied: trim control, waypoint following, and an inverted pendulum. We test the maneuvers in both the NumPy simulation and the Stonefish simulator. We also test the robustness of the RL trim controller by simulating noise in the state feedback. Lastly, we run the RL trim controller on the real SAM AUV hardware. We show that an RL algorithm trained in the NumPy simulation can achieve performance similar to that of the PID controller in the Stonefish simulator. We generate a policy that can perform trim control and the inverted pendulum maneuver in the NumPy simulation. We show that providing a parameterized cost function to the RL algorithm yields a robust policy that can execute other types of maneuvers. We discuss the results of every maneuver performed with the SAM AUV, along with the advantages and disadvantages of this control method when applied to underwater robotics. We conclude that RL can be used to create policies that perform hydrobatic maneuvers. This data-driven approach can be applied in the future to more complex problems in underwater robotics.
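
The workflow described in the abstract (model the dynamics, wrap them in a Gym-style environment, score behavior with a parameterized cost, and test robustness by adding noise to the state feedback) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the one-degree-of-freedom pitch model, its constants, the reward weights, and the names `ToyPitchTrimEnv`, `PIDPolicy`, and `run_episode` are hypothetical and are not taken from the thesis or from the SAM AUV. The sketch uses only the Python standard library and a PID baseline; in the thesis's setup, a learned policy such as TD3 would take the place of `PIDPolicy`.

```python
import math
import random


class ToyPitchTrimEnv:
    """Gym-style (reset/step) toy environment with 1-DOF pitch dynamics.

    All constants are illustrative assumptions, not SAM AUV parameters.
    """

    def __init__(self, target=0.2, noise_std=0.0, dt=0.05, horizon=600, seed=0):
        self.target = target        # desired pitch angle [rad]; parameterizes the cost
        self.noise_std = noise_std  # std of Gaussian noise on the state feedback
        self.dt = dt                # integration step [s]
        self.horizon = horizon      # episode length in steps
        self.rng = random.Random(seed)
        # Hypothetical inertia, damping, restoring moment, and actuator gain
        self.inertia, self.damping, self.restoring, self.gain = 1.0, 1.5, 2.0, 2.0
        self.theta, self.q, self.t = 0.0, 0.0, 0

    def _observe(self):
        # The controller only sees a noisy copy of the true state
        return (self.theta + self.rng.gauss(0.0, self.noise_std),
                self.q + self.rng.gauss(0.0, self.noise_std))

    def reset(self):
        self.theta, self.q, self.t = 0.0, 0.0, 0
        return self._observe()

    def step(self, action):
        u = max(-1.0, min(1.0, action))  # saturate the actuator command
        # Toy equation of motion: I * q_dot = -d * q - k * sin(theta) + b * u
        q_dot = (-self.damping * self.q
                 - self.restoring * math.sin(self.theta)
                 + self.gain * u) / self.inertia
        # Semi-implicit Euler integration
        self.q += self.dt * q_dot
        self.theta += self.dt * self.q
        self.t += 1
        error = self.theta - self.target
        # Cost penalizes pitch error and control effort (weights are assumptions)
        reward = -(error ** 2) - 0.01 * u ** 2
        done = self.t >= self.horizon
        return self._observe(), reward, done, {"pitch_error": error}


class PIDPolicy:
    """PID baseline acting on the (possibly noisy) observation."""

    def __init__(self, target, kp=4.0, ki=2.0, kd=1.0, dt=0.05):
        self.target, self.kp, self.ki, self.kd, self.dt = target, kp, ki, kd, dt
        self.integral = 0.0

    def __call__(self, obs):
        theta_meas, q_meas = obs
        error = self.target - theta_meas
        self.integral += error * self.dt
        # The derivative term uses the measured pitch rate directly
        return self.kp * error + self.ki * self.integral - self.kd * q_meas


def run_episode(env, policy):
    """Roll out one episode; return total reward and final absolute pitch error."""
    obs = env.reset()
    total_reward, done, info = 0.0, False, {}
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
    return total_reward, abs(info["pitch_error"])


# Compare the same controller with clean and with noisy state feedback
clean_reward, clean_err = run_episode(ToyPitchTrimEnv(noise_std=0.0), PIDPolicy(0.2))
noisy_reward, noisy_err = run_episode(ToyPitchTrimEnv(noise_std=0.02), PIDPolicy(0.2))
print(f"final pitch error: clean={clean_err:.5f} rad, noisy={noisy_err:.5f} rad")
```

Because a policy here is just a callable from observation to action, the same `run_episode` loop can score any controller against the same cost, which is one simple way to set up the kind of RL-versus-PID comparison the abstract describes. Changing `target` changes the cost function without touching the dynamics.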

Supervisor

Ögren, Petter

Thesis advisor

Bhat, Sriharsha

Keywords

deep reinforcement learning, deep learning, optimal control, hydrobatics
