Title: | Safe and efficient transfer of robot policies from simulation to the real world |
Author(s): | Arndt, Karol |
Date: | 2023 |
Language: | en |
Pages: | 100 + app. 110 |
Department: | Department of Electrical Engineering and Automation |
ISBN: | 978-952-64-1233-7 (electronic), 978-952-64-1232-0 (printed) |
Series: | Aalto University publication series DOCTORAL THESES, 55/2023 |
ISSN: | 1799-4942 (electronic), 1799-4934 (printed), 1799-4934 (ISSN-L) |
Supervising professor(s): | Kyrki, Ville, Prof., Aalto University, Department of Electrical Engineering and Automation, Finland |
Subject: | Electrical engineering |
Keywords: | robotics, machine learning, reinforcement learning |
Archive: | yes |
Abstract: The past decade has witnessed enormous progress in reinforcement learning, with intelligent agents learning to perform a variety of tasks, including locomotion, imitating human behavior, and even outperforming human experts in board games and video games of varying complexity, such as Pong, Go, and Dota 2. However, all these tasks share one common characteristic: they are either performed entirely in simulation or are based on simple rules that can be perfectly modeled in software. Furthermore, current reinforcement learning approaches that perform well in virtual environments cannot be directly applied to physical agents operating in the real world, such as robots, because they rely on massive data collection. As a result, the training process not only takes a long time, resulting in hardware wear, but often involves a safety risk associated with active exploration: the agent must evaluate a large number of possible actions, some of which can lead to catastrophic outcomes, before deciding on the best one.
Parts:
[Publication 1]: Aleksi Hämäläinen, Karol Arndt, Ali Ghadirzadeh and Ville Kyrki. Affordance Learning for End-to-end Visuomotor Control. In International Conference on Intelligent Robots and Systems (IROS), Macau, China, pp. 1781–1788, November 2019.
[Publication 2]: Karol Arndt, Murtaza Hazara, Ali Ghadirzadeh and Ville Kyrki. Meta Reinforcement Learning for Sim-to-Real Domain Adaptation. In International Conference on Robotics and Automation (ICRA), Paris, France, pp. 2725–2731, May 2020. DOI: 10.1109/ICRA40945.2020.9196540
[Publication 3]: Karol Arndt, Ali Ghadirzadeh, Murtaza Hazara and Ville Kyrki. Few-shot Model-based Adaptation in Noisy Conditions. Robotics and Automation Letters (RA-L), vol. 6, issue 2, pp. 4193–4200, April 2021. Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-202105056514. DOI: 10.1109/LRA.2021.3068104
[Publication 4]: Karol Arndt, Oliver Struckmeier and Ville Kyrki. Domain Curiosity: Learning Efficient Data Collection Strategies for Domain Adaptation. In International Conference on Intelligent Robots and Systems (IROS), Prague, Czechia, pp. 1259–1266, October 2021. Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-202202091809. DOI: 10.1109/IROS51168.2021.9635864
[Publication 5]: Rituraj Kaushik, Karol Arndt and Ville Kyrki. SafeAPT: Safe Simulation-to-Real Robot Learning using Diverse Policies Learned in Simulation. Robotics and Automation Letters (RA-L), vol. 7, issue 3, pp. 6838–6845, July 2022. Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-202208104703. DOI: 10.1109/LRA.2022.3177294
[Publication 6]: Gabriele Tiboni, Karol Arndt and Ville Kyrki. DROPO: Sim-to-Real Transfer with Offline Domain Randomization. Submitted for publication, June 2022.
[Publication 7]: Ali Ghadirzadeh, Petra Poklukar, Karol Arndt, Chelsea Finn, Ville Kyrki, Danica Kragic and Mårten Björkman. Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models. Journal of Machine Learning Research (JMLR), vol. 23 (174), pp. 1–37, June 2022. Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-202208174921.
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.