DROPO: Sim-to-real transfer with offline domain randomization
dc.contributor | Aalto-yliopisto | fi |
dc.contributor | Aalto University | en |
dc.contributor.author | Tiboni, Gabriele | en_US |
dc.contributor.author | Arndt, Karol | en_US |
dc.contributor.author | Kyrki, Ville | en_US |
dc.contributor.department | Department of Electrical Engineering and Automation | en |
dc.contributor.groupauthor | Intelligent Robotics | en |
dc.contributor.organization | Department of Electrical Engineering and Automation | en_US |
dc.date.accessioned | 2023-06-14T08:51:31Z | |
dc.date.available | 2023-06-14T08:51:31Z | |
dc.date.issued | 2023-08 | en_US |
dc.description | Funding Information: This work was supported by Academy of Finland grants 317020 and 328399. We acknowledge the computational resources generously provided by CSC – IT Center for Science, Finland, and by the Aalto Science-IT project. Publisher Copyright: © 2023 The Author(s) | |
dc.description.abstract | In recent years, domain randomization over dynamics parameters has gained a lot of traction as a method for sim-to-real transfer of reinforcement learning policies in robotic manipulation; however, finding optimal randomization distributions can be difficult. In this paper, we introduce DROPO, a novel method for estimating domain randomization distributions for safe sim-to-real transfer. Unlike prior work, DROPO only requires a limited, precollected offline dataset of trajectories, and explicitly models parameter uncertainty to match real data using a likelihood-based approach. We demonstrate that DROPO is capable of recovering dynamic parameter distributions in simulation and finding a distribution capable of compensating for an unmodeled phenomenon. We also evaluate the method in two zero-shot sim-to-real transfer scenarios, showing successful domain transfer and improved performance over prior methods. | en |
dc.description.version | Peer reviewed | en |
dc.format.extent | 15 | |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.citation | Tiboni, G, Arndt, K & Kyrki, V 2023, 'DROPO: Sim-to-real transfer with offline domain randomization', Robotics and Autonomous Systems, vol. 166, 104432. https://doi.org/10.1016/j.robot.2023.104432 | en |
dc.identifier.doi | 10.1016/j.robot.2023.104432 | en_US |
dc.identifier.issn | 0921-8890 | |
dc.identifier.other | PURE UUID: 607d09d3-16ec-4130-bd13-65ec37a2da86 | en_US |
dc.identifier.other | PURE ITEMURL: https://research.aalto.fi/en/publications/607d09d3-16ec-4130-bd13-65ec37a2da86 | en_US |
dc.identifier.other | PURE LINK: http://www.scopus.com/inward/record.url?scp=85160508715&partnerID=8YFLogxK | en_US |
dc.identifier.other | PURE FILEURL: https://research.aalto.fi/files/113445706/1_s2.0_S0921889023000714_main.pdf | en_US |
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/121448 | |
dc.identifier.urn | URN:NBN:fi:aalto-202306143825 | |
dc.language.iso | en | en |
dc.publisher | Elsevier Science | |
dc.relation.ispartofseries | Robotics and Autonomous Systems | en |
dc.relation.ispartofseries | Volume 166 | en |
dc.rights | openAccess | en |
dc.subject.keyword | Domain randomization | en_US |
dc.subject.keyword | Reinforcement learning | en_US |
dc.subject.keyword | Robot learning | en_US |
dc.subject.keyword | Transfer learning | en_US |
dc.title | DROPO: Sim-to-real transfer with offline domain randomization | en |
dc.type | A1 Original article in a scientific journal | en |
dc.type.version | publishedVersion |