SafeAPT: Safe Simulation-to-Real Robot Learning Using Diverse Policies Learned in Simulation

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Kaushik, Rituraj [en_US]
dc.contributor.author: Arndt, Karol [en_US]
dc.contributor.author: Kyrki, Ville [en_US]
dc.contributor.department: Department of Electrical Engineering and Automation [en]
dc.contributor.groupauthor: Intelligent Robotics [en]
dc.date.accessioned: 2022-08-10T08:24:13Z
dc.date.available: 2022-08-10T08:24:13Z
dc.date.issued: 2022-07-01 [en_US]
dc.description: Publisher Copyright: © 2016 IEEE.
dc.description.abstract: The framework of sim-to-real learning, i.e., training policies in simulation and transferring them to real-world systems, is one of the most promising approaches towards data-efficient learning in robotics. However, due to the inevitable reality gap between the simulation and the real world, a policy learned in the simulation may not always generate a safe behaviour on the real robot. As a result, during policy adaptation in the real world, the robot may damage itself or cause harm to its surroundings. In this work, we introduce SafeAPT, a multi-goal robot learning algorithm that leverages a diverse repertoire of policies evolved in simulation and transfers the most promising safe policy to the real robot through episodic interaction. To achieve this, SafeAPT iteratively learns probabilistic reward and safety models from real-world observations using simulated experiences as priors. Then, it performs Bayesian optimization to select the best policy from the repertoire with the reward model, while maintaining the specified safety constraint using the safety model. SafeAPT allows a robot to adapt to a wide range of goals safely with the same repertoire of policies evolved in the simulation. We compare SafeAPT with several baselines, both in simulated and real robotic experiments, and show that SafeAPT finds high-performing policies within a few minutes of real-world operation while minimizing safety violations during the interactions. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 8
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Kaushik, R, Arndt, K & Kyrki, V 2022, 'SafeAPT: Safe Simulation-to-Real Robot Learning Using Diverse Policies Learned in Simulation', IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 6838-6845. https://doi.org/10.1109/LRA.2022.3177294 [en]
dc.identifier.doi: 10.1109/LRA.2022.3177294 [en_US]
dc.identifier.issn: 2377-3766
dc.identifier.issn: 2377-3774
dc.identifier.other: PURE UUID: aebd07e0-ed1b-4218-b353-88f462d1bf3e [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/aebd07e0-ed1b-4218-b353-88f462d1bf3e [en_US]
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/85707006/SafeAPT_Safe_Simulation_to_Real_Robot_Learning_Using_Diverse_Policies_Learned_in_Simulation.pdf
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/115881
dc.identifier.urn: URN:NBN:fi:aalto-202208104703
dc.language.iso: en [en]
dc.publisher: IEEE
dc.relation.ispartofseries: IEEE Robotics and Automation Letters [en]
dc.relation.ispartofseries: Volume 7, issue 3, pp. 6838-6845 [en]
dc.rights: openAccess [en]
dc.subject.keyword: Evolutionary robotics [en_US]
dc.subject.keyword: learning from experience [en_US]
dc.subject.keyword: machine learning for robot control [en_US]
dc.title: SafeAPT: Safe Simulation-to-Real Robot Learning Using Diverse Policies Learned in Simulation [en]
dc.type: A1 Original article in a scientific journal [translated from fi]
dc.type.version: publishedVersion
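
The selection loop described in the abstract (probabilistic reward and safety models with simulated values as priors, then a safety-constrained Bayesian-optimization choice from the repertoire) can be illustrated with a minimal sketch. This is not the authors' implementation: the RBF kernel, the confidence-bound acquisition rule, the fallback when no policy looks safe, the toy data, and all names (`gp_posterior`, `select_policy`, `adapt`, `real_eval`, `s_min`, `beta`) are assumptions made for illustration only.

```python
import numpy as np

def rbf(a, b, amp=0.1, length=0.3):
    """Squared-exponential kernel between two sets of policy descriptors."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return amp * np.exp(-0.5 * d2 / length**2)

def gp_posterior(X, prior_mean, idx, y, amp=0.1, noise=1e-3):
    """GP posterior over the whole repertoire, using simulated values as prior mean."""
    if len(idx) == 0:
        return prior_mean.copy(), np.full(len(X), np.sqrt(amp))
    Xo = X[idx]
    K = rbf(Xo, Xo, amp) + noise * np.eye(len(idx))
    Ks = rbf(X, Xo, amp)
    resid = np.asarray(y) - prior_mean[idx]  # real observation minus simulated prior
    mean = prior_mean + Ks @ np.linalg.solve(K, resid)
    var = amp - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def select_policy(r_mean, r_std, s_mean, s_std, s_min, beta=1.0):
    """Maximize the reward upper bound among policies whose safety lower bound >= s_min."""
    safety_lcb = s_mean - beta * s_std
    safe = safety_lcb >= s_min
    if not safe.any():
        # Assumed fallback: no policy is confidently safe, so take the safest-looking one.
        return int(np.argmax(safety_lcb))
    ucb = np.where(safe, r_mean + beta * r_std, -np.inf)
    return int(np.argmax(ucb))

def adapt(X, r_sim, s_sim, real_eval, s_min, episodes=10):
    """Episodic loop: update both models, pick a policy, try it on the (toy) real system."""
    idx, rewards, safeties = [], [], []
    for _ in range(episodes):
        r_mean, r_std = gp_posterior(X, r_sim, idx, rewards)
        s_mean, s_std = gp_posterior(X, s_sim, idx, safeties)
        k = select_policy(r_mean, r_std, s_mean, s_std, s_min)
        r, s = real_eval(k)
        idx.append(k)
        rewards.append(r)
        safeties.append(s)
    return idx, rewards, safeties

# Toy repertoire: 200 policies described by 2-D descriptors (hypothetical data).
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
r_sim = -np.linalg.norm(X - 0.7, axis=1)  # simulated reward prior
s_sim = 1.0 - X[:, 0]                     # simulated safety prior

def real_eval(k):
    """Stand-in for a real-robot episode; deliberately differs from the simulation."""
    reward = -np.linalg.norm(X[k] - np.array([0.6, 0.6]))
    safety = 0.9 - X[k, 0]
    return reward, safety

idx, rewards, safeties = adapt(X, r_sim, s_sim, real_eval, s_min=0.2)
print("best observed reward:", max(rewards))
```

In this sketch the same repertoire `X` could be reused for a different goal by swapping only `r_sim` and `real_eval`, which mirrors the multi-goal claim in the abstract at a toy scale.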

Files

Original bundle

Name: SafeAPT_Safe_Simulation_to_Real_Robot_Learning_Using_Diverse_Policies_Learned_in_Simulation.pdf
Size: 1.78 MB
Format: Adobe Portable Document Format