Simulation-Aided Policy Tuning for Black-Box Robot Learning
| dc.contributor | Aalto-yliopisto | fi |
| dc.contributor | Aalto University | en |
| dc.contributor.author | He, Shiming | |
| dc.contributor.author | von Rohr, Alexander | |
| dc.contributor.author | Baumann, Dominik | |
| dc.contributor.author | Xiang, Ji | |
| dc.contributor.author | Trimpe, Sebastian | |
| dc.contributor.department | Department of Electrical Engineering and Automation | en |
| dc.contributor.groupauthor | Cyber-physical Systems | en |
| dc.contributor.organization | Hangzhou City University | |
| dc.contributor.organization | RWTH Aachen University | |
| dc.contributor.organization | Zhejiang University | |
| dc.date.accessioned | 2025-02-26T09:34:27Z | |
| dc.date.available | 2025-02-26T09:34:27Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | How can robots learn and adapt to new tasks and situations with little data? Systematic exploration and simulation are crucial tools for efficient robot learning. We present a novel black-box policy search algorithm focused on data-efficient policy improvements. The algorithm learns directly on the robot and treats simulation as an additional information source to speed up the learning process. At the core of the algorithm, a probabilistic model learns the dependence between the policy parameters and the robot learning objective not only by performing experiments on the robot, but also by leveraging data from a simulator. This substantially reduces interaction time with the robot. Using the model, we can guarantee improvements with high probability for each policy update, thereby facilitating fast, goal-oriented learning. We evaluate our algorithm on simulated fine-tuning tasks and demonstrate the data efficiency of the proposed dual-information-source optimization algorithm. In a real robot learning experiment, we show fast and successful task learning on a robot manipulator with the aid of an imperfect simulator. | en |
| dc.description.version | Peer reviewed | en |
| dc.format.extent | 16 | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.citation | He, S, von Rohr, A, Baumann, D, Xiang, J & Trimpe, S 2025, 'Simulation-Aided Policy Tuning for Black-Box Robot Learning', IEEE Transactions on Robotics, vol. 41. https://doi.org/10.1109/TRO.2025.3539192 | en |
| dc.identifier.doi | 10.1109/TRO.2025.3539192 | |
| dc.identifier.issn | 1552-3098 | |
| dc.identifier.issn | 1941-0468 | |
| dc.identifier.other | PURE UUID: a1cd0d24-9857-4165-a6d4-3f6790e8ec00 | |
| dc.identifier.other | PURE ITEMURL: https://research.aalto.fi/en/publications/a1cd0d24-9857-4165-a6d4-3f6790e8ec00 | |
| dc.identifier.other | PURE LINK: http://adsabs.harvard.edu/abs/2024arXiv241114246H | |
| dc.identifier.other | PURE FILEURL: https://research.aalto.fi/files/179900495/Simulation-Aided_Policy_Tuning_for_Black-Box_Robot_Learning.pdf | |
| dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/134325 | |
| dc.identifier.urn | URN:NBN:fi:aalto-202502262591 | |
| dc.language.iso | en | en |
| dc.publisher | IEEE | |
| dc.relation.ispartofseries | IEEE Transactions on Robotics | en |
| dc.relation.ispartofseries | Volume 41 | en |
| dc.rights | openAccess | en |
| dc.subject.keyword | Computer Science - Machine Learning | |
| dc.subject.keyword | Computer Science - Robotics | |
| dc.subject.keyword | Electrical Engineering and Systems Science - Systems and Control | |
| dc.title | Simulation-Aided Policy Tuning for Black-Box Robot Learning | en |
| dc.type | A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä | fi |
| dc.type.version | publishedVersion |
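The abstract describes a probabilistic model that relates policy parameters to the learning objective using two information sources: cheap, biased simulator rollouts and expensive robot experiments. A minimal sketch of that general idea (not the paper's actual algorithm) is a Gaussian process over the joint input (policy parameter, source flag), where the source flag is an extra kernel dimension so simulator data shifts the posterior for robot predictions while still allowing for simulator bias. The toy objectives, the kernel lengthscales, and the UCB-style acquisition below are all illustrative assumptions.

```python
import numpy as np

def rbf(A, B, ls):
    # ARD squared-exponential kernel over the rows of A and B.
    d2 = (((A[:, None, :] - B[None, :, :]) / ls) ** 2).sum(-1)
    return np.exp(-0.5 * d2)

class TwoSourceGP:
    """GP over inputs (policy parameter, source flag); flag 0.0 = simulator,
    flag 1.0 = robot. Treating the source as a kernel dimension lets dense,
    cheap simulator data inform robot predictions while the cross-source
    correlation absorbs simulator bias (a modeling assumption of this sketch)."""
    def __init__(self, ls=(0.2, 1.0), noise=1e-4):
        self.ls = np.asarray(ls)
        self.noise = noise
        self.X = np.empty((0, 2))
        self.y = np.empty(0)

    def add(self, theta, value, source):
        self.X = np.vstack([self.X, [theta, source]])
        self.y = np.append(self.y, value)

    def predict(self, theta, source):
        Xs = np.array([[theta, source]])
        K = rbf(self.X, self.X, self.ls) + self.noise * np.eye(len(self.y))
        k = rbf(self.X, Xs, self.ls)
        mean = (k.T @ np.linalg.solve(K, self.y)).item()
        var = (rbf(Xs, Xs, self.ls) - k.T @ np.linalg.solve(K, k)).item()
        return mean, max(var, 1e-12)

# Toy objectives: the simulator is deliberately imperfect (optimum shifted).
robot = lambda t: -(t - 0.30) ** 2   # expensive ground truth
sim = lambda t: -(t - 0.25) ** 2     # cheap, biased model

gp = TwoSourceGP()
for t in np.linspace(0, 1, 9):       # cheap: query the simulator densely
    gp.add(t, sim(t), 0.0)

best_theta = 0.9                     # poor initial policy on the robot
best_val = robot(best_theta)
gp.add(best_theta, best_val, 1.0)

cands = np.linspace(0, 1, 101)
for _ in range(5):                   # only a handful of real robot trials
    # UCB acquisition evaluated at the robot source: try parameters where
    # the combined posterior is promising or still uncertain.
    ucb = [m + 2.0 * v ** 0.5 for m, v in (gp.predict(c, 1.0) for c in cands)]
    theta = cands[int(np.argmax(ucb))]
    val = robot(theta)
    gp.add(theta, val, 1.0)
    if val > best_val:
        best_theta, best_val = theta, val

print(round(best_theta, 2))
```

With the simulator prior pulling the search toward the right region, the five robot trials land near the true optimum at 0.30 far faster than robot-only search would; this mirrors the abstract's claim only qualitatively, not the paper's high-probability improvement guarantee.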