Incremental and Transfer Learning of Contextual Skill Model for Robots
School of Electrical Engineering | Doctoral thesis (article-based) | Defence date: 2019-11-20
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Pages: 200 + app. 70
Aalto University publication series DOCTORAL DISSERTATIONS, 200/2019
Abstract: The thesis studies building blocks for robot skill learning. Using these key components, learning frameworks can be constructed that give robots the capability to acquire motion and manipulation skills autonomously. We study skill learning in two contexts: in-contact and free-space motions. In brief, this thesis investigates how to: (1) learn a policy for in-contact tasks; (2) generalize a free-space motion policy to new situations using a contextual skill model (CSM); and (3) transfer the CSM from simulation to the real world.
Learning an in-contact task such as wood planing from scratch can be time-consuming and dangerous. This problem can be avoided by imitating a policy from a human demonstration. However, mere imitation may not satisfy the objective of the corresponding in-contact task. The thesis proposes a reinforcement learning (RL) framework for improving the performance of an imitated in-contact policy. Policy search for in-contact tasks is achieved by making the motion compliant, which allows for exploration in the force profile.
Generalizing a policy to new situations is fundamental to skill learning, as it alleviates the need to learn a new policy in every novel situation. Generalizing a policy refers to synthesizing a function mapping the policy to new situations. The function is referred to as a contextual policy or contextual skill model (CSM). The thesis proposes a parametric CSM. Experiments demonstrated that the parametric CSM can extract a global pattern from a database (DB) of policy parameters, leading to significantly better extrapolation capability than with non-parametric CSMs. Furthermore, the underlying model of the CSM is fitted to the DB using a novel model selection approach to better represent the underlying regularities of the task.
To speed up learning, the prediction uncertainty of the CSM is calculated using empirical Bayes (EB) and employed to guide the exploration process of a model-free policy search. In addition, the most promising task is selected using a novel task manager, allowing for better future generalization performance with minimum effort. In essence, the thesis presents an incremental learning framework, the main components of which are as follows: CSM, policy search, model selection, DB, EB, and a task manager implemented using active learning.
Learning a policy in a simulated environment and transferring it to the real world alleviates the need to learn from scratch or from a demonstration. The thesis proposes to transfer a CSM instead of transferring a single control policy. We developed a simulation-to-real transfer framework which learns a source CSM in simulation incrementally and transfers it to the real world incrementally. The source CSM is transferred using sample policies from the target environment. Experiments indicated that one sample policy is sufficient to transfer a CSM to the target environment. The target CSM achieved significantly better extrapolation capability than zero-shot transfer.
Supervising professor: Kyrki, Ville, Prof., Aalto University, Department of Electrical Engineering and Automation, Finland
Keywords: robotics, reinforcement learning, active incremental learning, transfer learning
[Publication 1]: Murtaza Hazara, Ville Kyrki. Reinforcement learning for improving imitated in-contact skills. In International Conference on Humanoid Robots (Humanoids), Cancun, Mexico. pp. 194-201, 11, 2016.
Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201612165983
DOI: 10.1109/HUMANOIDS.2016.7803277
[Publication 2]: Jens Lundell, Murtaza Hazara, Ville Kyrki. Generalizing Movement Primitives to New Situations. In Towards Autonomous Robotic Systems (TAROS), pp. 16-31, 7, 2017.
Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201905062923
DOI: 10.1007/978-3-319-64107-2_2
[Publication 3]: Murtaza Hazara, Ville Kyrki. Model selection for incremental learning of generalizable movement primitives. In 18th International Conference on Advanced Robotics (ICAR), pp. 359-366, 7, 2017.
Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201905062901
DOI: 10.1109/ICAR.2017.8023633
[Publication 4]: Murtaza Hazara, Ville Kyrki. Speeding Up Incremental Learning Using Data Efficient Guided Exploration. In International Conference on Robotics and Automation (ICRA), pp. 1-8, 5, 2018.
Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201901301489
DOI: 10.1109/ICRA.2018.8461241
[Publication 5]: Murtaza Hazara, Ville Kyrki. Transferring Generalizable Motor Primitives From Simulation to Real World. Robotics and Automation Letters, pp. 2172-2179, 4, 2019.
Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201905062782
DOI: 10.1109/LRA.2019.2900768
[Publication 6]: Murtaza Hazara, Xiaopu Li, Ville Kyrki. Active Incremental Learning of a Contextual Skill Model. Submitted to International Conference on Intelligent Robots and Systems (IROS), 2019 IEEE.