Speeding Up Incremental Learning Using Data Efficient Guided Exploration
Access rights
openAccess, acceptedVersion
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
Authors
Hazara, M; Kyrki, V
Date
2018
Language
en
Pages
8
Series
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, ICRA 2018, pp. 5082-5089, IEEE International Conference on Robotics and Automation
Abstract
To cope with varying conditions, motor primitives (MPs) must support generalization over task parameters to avoid learning a separate primitive for each situation. Deterministic models have been proposed for generalizing MPs to new task parameters, but they provide limited generalization. Although generalization of MPs using probabilistic models has also been studied, it is not clear how such generalizable models can be learned efficiently. Reinforcement learning can be made more data efficient when the exploration process is tuned using data uncertainty, reducing unnecessary exploration. We propose an empirical Bayes method to predict uncertainty and use it to guide the exploration process of an incremental learning framework. The online incremental learning framework uses a single human demonstration to construct a database of MPs. The main ingredients of the proposed framework are a global parametric model (GPDMP) for generalizing MPs to new situations, a model-free policy search agent for optimizing predicted MPs that fail, model selection for controlling the complexity of GPDMP, and empirical Bayes for extracting the uncertainty of MP prediction. Experiments with a ball-in-a-cup task demonstrate that the global GPDMP model generalizes significantly better than linear models and Locally Weighted Regression, especially in terms of extrapolation capability. Furthermore, model selection successfully identified the required complexity of GPDMP even with few training samples while satisfying the principle of Occam's razor. Above all, the uncertainty predicted by the proposed empirical Bayes approach successfully guided the exploration process of the model-free policy search. The experiments indicated a statistically significant improvement in learning speed over covariance matrix adaptation (CMA), with a significance of p = 0.002.
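The core idea of uncertainty-guided exploration can be sketched as follows. This is a minimal illustration under assumptions, not the paper's implementation: it uses a one-dimensional policy parameter predicted from a scalar task parameter, a Bayesian linear model whose hyperparameters are chosen by a coarse empirical-Bayes evidence maximization, and simple stochastic hill climbing standing in for the model-free policy search. All data and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical database: (task parameter, tuned policy parameter) pairs
# obtained from earlier demonstrations/optimizations.
S = np.array([0.5, 1.0, 1.5, 2.0])
W = np.array([1.1, 2.1, 2.9, 4.2])

# Bayesian linear regression with basis phi(s) = [1, s].
Phi = np.stack([np.ones_like(S), S], axis=1)

def log_evidence(alpha, beta):
    """Marginal likelihood of the data for prior precision alpha
    and noise precision beta (standard evidence formula)."""
    A = alpha * np.eye(2) + beta * Phi.T @ Phi
    m = beta * np.linalg.solve(A, Phi.T @ W)
    n = len(W)
    return 0.5 * (2 * np.log(alpha) + n * np.log(beta)
                  - beta * np.sum((W - Phi @ m) ** 2) - alpha * m @ m
                  - np.log(np.linalg.det(A)) - n * np.log(2 * np.pi))

# Empirical Bayes: pick hyperparameters maximizing the evidence
# (a coarse grid search keeps the sketch short).
grid = [(a, b) for a in (0.01, 0.1, 1.0) for b in (1.0, 10.0, 100.0)]
alpha, beta = max(grid, key=lambda ab: log_evidence(*ab))

A = alpha * np.eye(2) + beta * Phi.T @ Phi
mean_w = beta * np.linalg.solve(A, Phi.T @ W)

def predict(s):
    """Predictive mean and variance of the policy parameter for task s."""
    phi = np.array([1.0, s])
    mu = phi @ mean_w
    var = 1.0 / beta + phi @ np.linalg.solve(A, phi)
    return mu, var

def improve(s, reward, iters=50):
    """Policy search seeded at the predicted parameter; the exploration
    noise is scaled by the predictive standard deviation, so confident
    predictions are perturbed less than uncertain ones."""
    w, var = predict(s)
    sigma = np.sqrt(var)
    best_w, best_r = w, reward(w)
    for _ in range(iters):
        cand = best_w + sigma * rng.standard_normal()
        r = reward(cand)
        if r > best_r:
            best_w, best_r = cand, r
    return best_w

# Toy reward with its peak at the (unknown) true parameter for an
# unseen, extrapolated task parameter s = 3.
true_w = 6.0
reward = lambda w: -(w - true_w) ** 2
w_star = improve(3.0, reward)
```

Because exploration starts from the model's prediction and its magnitude shrinks with the model's confidence, the search wastes fewer rollouts than isotropic exploration would, which is the intuition behind the reported speed-up over CMA.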
Citation
Hazara, M & Kyrki, V 2018, Speeding Up Incremental Learning Using Data Efficient Guided Exploration. in Proceedings of the 2018 IEEE International Conference on Robotics and Automation, ICRA 2018. IEEE International Conference on Robotics and Automation, IEEE, pp. 5082-5089, IEEE International Conference on Robotics and Automation, Brisbane, Australia, 21/05/2018. https://doi.org/10.1109/ICRA.2018.8461241