Speeding Up Incremental Learning Using Data Efficient Guided Exploration

Access rights

openAccess
acceptedVersion

A4 Article in a conference publication

Language

en

Pages

8

Series

Proceedings of the 2018 IEEE International Conference on Robotics and Automation, ICRA 2018, pp. 5082-5089, IEEE International Conference on Robotics and Automation

Abstract

To cope with varying conditions, motor primitives (MPs) must support generalization over task parameters to avoid learning a separate primitive for each situation. In this regard, deterministic and probabilistic models have been proposed for generalizing MPs to new task parameters, but these provide limited generalization. Although generalization of MPs using probabilistic models has been studied, it is not clear how such generalizable models can be learned efficiently. Reinforcement learning can be made more data efficient by tuning the exploration process with the uncertainty of the data, thus reducing unnecessary exploration. We propose an empirical Bayes method for predicting this uncertainty and use it to guide the exploration process of an incremental learning framework. The online incremental learning framework uses a single human demonstration to construct a database of MPs. The main ingredients of the proposed framework are a global parametric model (GPDMP) for generalizing MPs to new situations, a model-free policy search agent for optimizing predicted MPs that fail, model selection for controlling the complexity of the GPDMP, and empirical Bayes for extracting the uncertainty of MP predictions. Experiments with a ball-in-a-cup task demonstrate that the global GPDMP model generalizes significantly better than linear models and Locally Weighted Regression, especially in terms of extrapolation capability. Furthermore, model selection successfully identified the required complexity of the GPDMP even with few training samples while satisfying the principle of Occam's razor. Above all, the uncertainty predicted by the proposed empirical Bayes approach successfully guided the exploration process of the model-free policy search. The experiments indicated a statistically significant improvement in learning speed over covariance matrix adaptation (CMA) with a significance of p = 0.002.
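The paper's framework is not spelled out on this page, but its core idea — scaling the exploration noise of a policy search by a predicted Bayesian uncertainty, so that well-covered task parameters get little exploration and novel ones get more — can be sketched as follows. This is an illustrative stand-in only: the Bayesian linear model, the elite-averaging (cross-entropy-style) search, and all names below are assumptions, not the paper's GPDMP or empirical Bayes implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def predictive_std(x_train, x_query, noise_var=1e-2):
    # Predictive std of a simple Bayesian linear model (empirical-Bayes
    # flavour): uncertainty grows as the query task parameter moves away
    # from the demonstrated/training task parameters.
    Phi = np.column_stack([np.ones(len(x_train)), x_train])  # bias + linear feature
    phi_q = np.array([1.0, x_query])
    A = Phi.T @ Phi / noise_var + np.eye(2)                  # posterior precision, unit prior
    return float(np.sqrt(noise_var + phi_q @ np.linalg.solve(A, phi_q)))

def guided_search(theta0, reward, sigma, iters=40, pop=40):
    # Elite-averaging (cross-entropy-style) model-free policy search whose
    # initial exploration magnitude is the predicted uncertainty `sigma`.
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(iters):
        eps = rng.normal(0.0, sigma, size=(pop, theta.size))  # exploration noise
        rewards = np.array([reward(theta + e) for e in eps])
        elite = eps[np.argsort(rewards)[-pop // 4:]]          # top-25% perturbations
        theta += elite.mean(axis=0)                           # move toward elite mean
        sigma *= 0.9                                          # anneal exploration
    return theta
```

Under this sketch, a query far from the training task parameters yields a larger `predictive_std`, so `guided_search` starts with wider exploration there, while a well-covered query triggers only small corrections — the data-efficiency mechanism the abstract describes.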

Description

Other note

Citation

Hazara, M & Kyrki, V 2018, Speeding Up Incremental Learning Using Data Efficient Guided Exploration. in Proceedings of the 2018 IEEE International Conference on Robotics and Automation, ICRA 2018. IEEE International Conference on Robotics and Automation, IEEE, pp. 5082-5089, IEEE International Conference on Robotics and Automation, Brisbane, Australia, 21/05/2018. https://doi.org/10.1109/ICRA.2018.8461241