Learning Transparent Reward Models via Unsupervised Feature Selection

Access rights

openAccess
CC BY
publishedVersion

A4 Article in conference proceedings

Authors

Baimukashev, Daulet
Alcan, Gökhan
Luck, Kevin Sebastian
Kyrki, Ville

Language

en

Pages

14

Series

Proceedings of Machine Learning Research, Volume 270

Abstract

In complex real-world tasks such as robotic manipulation and autonomous driving, collecting expert demonstrations is often more straightforward than specifying precise learning objectives and task descriptions. Learning from expert data can be achieved through behavioral cloning or by learning a reward function, i.e., inverse reinforcement learning. The latter allows for training with additional data outside the training distribution, guided by the inferred reward function. We propose a novel approach to construct compact and transparent reward models from automatically selected state features. These inferred rewards have an explicit form and enable the learning of policies that closely match expert behavior by training standard reinforcement learning algorithms from scratch. We validate our method's performance in various robotic environments with continuous and high-dimensional state spaces. Webpage: https://sites.google.com/view/transparent-reward

Citation

Baimukashev, D, Alcan, G, Luck, K S & Kyrki, V 2025, 'Learning Transparent Reward Models via Unsupervised Feature Selection', Proceedings of Machine Learning Research, vol. 270. <https://proceedings.mlr.press/v270/baimukashev25a.html>