Learning Task-Agnostic Action Spaces for Movement Optimization
dc.contributor | Aalto-yliopisto | fi |
dc.contributor | Aalto University | en |
dc.contributor.author | Babadi, Amin | en_US |
dc.contributor.author | Van de Panne, Michiel | en_US |
dc.contributor.author | Liu, C. Karen | en_US |
dc.contributor.author | Hämäläinen, Perttu | en_US |
dc.contributor.department | Department of Computer Science | en |
dc.contributor.department | Department of Media | en |
dc.contributor.groupauthor | Professorship Hämäläinen Perttu | en |
dc.contributor.groupauthor | Computer Science Professors | en |
dc.contributor.groupauthor | Computer Science - Visual Computing (VisualComputing) | en |
dc.contributor.organization | University of British Columbia | en_US |
dc.contributor.organization | Stanford University | en_US |
dc.date.accessioned | 2022-12-14T10:16:40Z | |
dc.date.available | 2022-12-14T10:16:40Z | |
dc.date.issued | 2022-12 | en_US |
dc.description | Publisher Copyright: CC BY | |
dc.description.abstract | We propose a novel method for exploring the dynamics of physically based animated characters, and for learning a task-agnostic action space that makes movement optimization easier. As in several prior works, we parameterize actions as target states and learn a short-horizon, goal-conditioned low-level control policy that drives the agent's state towards the targets. Our novel contribution is that our exploration data allows us to learn the low-level policy in a generic manner, without any reference movement data. Trained once per agent or simulation environment, the policy improves the efficiency of optimizing both trajectories and high-level policies across multiple tasks and optimization algorithms. We also contribute novel visualizations showing that using target states as actions makes optimized trajectories more robust to disturbances; this manifests as wider optima that are easier to find. Due to its simplicity and generality, our proposed approach should provide a building block that can improve a large variety of movement optimization methods and applications. | en |
dc.description.version | Peer reviewed | en |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.citation | Babadi, A, Van de Panne, M, Liu, C & Hämäläinen, P 2022, 'Learning Task-Agnostic Action Spaces for Movement Optimization', IEEE Transactions on Visualization and Computer Graphics, vol. 28, no. 12, pp. 4700-4712. https://doi.org/10.1109/TVCG.2021.3100095 | en |
dc.identifier.doi | 10.1109/TVCG.2021.3100095 | en_US |
dc.identifier.issn | 1077-2626 | |
dc.identifier.other | PURE UUID: 5e581a1e-4ab5-48af-8f74-ec3fce85cdda | en_US |
dc.identifier.other | PURE ITEMURL: https://research.aalto.fi/en/publications/5e581a1e-4ab5-48af-8f74-ec3fce85cdda | en_US |
dc.identifier.other | PURE LINK: http://www.scopus.com/inward/record.url?scp=85112599716&partnerID=8YFLogxK | en_US |
dc.identifier.other | PURE FILEURL: https://research.aalto.fi/files/94735096/Learning_Task_Agnostic_Action_Spaces_for_Movement_Optimization.pdf | en_US |
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/118150 | |
dc.identifier.urn | URN:NBN:fi:aalto-202212146890 | |
dc.language.iso | en | en |
dc.publisher | IEEE Computer Society | |
dc.relation.ispartofseries | IEEE Transactions on Visualization and Computer Graphics | en |
dc.rights | openAccess | en |
dc.subject.keyword | action space | en_US |
dc.subject.keyword | Aerospace electronics | en_US |
dc.subject.keyword | hierarchical reinforcement learning | en_US |
dc.subject.keyword | movement optimization | en_US |
dc.subject.keyword | Optimization | en_US |
dc.subject.keyword | policy optimization | en_US |
dc.subject.keyword | Reinforcement learning | en_US |
dc.subject.keyword | Splines (mathematics) | en_US |
dc.subject.keyword | Task analysis | en_US |
dc.subject.keyword | Training | en_US |
dc.subject.keyword | trajectory optimization | en_US |
dc.title | Learning Task-Agnostic Action Spaces for Movement Optimization | en |
dc.type | A1 Original article in a scientific journal | en |
dc.type.version | publishedVersion |