Guided policy search for a lightweight industrial robot arm

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Lundell, Jens
dc.contributor.author White, Jack
dc.date.accessioned 2018-12-21T16:00:17Z
dc.date.available 2018-12-21T16:00:17Z
dc.date.issued 2018-12-17
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/35731
dc.description.abstract General autonomy is at the forefront of robotic research and practice. Earlier research has enabled robots to learn movement and manipulation within the context of a specific instance of a task and to learn from large quantities of empirical data and known dynamics. Reinforcement learning (RL) tackles generalisation, whereby a robot may be relied upon to perform its task with acceptable speed and fidelity in multiple, even arbitrary, task configurations. Recent research has advanced approximate policy search methods of RL, in which a function approximator is used to represent an optimal policy while avoiding calculation across the high-dimensional state and action spaces of real robots. This thesis details the implementation and testing, on a lightweight industrial robot arm, of guided policy search (GPS), an RL algorithm that seeks to avoid the typical need in machine learning for large numbers of empirical behavioural samples, while maximising learning speed. GPS comprises a local optimal policy generator, here based on a linear-quadratic regulator, and an approximate general policy representation, here a feedforward neural network. A controller is written to interface an existing back-end implementation of GPS with the robot itself. Experimental results show that the GPS agent is able to perform basic reaching tasks across its configuration space with approximately 15 minutes of training, but that the local policies generated are not fully optimised within that timescale and that post-training operation suffers from oscillatory actions under perturbed initial joint positions. Further work is discussed and recommended for better training of GPS agents and for making locally optimal policies more robust to disturbance during operation. en
dc.format.extent 58+6
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.title Guided policy search for a lightweight industrial robot arm en
dc.type G2 Pro gradu, diplomityö fi
dc.contributor.school Sähkötekniikan korkeakoulu fi
dc.subject.keyword guided policy search en
dc.subject.keyword reinforcement learning en
dc.subject.keyword deep learning en
dc.subject.keyword robotics en
dc.subject.keyword artificial intelligence en
dc.subject.keyword policy search en
dc.identifier.urn URN:NBN:fi:aalto-201812216740
dc.programme.major Space Robotics and Automation 2017-2018 fi
dc.programme.mcode ELEC3047 fi
dc.type.ontasot Master's thesis en
dc.type.ontasot Diplomityö fi
dc.contributor.supervisor Kyrki, Ville
dc.programme Erasmus Mundus Space Master fi
dc.location P1 fi
local.aalto.electroniconly yes
local.aalto.openaccess yes

