
Real-time Action Recognition for RGB-D and Motion Capture Data


dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Koskela, Markus, Dr., University of Helsinki, Department of Computer Science, Finland
dc.contributor.author Chen, Xi
dc.date.accessioned 2014-12-18T10:00:21Z
dc.date.available 2014-12-18T10:00:21Z
dc.date.issued 2014
dc.identifier.isbn 978-952-60-6014-9 (electronic)
dc.identifier.isbn 978-952-60-6013-2 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/14715
dc.description.abstract In daily life, humans perform a great number of actions continuously. We recognize and interpret these actions unconsciously while interacting and communicating with people and the environment. If machines and computers could recognize human gestures as effectively as human beings do, a new world would unfold, filled with applications that facilitate our daily life. These significant benefits for society have motivated research on machine-based gesture recognition, which has already shown some initial advantages in many applications. For example, gestures can be used as commands to control robots or computer programs instead of standard input devices such as touch screens or mice. This thesis proposes a framework for gesture recognition systems based on motion capture and RGB-D data. Motion capture data consists of the positions and orientations of the key joints of the human skeleton. RGB-D data contains the RGB image and depth data, from which a skeletal model can be learnt. This skeletal model can be seen as a noisy approximation of the more accurate motion capture skeleton model. The modular design of our framework enables convenient recognition using multiple data modalities. The first part of the thesis reviews methods used in existing recognition systems in the literature and briefly introduces the proposed real-time recognition system for both whole-body gestures and hand gestures. The second part is a collection of eight publications by the author, which describe the proposed recognition system in detail. In general, the framework can be divided into two parts, feature extraction and classification, both of which have a significant influence on recognition performance. Multiple features are developed and extracted from the skeletons, images, and depth data for each frame in the motion sequence. These features are combined in the early fusion stage and classified by a single-hidden-layer neural network, the extreme learning machine. The frame-level classification outputs are then aggregated at the sequence level to obtain the final classification result. The methodologies used in the gesture recognition system are also applied in a proposed image retrieval system. Several image features are extracted and search algorithms are applied to achieve fast and accurate retrieval. Furthermore, a method is proposed to align different motion sequences and to evaluate the alignment. The method can be used for gesture retrieval and for evaluating skeleton generation algorithms. en
dc.format.extent 104 + app. 87
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 207/2014
dc.relation.haspart [Publication 1]: Xi Chen and Markus Koskela and Jouko Hyvakka. Image Based Information Access for Mobile Phones. In Proceedings of 8th International Workshop on Content-Based Multimedia Indexing (CBMI2010), pages 1-5, Grenoble, France, June 2010.
dc.relation.haspart [Publication 2]: Xi Chen and Markus Koskela. Mobile Visual Search from Dynamic Image Databases. In Proceedings of 17th Scandinavian Conference on Image Analysis (SCIA 2011), pages 196-205, Ystad, Sweden, May 2011.
dc.relation.haspart [Publication 3]: Xi Chen and Markus Koskela. Classification of RGB-D and Motion Capture Sequences Using Extreme Learning Machine. In Proceedings of 18th Scandinavian Conference on Image Analysis (SCIA 2013), pages 640-651, Espoo, Finland, June 2013.
dc.relation.haspart [Publication 4]: Xi Chen and Markus Koskela. Skeleton-Based Action Recognition with Extreme Learning Machines. Neurocomputing, Volume 149, Part A, Pages 387-396, February 2015.
dc.relation.haspart [Publication 5]: Xi Chen and Markus Koskela. Sequence Alignment for RGB-D and Motion Capture Skeletons. In Proceedings of the International Conference on Image Analysis and Recognition (ICIAR 2013), pages 630-639, Povoa de Varzim, Portugal, June 2013.
dc.relation.haspart [Publication 6]: Kyunghyun Cho and Xi Chen. Classifying and Visualizing Motion Capture Sequences using Deep Neural Networks. In Proceedings of the 9th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, pages 122-130, Lisbon, Portugal, January 2014.
dc.relation.haspart [Publication 7]: Xi Chen and Markus Koskela. Online RGB-D Gesture Recognition with Extreme Learning Machines. In Proceedings of the 15th ACM International Conference on Multimodal Interaction (ICMI 2013), pages 467-474, Sydney, Australia, December 2013.
dc.relation.haspart [Publication 8]: Xi Chen and Markus Koskela. Using Appearance-Based Hand Features For Dynamic RGB-D Gesture Recognition. In Proceedings of the 22nd International Conference on Pattern Recognition (ICPR14), pages 411-416, Stockholm, Sweden, August 2014.
dc.subject.other Computer science en
dc.title Real-time Action Recognition for RGB-D and Motion Capture Data en
dc.type G5 Artikkeliväitöskirja fi
dc.contributor.school Perustieteiden korkeakoulu fi
dc.contributor.school School of Science en
dc.contributor.department Tietojenkäsittelytieteen laitos fi
dc.contributor.department Department of Information and Computer Science en
dc.subject.keyword action recognition en
dc.subject.keyword gesture recognition en
dc.subject.keyword RGB-D en
dc.subject.keyword motion capture en
dc.subject.keyword extreme learning machine en
dc.subject.keyword computer vision en
dc.subject.keyword machine learning en
dc.subject.keyword image retrieval en
dc.identifier.urn URN:ISBN:978-952-60-6014-9
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (article-based) en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.contributor.supervisor Oja, Erkki, Aalto Distinguished Prof., Aalto University, Department of Information and Computer Science, Finland
dc.opn Laptev, Ivan, Dr., INRIA research director, France
dc.rev Zhao, Guoying, Prof., University of Oulu, Finland
dc.rev Athitsos, Vassilis, Prof., University of Texas at Arlington, USA
dc.date.defence 2015-01-16
local.aalto.digifolder Aalto_64245
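
The pipeline summarised in the abstract (per-frame features, early fusion, extreme learning machine classification, sequence-level aggregation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: early fusion is assumed to be plain concatenation of per-frame feature vectors, the sequence decision is assumed to be the arg-max of the averaged frame outputs, and the class `ELM`, the helpers `early_fusion`, `classify_sequence`, and `make_frames`, the feature dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np


class ELM:
    """Single-hidden-layer network: random fixed hidden weights,
    output weights solved in closed form (hypothetical sketch)."""

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y, n_classes):
        # Hidden weights are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]  # one-hot targets, one row per frame
        # Ridge-regularised least squares for the output weights.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def frame_scores(self, X):
        # One row of class scores per frame.
        return self._hidden(X) @ self.beta


def early_fusion(*per_frame_features):
    # Assumption: fusion = concatenating feature vectors frame by frame.
    return np.concatenate(per_frame_features, axis=1)


def classify_sequence(model, frames):
    # Assumption: average the frame-level outputs, then take the arg-max.
    return int(np.argmax(model.frame_scores(frames).mean(axis=0)))


def make_frames(rng, mean, n):
    # Synthetic stand-ins for skeleton (6-D) and depth (4-D) features.
    skeleton = rng.normal(mean, 0.5, size=(n, 6))
    depth = rng.normal(mean, 0.5, size=(n, 4))
    return early_fusion(skeleton, depth)


data_rng = np.random.default_rng(1)
X = np.vstack([make_frames(data_rng, -1.0, 200),
               make_frames(data_rng, 1.0, 200)])
y = np.repeat([0, 1], 200)
model = ELM().fit(X, y, n_classes=2)

# Classify two unseen 30-frame sequences, one per toy gesture class.
pred0 = classify_sequence(model, make_frames(data_rng, -1.0, 30))
pred1 = classify_sequence(model, make_frames(data_rng, 1.0, 30))
```

Because only the output weights are fitted, training reduces to one linear solve, which is the property that makes this family of classifiers attractive for real-time use.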
