Deep Contextual Attention for Human-Object Interaction Detection
Access rights
openAccess
A4 Article in conference proceedings
This publication is imported from the Aalto University research portal.
Date
2020-02
Language
en
Pages
5694-5702
Series
Proceedings of the International Conference on Computer Vision (ICCV2019), Proceedings of the IEEE International Conference on Computer Vision, Volume 2019-October
Abstract
This work proposes to combine neural networks with the compositional hierarchy of human bodies for efficient and complete human parsing. We formulate the approach as a neural information fusion framework. Our model assembles the information from three inference processes over the hierarchy: direct inference (directly predicting each part of a human body using image information), bottom-up inference (assembling knowledge from constituent parts), and top-down inference (leveraging context from parent nodes). The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively. In addition, the fusion of multi-source information is conditioned on the inputs, i.e., by estimating and considering the confidence of the sources. The whole model is end-to-end differentiable, explicitly modeling information flows and structures. Our approach is extensively evaluated on four popular datasets, outperforming the state of the art in all cases, with a fast processing speed of 23 fps. Our code and results have been released to help ease future research in this direction.
Citation
Wang, T, Anwer, R M, Khan, M H, Khan, F S, Pang, Y, Shao, L & Laaksonen, J 2020, Deep Contextual Attention for Human-Object Interaction Detection. in Proceedings of the International Conference on Computer Vision (ICCV2019), 9008846, Proceedings of the IEEE International Conference on Computer Vision, vol. 2019-October, IEEE, pp. 5693-5701, IEEE International Conference on Computer Vision, Seoul, Korea, Republic of, 27/10/2019. https://doi.org/10.1109/ICCV.2019.00579