Combining Textual and Visual Modeling for Predicting Media Memorability
Access rights
openAccess
publishedVersion
A3 Part of a book or other compilation work
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
Date
2019-10-27
Language
en
Series
CEUR Workshop Proceedings ; Volume 2670
Abstract
This paper describes a multimodal approach proposed by the MeMAD team for the MediaEval 2019 “Predicting Media Memorability” task. Our best approach is a weighted average method combining predictions made separately from visual and textual representations of videos. In particular, we augmented the provided textual descriptions with automatically generated deep captions. For long-term memorability, we obtained better scores by using the short-term predictions rather than the long-term ones. Our best model achieves Spearman scores of 0.522 and 0.277 for the short-term and long-term prediction tasks, respectively.
Description
openaire: EC/H2020/780069/EU//MeMAD
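The weighted average described in the abstract, and the Spearman score used to evaluate it, can be illustrated with a minimal sketch. This is not the MeMAD team's implementation: the function names, the single `visual_weight` parameter, and the use of plain Python lists are assumptions made for illustration, and the record does not state how the weights were actually chosen.

```python
from statistics import mean


def weighted_average(visual_scores, textual_scores, visual_weight=0.5):
    """Blend per-video predictions from two modalities.

    visual_weight is a hypothetical parameter; the textual model
    receives the remaining weight (1 - visual_weight).
    """
    if len(visual_scores) != len(textual_scores):
        raise ValueError("both models must score the same videos")
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")
    return [
        visual_weight * v + (1.0 - visual_weight) * t
        for v, t in zip(visual_scores, textual_scores)
    ]


def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(predicted, observed):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rp, ro = average_ranks(predicted), average_ranks(observed)
    mp, mo = mean(rp), mean(ro)
    cov = sum((a - mp) * (b - mo) for a, b in zip(rp, ro))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_o = sum((b - mo) ** 2 for b in ro)
    return cov / (var_p * var_o) ** 0.5
```

Under these assumptions, one plausible way to pick `visual_weight` is to try several values on held-out videos and keep the one with the highest `spearman` score against the ground-truth memorability values.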
Citation
Reboud, A, Harrando, I, Laaksonen, J, Francis, D, Troncy, R & Laria Mantecon, H 2019, Combining Textual and Visual Modeling for Predicting Media Memorability. in Working Notes Proceedings of the MediaEval 2019 Workshop, Sophia Antipolis, France, 27-30 October 2019. CEUR Workshop Proceedings, vol. 2670, CEUR, Multimedia Benchmark Workshop, Sophia Antipolis, France, 27/10/2019. <http://ceur-ws.org/Vol-2670/MediaEval_19_paper_26.pdf>