Combining Textual and Visual Modeling for Predicting Media Memorability

Access rights

openAccess
publishedVersion

A3 Part of a book or other compilation

Date

2019-10-27

Language

en

Series

CEUR Workshop Proceedings ; Volume 2670

Abstract

This paper describes a multimodal approach proposed by the MeMAD team for the MediaEval 2019 “Predicting Media Memorability” task. Our best approach is a weighted average that combines predictions made separately from visual and textual representations of videos. In particular, we augmented the provided textual descriptions with automatically generated deep captions. For long-term memorability, we obtained better scores using the short-term predictions than the long-term ones. Our best model achieves Spearman scores of 0.522 and 0.277 for the short-term and long-term prediction tasks, respectively.
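
The weighted-average fusion and the Spearman evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scores, and the grid search over the weight are all assumptions made for illustration, and the abstract does not say how the weight was actually chosen.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant input; report 0 here
    return cov / (var_x * var_y) ** 0.5


def weighted_average(visual, textual, w):
    """Fuse two per-video score lists; w is the weight on the visual scores."""
    return [w * v + (1 - w) * t for v, t in zip(visual, textual)]


def best_weight(visual, textual, target, steps=10):
    """Hypothetical grid search for the weight maximising Spearman on held-out data."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(
        candidates,
        key=lambda w: spearman(weighted_average(visual, textual, w), target),
    )


# Toy, made-up scores for five videos (not data from the paper).
visual_scores = [0.91, 0.72, 0.85, 0.60, 0.78]
textual_scores = [0.88, 0.80, 0.79, 0.65, 0.70]
ground_truth = [0.95, 0.75, 0.90, 0.55, 0.80]

w = best_weight(visual_scores, textual_scores, ground_truth)
fused = weighted_average(visual_scores, textual_scores, w)
print(f"weight={w:.1f}, Spearman={spearman(fused, ground_truth):.3f}")
```

Because Spearman correlation depends only on the ordering of the scores, the fusion weight matters only insofar as it changes the relative ranking of the videos.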

Description

openaire: EC/H2020/780069/EU//MeMAD

Citation

Reboud, A, Harrando, I, Laaksonen, J, Francis, D, Troncy, R & Laria Mantecon, H 2019, Combining Textual and Visual Modeling for Predicting Media Memorability. in Working Notes Proceedings of the MediaEval 2019 Workshop, Sophia Antipolis, France, 27-30 October 2019. CEUR Workshop Proceedings, vol. 2670, CEUR, Multimedia Benchmark Workshop, Sophia Antipolis, France, 27/10/2019. <http://ceur-ws.org/Vol-2670/MediaEval_19_paper_26.pdf>