Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Wang, Tzu-Jui Julius [en_US]
dc.contributor.author: Pehlivan, Selen [en_US]
dc.contributor.author: Laaksonen, Jorma [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.groupauthor: Professorship Kaski Samuel [en]
dc.contributor.groupauthor: Lecturer Laaksonen Jorma group [en]
dc.contributor.groupauthor: Computer Science Lecturers [en]
dc.contributor.groupauthor: Computer Science - Visual Computing (VisualComputing) - Research area [en]
dc.contributor.groupauthor: Computer Science - Human-Computer Interaction and Design (HCID) - Research area [en]
dc.contributor.groupauthor: Computer Science - Artificial Intelligence and Machine Learning (AIML) - Research area [en]
dc.date.accessioned: 2023-06-30T09:51:57Z
dc.date.available: 2023-06-30T09:51:57Z
dc.date.issued: 2020 [en_US]
dc.description: | openaire: EC/H2020/780069/EU//MeMAD
dc.description.abstract: Predicting a scene graph that captures visual entities and their interactions in an image has been considered a crucial step towards full scene comprehension. Recent scene graph generation (SGG) models have shown their capability of capturing the most frequent relations among visual entities. However, the state-of-the-art results are still far from satisfactory, e.g. models can obtain 31% in overall recall R@100, whereas the likewise important mean class-wise recall mR@100 is only around 8% on Visual Genome (VG). The discrepancy between R and mR results urges to shift the focus from pursuing a high R to a high mR with a still competitive R. We suspect that the observed discrepancy stems from both the annotation bias and sparse annotations in VG, in which many visual entity pairs are either not annotated at all or only with a single relation when multiple ones could be valid. To address this particular issue, we propose a novel SGG training scheme that capitalizes on self-learned knowledge. It involves two relation classifiers, one offering a less biased setting for the other to base on. The proposed scheme can be applied to most of the existing SGG models and is straightforward to implement. We observe significant relative improvements in mR (between +6.6% and +20.4%) and competitive or better R (between -2.4% and 0.3%) across all standard SGG tasks. [en]
dc.format.extent: 13
dc.identifier.citation: Wang, T-J J, Pehlivan, S & Laaksonen, J 2020, Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models. in Proceedings of the British Machine Vision Conference (BMVC). British Machine Vision Association, British Machine Vision Conference, Virtual, Online, United Kingdom, 07/09/2020. < https://www.bmvc2020-conference.com/conference/papers/paper_0541.html > [en]
dc.identifier.other: PURE UUID: 5370c65a-1d2e-4ce8-b5e3-4c5b20290b81 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/5370c65a-1d2e-4ce8-b5e3-4c5b20290b81 [en_US]
dc.identifier.other: PURE LINK: https://www.bmvc2020-conference.com/conference/papers/paper_0541.html [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/121987
dc.identifier.urn: URN:NBN:fi:aalto-202306304355
dc.language.iso: en [en]
dc.relation: info:eu-repo/grantAgreement/EC/H2020/780069/EU//MeMAD [en_US]
dc.relation.ispartof: British Machine Vision Conference [en]
dc.relation.ispartofseries: Proceedings of the British Machine Vision Conference (BMVC) [en]
dc.rights: openAccess [en]
dc.title: Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models [en]
dc.type: A4 Artikkeli konferenssijulkaisussa [fi]
