Temporal teacher with masked transformers for semi-supervised action proposal generation

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Pehlivan, Selen [en_US]
dc.contributor.author: Laaksonen, Jorma [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.groupauthor: Lecturer Laaksonen Jorma group [en]
dc.contributor.groupauthor: Computer Science Lecturers [en]
dc.contributor.groupauthor: Computer Science - Visual Computing (VisualComputing) [en]
dc.contributor.groupauthor: Computer Science - Human-Computer Interaction and Design (HCID) [en]
dc.contributor.groupauthor: Computer Science - Artificial Intelligence and Machine Learning (AIML) [en]
dc.date.accessioned: 2024-03-27T08:00:08Z
dc.date.available: 2024-03-27T08:00:08Z
dc.date.issued: 2024-03-15 [en_US]
dc.description: Publisher Copyright: © The Author(s) 2024.
dc.description.abstract: By conditioning on unit-level predictions, anchor-free models for action proposal generation have displayed impressive capabilities, such as having a lightweight architecture. However, task performance depends significantly on the quality of data used in training, and most effective models have relied on human-annotated data. Semi-supervised learning, i.e., jointly training deep neural networks with a labeled dataset as well as an unlabeled dataset, has made significant progress recently. Existing works have either primarily focused on classification tasks, which may require less annotation effort, or considered anchor-based detection models. Inspired by recent advances in semi-supervised methods on anchor-free object detectors, we propose a teacher-student framework for a two-stage action detection pipeline, named Temporal Teacher with Masked Transformers (TTMT), to generate high-quality action proposals based on an anchor-free transformer model. Leveraging consistency learning as one self-training technique, the model jointly trains an anchor-free student model and a gradually progressing teacher counterpart in a mutually beneficial manner. As the core model, we design a Transformer-based anchor-free model to improve effectiveness for temporal evaluation. We integrate bi-directional masks and devise encoder-only Masked Transformers for sequences. Jointly training on boundary locations and various local snippet-based features, our model predicts via the proposed scoring function for generating proposal candidates. Experiments on the THUMOS14 and ActivityNet-1.3 benchmarks demonstrate the effectiveness of our model for temporal proposal generation task. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 15
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Pehlivan, S & Laaksonen, J 2024, 'Temporal teacher with masked transformers for semi-supervised action proposal generation', MACHINE VISION AND APPLICATIONS, vol. 35, no. 3, 36, pp. 1-15. https://doi.org/10.1007/s00138-024-01521-7 [en]
dc.identifier.doi: 10.1007/s00138-024-01521-7 [en_US]
dc.identifier.issn: 0932-8092
dc.identifier.issn: 1432-1769
dc.identifier.other: PURE UUID: b36c4a7b-1617-43c0-a746-b38632c51824 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/b36c4a7b-1617-43c0-a746-b38632c51824 [en_US]
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85187783451&partnerID=8YFLogxK [en_US]
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/142128602/Temporal_teacher_with_masked_transformers_for_semi-supervised_action_proposal_generation.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/127297
dc.identifier.urn: URN:NBN:fi:aalto-202403272930
dc.language.iso: en [en]
dc.publisher: Springer
dc.relation.ispartofseries: MACHINE VISION AND APPLICATIONS
dc.relation.ispartofseries: Volume 35, issue 3, pp. 1-15
dc.rights: openAccess [en]
dc.subject.keyword: Anchor-free model [en_US]
dc.subject.keyword: Semi-supervised learning [en_US]
dc.subject.keyword: Temporal proposal generation [en_US]
dc.subject.keyword: Transformer network [en_US]
dc.title: Temporal teacher with masked transformers for semi-supervised action proposal generation [en]
dc.type: A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä [fi]
dc.type.version: publishedVersion

Files