Anchor-Free Action Proposal Network with Uncertainty Estimation

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Pehlivan, Selen [en_US]
dc.contributor.author: Laaksonen, Jorma [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.groupauthor: Lecturer Laaksonen Jorma group [en]
dc.contributor.groupauthor: Computer Science Lecturers [en]
dc.contributor.groupauthor: Computer Science - Visual Computing (VisualComputing) - Research area [en]
dc.contributor.groupauthor: Computer Science - Human-Computer Interaction and Design (HCID) - Research area [en]
dc.contributor.groupauthor: Computer Science - Artificial Intelligence and Machine Learning (AIML) - Research area [en]
dc.date.accessioned: 2024-01-04T08:47:35Z
dc.date.available: 2024-01-04T08:47:35Z
dc.date.issued: 2023 [en_US]
dc.description: Funding Information: This work has been funded by the Academy of Finland (project numbers 329268 and 345791). The computational resources have been provided by Aalto University's Aalto Science-IT project and the CSC–IT Center for Science. Publisher Copyright: © 2023 IEEE.
dc.description.abstract: Proposal generation is a fundamental yet challenging task for two-stage temporal action detection pipelines. The task aims to predict the starting and ending boundaries of segments in realistic video sequences; action recognition methods cannot be applied directly to such videos because they are untrimmed. Most state-of-the-art models rely on temporal convolutional neural networks with pre-defined anchor segments. We eliminate anchors and propose a lighter, end-to-end trainable Anchor-Free Multiscale Transformer-based Generator (AMTG) model that uses local clues from video snippets. To improve effectiveness for temporal evaluation, we apply multiscale Transformer encoders to sequences with a bi-directional mask extension that simultaneously predicts boundary distances with uncertainties and various snippet-based local scores. The model then integrates the local predictions to generate proposal candidates using the proposed scoring function. Experiments on the THUMOS14 and ActivityNet-1.3 benchmarks demonstrate the effectiveness of AMTG for the temporal proposal generation task. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 6
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Pehlivan, S & Laaksonen, J 2023, Anchor-Free Action Proposal Network with Uncertainty Estimation. in Proceedings - 2023 IEEE International Conference on Multimedia and Expo, ICME 2023. Proceedings - IEEE International Conference on Multimedia and Expo, vol. 2023-July, IEEE, pp. 1853-1858, IEEE International Conference on Multimedia and Expo, Brisbane, Australia, 10/07/2023. https://doi.org/10.1109/ICME55011.2023.00318 [en]
dc.identifier.doi: 10.1109/ICME55011.2023.00318 [en_US]
dc.identifier.isbn: 978-1-6654-6891-6
dc.identifier.issn: 1945-7871
dc.identifier.issn: 1945-788X
dc.identifier.other: PURE UUID: 2fe8a99f-28e0-4013-9f92-d55e38491ff8 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/2fe8a99f-28e0-4013-9f92-d55e38491ff8 [en_US]
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85171171618&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/130890725/Anchor-Free_Action_Proposal_Network_with_Uncertainty_Estimation.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/125381
dc.identifier.urn: URN:NBN:fi:aalto-202401041070
dc.language.iso: en [en]
dc.relation.ispartof: IEEE International Conference on Multimedia and Expo [en]
dc.relation.ispartofseries: Proceedings - 2023 IEEE International Conference on Multimedia and Expo, ICME 2023 [en]
dc.relation.ispartofseries: pp. 1853-1858 [en]
dc.relation.ispartofseries: Proceedings - IEEE International Conference on Multimedia and Expo ; Volume 2023-July [en]
dc.rights: openAccess [en]
dc.subject.keyword: anchor-free [en_US]
dc.subject.keyword: multiscale transformer network [en_US]
dc.subject.keyword: temporal action proposals [en_US]
dc.subject.keyword: two-stage detectors [en_US]
dc.title: Anchor-Free Action Proposal Network with Uncertainty Estimation [en]
dc.type: A4 Artikkeli konferenssijulkaisussa (A4 Article in a conference publication) [fi]
dc.type.version: acceptedVersion
