Deep Learning Methods for Semantic Matching, Image Retrieval and Camera Relocalization
dc.contributor | Aalto-yliopisto | fi |
dc.contributor | Aalto University | en |
dc.contributor.author | Laskar, Zakaria | |
dc.contributor.department | Tietotekniikan laitos | fi |
dc.contributor.department | Department of Computer Science | en |
dc.contributor.school | Perustieteiden korkeakoulu | fi |
dc.contributor.school | School of Science | en |
dc.contributor.supervisor | Kannala, Juho, Prof., Aalto University, Department of Computer Science, Finland | |
dc.date.accessioned | 2020-11-13T10:01:33Z | |
dc.date.available | 2020-11-13T10:01:33Z | |
dc.date.defence | 2020-12-09 | |
dc.date.issued | 2020 | |
dc.description.abstract | Image matching is a central component in many computer vision applications. The field has progressed significantly with the advancement of deep learning models such as convolutional neural networks (CNNs). The thesis makes several contributions that advance the performance of existing CNN-based approaches in three closely related problem areas of image matching: semantic matching, image retrieval and image-based localization. First, the problem of data and ground-truth labelling efficiency for training CNN models is studied in the context of semantic matching. A weakly supervised method is presented to address the problem of learning from small training datasets. The method first generates additional training samples from existing data and then uses a novel loss function based on cyclic consistency to regularize the training process. Results show that the proposed method can learn from weakly labelled data without pixel-level correspondence information. In the next part of the thesis, we study the application of both global and local image matching to the problem of image retrieval. For particular landmark retrieval, the thesis studies the role of contextual information in the global query image representation, which existing approaches generally discard in order to remove noisy background information. An attention model is proposed that uses bottom-up saliency to modulate contextual information in intermediary CNN representations in a top-down manner. To address the challenges due to local variations in city-scale retrieval, the thesis proposes a geometric verification method using CNN-based image matching. In addition, it proposes a method for improving the accuracy and efficiency of the image matching method. Lastly, the thesis demonstrates methods that use key concepts from image matching and image retrieval to address problems in the field of image-based localization. In contrast to existing approaches, the proposed method can be applied to novel scenes not seen during training and scales favourably with the size of the environment. In addition, a challenging indoor localization dataset is made publicly available to address the limitations of existing datasets. | en |
dc.format.extent | 70 + app. 68 | |
dc.format.mimetype | application/pdf | en |
dc.identifier.isbn | 978-952-64-0146-1 (electronic) | |
dc.identifier.isbn | 978-952-64-0145-4 (printed) | |
dc.identifier.issn | 1799-4942 (electronic) | |
dc.identifier.issn | 1799-4934 (printed) | |
dc.identifier.issn | 1799-4934 (ISSN-L) | |
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/47626 | |
dc.identifier.urn | URN:ISBN:978-952-64-0146-1 | |
dc.language.iso | en | en |
dc.opn | Tolias, Giorgos, Prof., Czech Technical University, Czech Republic | |
dc.publisher | Aalto University | en |
dc.publisher | Aalto-yliopisto | fi |
dc.relation.haspart | [Publication 1]: Zakaria Laskar, and Juho Kannala. Semi-supervised Semantic Matching. European Conference on Computer Vision. Geometry Meets Deep Learning Workshop (ECCVW), pp. 444–455, 2018. DOI: 10.1007/978-3-030-11015-4_32 | |
dc.relation.haspart | [Publication 2]: Zakaria Laskar, Hamed Rezazadegan Tavakoli, and Juho Kannala. Semantic Matching by Weakly Supervised 2D Point Set Registration. Winter Conference on Applications of Computer Vision (WACV), pp. 1061–1069, December 2019. Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201907304541. DOI: 10.1109/WACV.2019.00118 | |
dc.relation.haspart | [Publication 3]: Zakaria Laskar, and Juho Kannala. Context Aware Query Image Representation for Particular Object Retrieval. Scandinavian Conference on Image Analysis (SCIA), pp. 88–99, January, 2017. DOI: 10.1007/978-3-319-59129-2_8 | |
dc.relation.haspart | [Publication 4]: Zakaria Laskar, Iaroslav Melekhov, Hamed Rezazadegan Tavakoli, Juha Ylioinas, and Juho Kannala. Geometric Image Correspondence Verification by Dense Pixel Matching. Winter Conference on Applications of Computer Vision (WACV), pp. 2510–2519, 2020 | |
dc.relation.haspart | [Publication 5]: Zakaria Laskar, Iaroslav Melekhov, Surya Kalia, and Juho Kannala. Camera Relocalization by Computing Pairwise Relative Poses Using Convolutional Neural Networks. International Conference on Computer Vision. Geometry Meets Deep Learning Workshop (ICCVW), pp. 929–938, 2017. DOI: 10.1109/ICCVW.2017.113 | |
dc.relation.ispartofseries | Aalto University publication series DOCTORAL DISSERTATIONS | en |
dc.relation.ispartofseries | 191/2020 | |
dc.rev | Tolias, Giorgos, Prof., Czech Technical University, Czech Republic | |
dc.rev | Arandjelović, Relja, Dr., Deepmind, United Kingdom | |
dc.subject.keyword | computer vision | en |
dc.subject.keyword | machine learning | en |
dc.subject.keyword | deep learning | en |
dc.subject.keyword | camera relocalization | en |
dc.subject.keyword | image retrieval | en |
dc.subject.keyword | image matching | en |
dc.subject.other | Computer science | en |
dc.title | Deep Learning Methods for Semantic Matching, Image Retrieval and Camera Relocalization | en |
dc.type | G5 Artikkeliväitöskirja | fi |
dc.type.dcmitype | text | en |
dc.type.ontasot | Doctoral dissertation (article-based) | en |
dc.type.ontasot | Väitöskirja (artikkeli) | fi |
local.aalto.acrisexportstatus | checked 2020-12-28_1909 | |
local.aalto.archive | yes | |
local.aalto.formfolder | 2020_11_13_klo_11_05 | |
local.aalto.infra | Science-IT | |
Files
Original bundle
- Name: isbn9789526401461.pdf
- Size: 13.18 MB
- Format: Adobe Portable Document Format