Deep Learning Methods for Semantic Matching, Image Retrieval and Camera Relocalization
School of Science | Doctoral thesis (article-based) | Defence date: 2020-12-09
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
70 + app. 68
Aalto University publication series DOCTORAL DISSERTATIONS, 191/2020
Abstract: Image matching is a central component in many computer vision applications. The field has progressed significantly with the advancement of deep learning models such as convolutional neural networks (CNNs). The thesis makes several contributions that advance the performance of existing CNN-based approaches in three closely related problem areas of image matching: semantic matching, image retrieval and image-based localization. First, the problem of data and ground-truth labelling efficiency for training CNN models is studied in the context of semantic matching. A weakly supervised method is presented to address the problem of learning from small training datasets. The method first generates additional training samples from existing data and then uses a novel loss function based on cyclic consistency to regularize the training process. Results show that the proposed method can learn from weakly labelled data without pixel-level correspondence information. Next, the thesis studies the application of both global and local image matching to the problem of image retrieval. For particular landmark retrieval, the thesis studies the role of contextual information in the global query image representation, which existing approaches generally discard in order to remove noisy background information. An attention model is proposed that uses bottom-up saliency to modulate contextual information in intermediary CNN representations in a top-down manner. To address the challenges posed by local variations in city-scale retrieval, the thesis proposes a geometric verification method using CNN-based image matching. In addition, it proposes a method for improving the accuracy and efficiency of the image matching method. Lastly, the thesis demonstrates methods that use the key concepts from image matching and image retrieval to address problems in the field of image-based localization. In contrast to existing approaches, the proposed method can be applied to novel scenes not seen during training and scales favourably with the size of the environment. In addition, a challenging indoor localization dataset is made publicly available to address the limitations of existing datasets.
Supervising professor: Kannala, Juho, Prof., Aalto University, Department of Computer Science, Finland
computer vision, machine learning, deep learning, camera relocalization, image retrieval, image matching
[Publication 1]: Zakaria Laskar, and Juho Kannala. Semi-supervised Semantic Matching. European Conference on Computer Vision. Geometry Meets Deep Learning Workshop (ECCVW), pp. 444–455, 2018.
DOI: 10.1007/978-3-030-11015-4_32
[Publication 2]: Zakaria Laskar, Hamed Rezazadegan Tavakoli, and Juho Kannala. Semantic Matching by Weakly Supervised 2D Point Set Registration. Winter Conference on Applications of Computer Vision (WACV), pp. 1061–1069, December 2019.
Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201907304541
DOI: 10.1109/WACV.2019.00118
[Publication 3]: Zakaria Laskar, and Juho Kannala. Context Aware Query Image Representation for Particular Object Retrieval. Scandinavian Conference on Image Analysis (SCIA), pp. 88–99, January, 2017.
DOI: 10.1007/978-3-319-59129-2_8
[Publication 4]: Zakaria Laskar, Iaroslav Melekhov, Hamed Rezazadegan Tavakoli, Juha Ylioinas, and Juho Kannala. Geometric Image Correspondence Verification by Dense Pixel Matching. Winter Conference on Applications of Computer Vision (WACV), pp. 2510–2519, 2020.
[Publication 5]: Zakaria Laskar, Iaroslav Melekhov, Surya Kalia, and Juho Kannala. Camera Relocalization by Computing Pairwise Relative Poses Using Convolutional Neural Networks. International Conference on Computer Vision. Geometry Meets Deep Learning Workshop (ICCVW), pp. 929–938, 2017.
DOI: 10.1109/ICCVW.2017.113