A Visual Indoor Localisation System
School of Science | Master's thesis
Machine Learning, Data Science and Artificial Intelligence
Master’s Programme in Computer, Communication and Information Sciences
Abstract
Visual indoor localisation, an application of computer vision, aims to recover the precise 6-degree-of-freedom (DoF) pose of a camera from a given image. It has become increasingly important in indoor applications that require high accuracy, such as vision-based navigation, augmented reality (AR), and virtual reality (VR). To achieve accurate, real-time positioning, two types of methods are widely considered: image-based approaches and 3D-structure-based approaches. Some image-based methods retrieve geo-tagged database images similar to a query image and estimate the position from their geo-tags. Such retrieval-based methods are typically fast but are usually considered inaccurate. A typical 3D-structure-based method, on the other hand, extracts key points from a query image and matches them to key points with known coordinates in a pre-constructed 3D-structure model. A RANSAC algorithm is then applied to estimate the pose with the largest number of inliers. The two methods have complementary advantages: the retrieval-based method is faster, while the 3D-structure-based method can outperform it in terms of accuracy. In this thesis, a combination of image-based and 3D-structure-based methods is investigated. Specifically, an end-to-end localisation pipeline is proposed and tested, comprising offline structure-from-motion, key point retrieval, and an online PnP solver. The method is evaluated on datasets collected in various scenarios.
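The inlier-counting step that RANSAC uses to choose between pose hypotheses can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (`project`, `count_inliers`, `best_pose`), the simple pinhole camera without lens distortion, and the 2-pixel threshold are all assumptions made for illustration. Candidate poses are taken as given; in a full pipeline they would typically come from a minimal PnP solver run on random subsets of the 2D-3D matches, which is not shown here.

```python
import math


def project(point, rotation, translation, focal, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    rotation is a 3x3 nested list and translation a 3-vector, so the
    camera-frame point is R * X + t. Returns None if the point lies
    behind the camera.
    """
    xc, yc, zc = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if zc <= 0:
        return None
    return (focal * xc / zc + cx, focal * yc / zc + cy)


def count_inliers(pose, matches, focal, cx, cy, threshold_px=2.0):
    """Count 2D-3D matches whose reprojection error is within the threshold."""
    rotation, translation = pose
    inliers = 0
    for point3d, pixel in matches:
        projected = project(point3d, rotation, translation, focal, cx, cy)
        if projected is None:
            continue
        error = math.hypot(projected[0] - pixel[0], projected[1] - pixel[1])
        if error <= threshold_px:
            inliers += 1
    return inliers


def best_pose(candidate_poses, matches, focal, cx, cy, threshold_px=2.0):
    """Return the candidate pose supported by the largest number of inliers."""
    return max(
        candidate_poses,
        key=lambda pose: count_inliers(pose, matches, focal, cx, cy, threshold_px),
    )
```

Counting inliers rather than minimising the total reprojection error is what makes the estimate tolerant of wrong key point matches: a mismatched pair contributes nothing to the score instead of pulling the pose towards itself.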
Thesis advisor
Kannala, Juho
computer vision, visual localisation, structure-from-motion, RANSAC