Finding similar neighborhoods across cities by mining human urban activity
School of Science | Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Machine Learning and Data Mining
Master’s Programme in Machine Learning and Data Mining (Macadamia)
Abstract: We propose a method to match similar neighborhoods across different cities. That is, given a measure of similarity between urban regions and a query region in one city, our goal is to find the regions in other cities that minimize the distance to the query region. Furthermore, we seek to do so efficiently, as evaluating the distance to every possible candidate region is prohibitively expensive. First, we collect traces of activity in 20 European and American cities from the location-aware social platforms Foursquare and Flickr. A thorough exploration of this dataset leads us to describe individual venues by relevant features, including their aggregate activity across time, their visitors and overall popularity, and the typology of their surroundings. Then we learn several measures of venue similarity in a semi-supervised setting and evaluate their performance on two information retrieval tasks. After gathering human ground truth about neighborhoods, we evaluate different metrics between sets of venues and find that Earth Mover’s Distance is best suited to assessing neighborhood similarity. Finally, we address the computational efficiency problem of finding the most similar neighborhood given a query. We devise a heuristic search strategy and show that it provides results of comparable quality while being orders of magnitude faster. This work has applications in tourist recommendation and urban planning, as it provides a similarity measure between urban areas.
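To make the notion of a distance between sets of venues concrete, the following is a minimal sketch of Earth Mover's Distance in a simplified setting. It is not the implementation used in the thesis. It assumes that each venue is a small numeric feature vector, that the ground distance between venues is Euclidean, and that both neighborhoods contain the same number of venues with equal weights. Under those assumptions an optimal flow is a one-to-one matching of venues, so a brute-force search over matchings is enough for tiny inputs. The function name `emd_equal_sets` and the example data are hypothetical.

```python
from itertools import permutations
from math import dist


def emd_equal_sets(a, b):
    """Earth Mover's Distance between two equal-size sets of feature
    vectors with uniform weights (illustrative sketch, tiny inputs only)."""
    if len(a) != len(b) or not a:
        raise ValueError("this sketch needs two non-empty sets of equal size")
    n = len(a)
    # Pairwise ground distances between venues (Euclidean is an assumption).
    cost = [[dist(p, q) for q in b] for p in a]
    # With uniform weights and equal sizes, some optimal flow is a one-to-one
    # matching, so we try every matching and keep the cheapest one.
    best = min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )
    # Each venue carries mass 1/n, so the total cost is the average distance.
    return best / n


# Hypothetical two-feature venue descriptions for three neighborhoods.
query = [(0.0, 0.0), (1.0, 0.0)]
candidate_near = [(1.0, 0.1), (0.0, 0.1)]
candidate_far = [(5.0, 5.0), (6.0, 5.0)]

# The candidate with the smallest distance is the best match for the query.
ranked = sorted(
    [("near", candidate_near), ("far", candidate_far)],
    key=lambda item: emd_equal_sets(query, item[1]),
)
print([name for name, _ in ranked])
```

The brute-force search grows factorially with the number of venues, which illustrates why evaluating every candidate region exactly is prohibitive. Neighborhoods of unequal size or with non-uniform weights need a general transport solver instead of this matching shortcut.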
Thesis advisor: Mathioudakis, Michael
smart cities, metric learning, clustering, geolocation, neighborhood, urban computing