Finding similar neighborhoods across cities by mining human urban activity

School of Science | Master's thesis
Date
2014-08-21
Major/Subject
Machine Learning and Data Mining
Mcode
SCI3015
Degree programme
Master’s Programme in Machine Learning and Data Mining (Macadamia)
Language
en
Pages
52+6
Abstract
We propose a method to match similar neighborhoods across different cities. Given a measure of similarity between urban regions and a query region in one city, our goal is to find the region in other cities that minimizes the distance to the query region. Furthermore, we seek to do so efficiently, as evaluating the distance to every possible candidate region is prohibitively expensive. First, we collect traces of activity in 20 European and American cities from the location-aware social platforms Foursquare and Flickr. A thorough exploration of this dataset leads us to describe individual venues by relevant features, including their aggregate activity over time, their visitors and overall popularity, and the typology of their surroundings. We then learn several measures of venue similarity in a semi-supervised setting and evaluate their performance on two information retrieval tasks. After gathering human ground truth about neighborhoods, we evaluate different metrics between sets of venues and find that Earth Mover’s Distance is best suited to assessing neighborhood similarity. Finally, we address the computational cost of finding the most similar neighborhood for a given query. We devise a heuristic search strategy and show that it provides results of comparable quality while being orders of magnitude faster. This work has applications in tourism recommendation and urban planning, as it provides a similarity measure between urban areas.
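As an illustration of the Earth Mover's Distance between sets of venues mentioned in the abstract, here is a minimal sketch. It is not the thesis implementation. It assumes that each venue is a small numeric feature vector, that both neighborhoods hold the same number of venues with equal weights, and that the ground distance is Euclidean (the thesis learns its venue similarity measures instead). Under those assumptions the distance reduces to the cheapest one-to-one matching between venues, averaged over the number of venues. The function name and the feature values are hypothetical.

```python
from itertools import permutations
from math import dist


def emd_equal_size(set_a, set_b):
    """Earth Mover's Distance between two equal-size sets of feature vectors,
    each vector carrying the same weight 1/n.

    Brute force over all matchings, so only usable for a handful of venues.
    """
    if len(set_a) != len(set_b) or not set_a:
        raise ValueError("this sketch needs two non-empty sets of equal size")
    n = len(set_a)
    # Try every one-to-one matching and keep the cheapest total distance.
    best = min(
        sum(dist(a, set_b[j]) for a, j in zip(set_a, perm))
        for perm in permutations(range(n))
    )
    return best / n


# Hypothetical 2-D venue features, made up for illustration only.
query = [(0.0, 0.0), (1.0, 0.0)]
candidate_near = [(1.0, 0.0), (0.0, 1.0)]
candidate_far = [(5.0, 5.0), (6.0, 5.0)]

print(emd_equal_size(query, candidate_near))
print(emd_equal_size(query, candidate_far))
```

A smaller value means the candidate neighborhood is closer to the query. Sets of different sizes or with unequal weights need a general transportation solver rather than this matching shortcut.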
Supervisor
Gionis, Aristides
Thesis advisor
Mathioudakis, Michael
Keywords
smart cities, metric learning, clustering, geolocation, neighborhood, urban computing