A machine learning framework for public transport ridership estimation using multi-source data fusion with low-cost bluetooth data
School of Engineering
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Authors
Date
2024-12-31
Department
Major/Subject
Spatial Planning and Transportation Engineering
Mcode
Degree programme
Master's Programme in Spatial Planning and Transportation Engineering
Language
en
Pages
65
Series
Abstract
Accurate information about demand volumes at specific locations within public transport networks is critical for transportation planners to make informed decisions. Traditional manual counts, while accurate, are costly and labour-intensive. Existing automatic passenger counting systems also face limitations in terms of cost, accuracy, or compatibility. This paper proposes a multi-source data fusion framework to improve the capability of passenger counting using a low-cost Bluetooth sensor. The framework combines otherwise independent and unrelated raw Bluetooth counts, novel drone data, and freely available General Transit Feed Specification - Real-Time and ferry schedule data into a unified and comparable format. The proposed framework leverages the advanced capabilities of several machine learning models (K-Nearest Neighbour, XGBoost, and Random Forest) to estimate the ridership of public transport vehicles in an area affected by nearby ferry operations. The results demonstrate that the models generated using the framework achieve high accuracy and low errors when compared to the ground truth of manual counts. The machine learning models vastly outperform a standard Linear Regression model, with an R² value of 0.86 compared to 0.62. Models incorporating variables developed from the framework significantly outperform those that rely solely on Bluetooth data (R² of 0.86 vs -0.49). Notably, the framework still supports similar conclusions when the drone counts are used as the ground truth, which exposes the model to significantly more data points than the manual counts. However, discrepancies between the manual counts and the drone counts highlight the need for further validation to enhance the reliability of this approach. Nevertheless, the framework highlights the value of multi-sensor data fusion as a necessary enhancement to improve the utility and accuracy of low-cost Bluetooth counts.
Description
Supervisor
Roncoli, Claudio
Thesis advisor
Sipetas, Charalampos
Keywords
data fusion, machine learning, public transport, bluetooth data, drone data, automated passenger counting