A machine learning framework for public transport ridership estimation using multi-source data fusion with low-cost bluetooth data

School of Engineering | Master's thesis

Date

2024-12-31

Major/Subject

Spatial Planning and Transportation Engineering

Degree programme

Master's Programme in Spatial Planning and Transportation Engineering

Language

en

Pages

65

Abstract

Accurate information about demand volumes at specific locations within public transport networks is critical for transportation planners to make informed decisions. Traditional manual counts, while accurate, are costly and labour intensive. Existing automatic passenger counting systems also face limitations in terms of cost, accuracy, or compatibility. This paper proposes a multi-source data fusion framework to improve the capability of passenger counting using a low-cost Bluetooth sensor. The framework combines otherwise independent and unrelated raw Bluetooth counts, novel drone data, and freely available General Transit Feed Specification Real-Time (GTFS-RT) and ferry schedule data into a unified and comparable format. The proposed framework leverages the advanced capabilities of several machine learning models (K-Nearest Neighbour, XGBoost and Random Forest) to estimate the ridership of public transport vehicles in an area affected by nearby ferry operations. The results demonstrate that the models generated using the framework achieve high accuracy and low errors when compared to the ground truth of manual counts. The machine learning models vastly outperform a standard Linear Regression model, with an R² value of 0.86 compared to 0.62. Models incorporating variables developed from the framework significantly outperform those that rely solely on Bluetooth data (R² of 0.86 vs -0.49). Notably, the framework still leads to similar conclusions when the drone counts are used as the ground truth, which exposes the models to significantly more data points than the manual counts. However, discrepancies between the manual and drone counts highlight the need for further validation to enhance the reliability of this approach. Nevertheless, the framework highlights the value of multi-sensor data fusion as a necessary enhancement to improve the utility and accuracy of low-cost Bluetooth counts.
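
The negative coefficient of determination reported for the Bluetooth-only models may look surprising, so the following minimal sketch shows how the metric is commonly computed and why it can drop below zero. It assumes the standard definition, 1 - SS_res / SS_tot; the abstract does not state which implementation the thesis used. The function name `r_squared` and all count values are hypothetical and chosen only for illustration; they are not data from the thesis.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's estimates.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean count.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical ground-truth passenger counts for five vehicle visits.
manual = [10, 20, 30, 40, 50]
# Hypothetical estimates that track the ground truth closely.
close = [12, 18, 33, 39, 48]
# Hypothetical estimates that are worse than predicting the mean.
poor = [45, 15, 50, 10, 30]

r2_close = r_squared(manual, close)  # about 0.978
r2_poor = r_squared(manual, poor)    # about -1.95

print(f"close estimates: R2 = {r2_close:.3f}")
print(f"poor estimates:  R2 = {r2_poor:.3f}")
```

Under this definition, a value below zero means the estimates have a larger squared error than a constant prediction equal to the mean of the ground-truth counts, while a value near one means the estimates explain most of the variation in those counts.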

Supervisor

Roncoli, Claudio

Thesis advisor

Sipetas, Charalampos

Keywords

data fusion, machine learning, public transport, bluetooth data, drone data, automated passenger counting
