A machine learning framework for public transport ridership estimation using multi-source data fusion with low-cost bluetooth data

dc.contributor: Aalto-yliopisto (fi)
dc.contributor: Aalto University (en)
dc.contributor.advisor: Sipetas, Charalampos
dc.contributor.author: Yung, Chris
dc.contributor.school: Insinööritieteiden korkeakoulu (fi)
dc.contributor.school: School of Engineering (en)
dc.contributor.supervisor: Roncoli, Claudio
dc.date.accessioned: 2025-01-24T18:02:34Z
dc.date.available: 2025-01-24T18:02:34Z
dc.date.issued: 2024-12-31
dc.description.abstract: Accurate information about demand volumes at specific locations within public transport networks is critical to informed decision-making by transportation planners. Traditional manual counts, while accurate, are costly and labour intensive. Existing automatic passenger counting systems also face limitations in terms of cost, accuracy, or compatibility. This paper proposes a multi-source data fusion framework to improve the capability of passenger counting using a low-cost Bluetooth sensor. The framework combines otherwise independent and unrelated raw Bluetooth counts, novel drone data, and freely available General Transit Feed Specification Realtime (GTFS Realtime) and ferry schedule data into a unified and comparable format. The proposed framework leverages the capabilities of several machine learning models (K-Nearest Neighbour, XGBoost and Random Forest) to estimate the ridership of public transport vehicles in an area affected by nearby ferry operations. The results demonstrate that the models generated using the framework achieve high accuracy and low errors when compared to the ground truth of manual counts. The machine learning models substantially outperform a standard Linear Regression model, with an R² of 0.86 compared to 0.62. Models incorporating variables developed within the framework significantly outperform those that rely solely on Bluetooth data (R² of 0.86 vs -0.49). Notably, the framework draws similar conclusions when the drone counts are used as the ground truth, which exposes the models to significantly more data points than the manual counts. However, discrepancies between the manual counts and the drone counts highlight the need for further validation to enhance the reliability of this approach. Nevertheless, the framework highlights the value of multi-sensor data fusion as a necessary enhancement to improve the utility and accuracy of low-cost Bluetooth counts. (en)
dc.format.extent: 65
dc.format.mimetype: application/pdf (en)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/133389
dc.identifier.urn: URN:NBN:fi:aalto-202501241673
dc.language.iso: en (en)
dc.programme: Master's Programme in Spatial Planning and Transportation Engineering (en)
dc.programme.major: Spatial Planning and Transportation Engineering
dc.subject.keyword: data fusion (en)
dc.subject.keyword: machine learning (en)
dc.subject.keyword: public transport (en)
dc.subject.keyword: bluetooth data (en)
dc.subject.keyword: drone data (en)
dc.subject.keyword: automated passenger counting (en)
dc.title: A machine learning framework for public transport ridership estimation using multi-source data fusion with low-cost bluetooth data (en)
dc.type: G2 Pro gradu, diplomityö (fi)
dc.type.ontasot: Master's thesis (en)
dc.type.ontasot: Diplomityö (fi)
local.aalto.electroniconly: yes
local.aalto.openaccess: yes

Files

Original bundle

Name: master_Yung_Chris_2025.pdf
Size: 2.52 MB
Format: Adobe Portable Document Format