Browser-based scene text detection and recognition on mobile devices

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Rowlinson, Andrew
dc.contributor.author: Cucorova, Veronika
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.supervisor: Di Francesco, Mario
dc.date.accessioned: 2021-10-24T17:11:13Z
dc.date.available: 2021-10-24T17:11:13Z
dc.date.issued: 2021-10-18
dc.description.abstract: Automatic detection and recognition of natural scene text is crucial to various applications such as navigation or object identification. Performing the task locally on a mobile phone enables offline functionality and increases user privacy, among other benefits. This thesis presents TDR4W, a model for multi-oriented text detection and recognition in natural scenes that is designed for implementation in a progressive web application, so that the model can be executed locally on a mobile device. TDR4W is based on the MobileNetV2 backbone. In contrast to many commonly used multi-step solutions, the model unifies the prediction for detection and recognition and allows joint training. The design is almost as accurate as the previously used cloud-based solution, with a difference of 1% in top-1 accuracy when tested on a dataset of labelled shipping container images. Moreover, it has fewer than half the trainable parameters of the previously used model, making it much smaller. It needs only 3.9 billion floating-point operations to compute a prediction, which is fewer than both the previously used cloud-based model and a default segmentation model proposed by the authors of MobileNetV2, even though TDR4W works on images with a larger input size. [en]
dc.format.extent: 72
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/110599
dc.identifier.urn: URN:NBN:fi:aalto-202110249777
dc.language.iso: en [en]
dc.programme: Master's Programme in ICT Innovation [fi]
dc.programme.major: Data Science/Entrepreneurship [fi]
dc.programme.mcode: SCI3095 [fi]
dc.subject.keyword: scene text detection [en]
dc.subject.keyword: scene text recognition [en]
dc.subject.keyword: machine learning on mobile devices [en]
dc.subject.keyword: machine learning in web browser [en]
dc.title: Browser-based scene text detection and recognition on mobile devices [en]
dc.type: G2 Pro gradu, diplomityö [fi]
dc.type.ontasot: Master's thesis [en]
dc.type.ontasot: Diplomityö [fi]
local.aalto.electroniconly: yes
local.aalto.openaccess: no