Object Detection in Finnish Movies
School of Science
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Authors
Date
2022-07-29
Department
Major/Subject
Machine Learning, Data Science and Artificial Intelligence
Mcode
SCI3044
Degree programme
Master’s Programme in Computer, Communication and Information Sciences
Language
en
Pages
45
Series
Abstract
Object detection is a computer vision technique for locating and identifying objects in videos or images. Training the models requires a pre-labeled dataset, because objects differ in appearance, shape and posture, and because factors such as illumination and occlusion interfere during imaging. In general, it is impossible to predict which objects will appear in a movie before watching it. To address this problem, this thesis examines two distinct pre-trained deep learning-based object detection models that use convolutional neural networks. We use the Detectron2 and YOLOv3 models, both trained on the COCO dataset, which contains the majority of the objects shown in the movie. Frames are extracted from the movie and stored with indices that increase with time, which ensures that the two models run on the same frame so that their results can be compared. Because the two models performed differently on small objects in the qualitative results, we ran further comparative experiments on detected objects of varying sizes to improve detection accuracy. According to the results, Detectron2 was better at detecting small objects, whereas YOLOv3 had higher precision, also with lower detection score values. Additionally, we list all detected object classes with relatively high prediction scores and evaluate their correctness by sampling the extracted images. Finally, we suggest a prediction score threshold for object detection in further Finnish movies.
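The comparison described in the abstract (two models run on the same indexed frame, detections filtered by a prediction score threshold and grouped by object size) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the `Detection` record, the function names, the threshold value in the usage note, and the size boundaries, which follow the conventional COCO definition of small (area below 32² pixels), medium (below 96² pixels) and large objects.

```python
from dataclasses import dataclass

# Assumed size boundaries (COCO convention, in square pixels);
# the thesis may define object sizes differently.
SMALL_MAX_AREA = 32 ** 2
MEDIUM_MAX_AREA = 96 ** 2


@dataclass(frozen=True)
class Detection:
    """Hypothetical record for one detected object in one extracted frame."""
    frame_index: int  # index of the extracted frame, increasing with time
    label: str        # predicted class name, e.g. a COCO category
    score: float      # prediction score in [0, 1]
    box: tuple        # (x_min, y_min, x_max, y_max) in pixels


def size_bucket(det):
    """Classify a detection as small, medium or large by bounding-box area."""
    x_min, y_min, x_max, y_max = det.box
    area = max(0, x_max - x_min) * max(0, y_max - y_min)
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def count_by_size(detections, frame_index, threshold):
    """Count detections on one frame whose score reaches the threshold,
    grouped by size bucket."""
    counts = {"small": 0, "medium": 0, "large": 0}
    for det in detections:
        if det.frame_index == frame_index and det.score >= threshold:
            counts[size_bucket(det)] += 1
    return counts
```

Calling `count_by_size` once per model with the same `frame_index` and the same `threshold` (for example 0.5, an arbitrary illustrative value) gives per-size counts that can be compared side by side for that frame.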
Supervisor
Laaksonen, Jorma
Thesis advisor
Laaksonen, Jorma
Keywords
object detection, deep learning, computer vision, COCO dataset, movies