Object Detection in Finnish Movies

School of Science | Master's thesis

Date

2022-07-29

Major/Subject

Machine Learning, Data Science and Artificial Intelligence

Mcode

SCI3044

Degree programme

Master’s Programme in Computer, Communication and Information Sciences

Language

en

Pages

45

Abstract

Object detection is a computer vision technique for locating and identifying objects in videos or images. Training an object detection model requires a pre-labeled dataset, because objects vary in appearance, shape and posture, and because factors such as illumination and occlusion interfere during imaging. In general, it is impossible to predict which objects will appear in a movie before watching it. To address this problem, this thesis directly applies and examines two distinct pre-trained, deep learning-based object detection models built on convolutional neural networks. We use the Detectron2 and YOLOv3 models, both trained on the COCO dataset, which contains the majority of the objects shown in the movie. By extracting frames from the movie and storing them with indices that increase with time, we ensure that the two models run on exactly the same frames, so that their results can be compared. Because the qualitative results show that the two models perform differently on small detected objects, we carry out further comparative experiments on detected objects of varying sizes to improve detection accuracy. According to the results, Detectron2 was better at detecting small objects, whereas YOLOv3 had higher precision while also yielding lower detection score values. Additionally, we list all detected object classes with relatively high prediction scores and evaluate their correctness by sampling the extracted images. Finally, we suggest a prediction score threshold for object detection in further Finnish movies.
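
The comparison procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the thesis code: the `Detection` record, the `frame_name` naming scheme, and all helper names are hypothetical, and the size bounds follow the COCO evaluation convention (small below 32×32 pixels, large from 96×96 pixels), which the thesis may or may not use. The sketch assumes that each detector's output has already been converted into plain records keyed by the shared frame index.

```python
from dataclasses import dataclass

# Size bounds in pixel area, following the COCO evaluation convention.
# Assumption: the thesis may define "small" objects differently.
SMALL_MAX_AREA = 32 ** 2
MEDIUM_MAX_AREA = 96 ** 2


@dataclass(frozen=True)
class Detection:
    """One detected object (hypothetical record, not a library type)."""
    frame_index: int   # index of the extracted frame, increasing with time
    label: str         # predicted COCO class name
    score: float       # prediction score in [0, 1]
    box: tuple         # (x1, y1, x2, y2) in pixels


def frame_name(index, width=6):
    """Zero-padded file name so both models can address the same frame."""
    return f"frame_{index:0{width}d}.jpg"


def size_bucket(box):
    """Classify a bounding box as small, medium or large by its area."""
    x1, y1, x2, y2 = box
    area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def filter_by_score(detections, threshold):
    """Keep only detections whose prediction score reaches the threshold."""
    return [d for d in detections if d.score >= threshold]


def count_by_size(detections):
    """Count detections per size bucket, e.g. to compare two models."""
    counts = {"small": 0, "medium": 0, "large": 0}
    for d in detections:
        counts[size_bucket(d.box)] += 1
    return counts
```

Running `count_by_size(filter_by_score(detections, threshold))` once per model on the same frame indices gives per-size counts that can be compared side by side. The threshold value is left as a parameter here, since the abstract does not state the value the thesis recommends.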

Supervisor

Laaksonen, Jorma

Thesis advisor

Laaksonen, Jorma

Keywords

object detection, deep learning, computer vision, COCO dataset, movies
