Event Discovery from Social Media Feeds

School of Science | Master's thesis

Date

2021-12-13

Major/Subject

Computer Science

Mcode

SCI3042

Degree programme

Master’s Programme in Computer, Communication and Information Sciences

Language

en

Pages

vii + 56

Abstract

Unexpected events occur from time to time. Some are planned, with a predefined time and location (e.g., a concert), whereas others are unplanned and happen suddenly (e.g., a strike). Such gatherings can place a heavy load on telecommunication networks, and the load can cause problems in network accessibility. To predict such events, tweets can be used as an information source. In this thesis, Twint was used to scrape tweets from Twitter. Different techniques were used to annotate the tweet data, and Named Entity Recognition (NER) was used to train a classification model with a limited amount of training data. Training a model from scratch can be both time- and resource-consuming, so transfer learning was used with the spaCy library. A number of experiments were performed to find a model that correctly classifies concert as the type of an event based on the context of the tweet. A key observation is that the model trained on manually annotated data performs better than the models trained on rule-based annotation or on a combination of rule-based and manual annotation. The analysis shows that it is possible to use NER to extract events from social media feeds based on the context of the tweet. The thesis suggests that, with domain expertise, the approach used to annotate data and fine-tune a Transformer-based model can be extended to multiple NLP problems to create domain-specific applications.
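
To make the rule-based annotation step mentioned in the abstract more concrete, the following is a minimal sketch and not the thesis's actual code. It uses only the Python standard library and produces (text, {"entities": [(start, end, label)]}) tuples, a character-offset layout commonly used for NER training examples with spaCy. The keyword list, the CONCERT label name, and the sample tweets are all hypothetical; the abstract does not state the rules or labels that were used.

```python
import re

# Hypothetical keyword list and label name; the thesis abstract does not
# specify the actual annotation rules or the label scheme.
CONCERT_KEYWORDS = ["concert", "gig", "live show"]
LABEL = "CONCERT"

# One case-insensitive pattern matching any keyword as a whole word or phrase.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in CONCERT_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def annotate(text):
    """Return (text, {"entities": [(start, end, label), ...]}) for one tweet.

    Offsets are character positions in the original text.
    """
    entities = [(m.start(), m.end(), LABEL) for m in _PATTERN.finditer(text)]
    return text, {"entities": entities}


# Made-up example tweets, for illustration only.
tweets = [
    "Heading to the concert at the stadium tonight!",
    "Traffic is terrible this morning.",
]
training_data = [annotate(t) for t in tweets]
```

A keyword rule of this kind labels every match regardless of the surrounding words, so it cannot tell a tweet about attending a concert from one that merely mentions the word. Manual annotation can take that context into account.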

Supervisor

Laaksonen, Jorma

Thesis advisor

Sakko, Arto

Keywords

natural language processing, named entity recognition, event extraction, deep learning, transfer learning, tweets
