Customers’ review mining for cell phones based on sentiment analysis

School of Business | Master's thesis

Date

2019

Degree programme

Information and Service Management (ISM)

Language

en

Pages

61+7

Abstract

Natural Language Processing (NLP) is the subfield of AI that helps computers understand human language, and it is one of the significant applications of AI in business. Research on NLP started in the 1950s and has made breakthroughs in recent decades, and the field is expected to keep growing quickly. The technology contributes substantially to fields such as machine translation and speech recognition, and it makes AI more accessible in everyday life. In this thesis, we focus on one NLP task, sentiment analysis, and use it to analyze and predict customers’ satisfaction with cell phones based on their textual reviews. Sentiment analysis is the contextual mining of text; it aims to detect and predict customers’ opinions of products and services. The technique has developed significantly and is applied in many fields, such as political issues, marketing strategy, and media communication. In the past, traditional machine learning methods combined with feature engineering were used for sentiment analysis. However, an emerging technique, deep learning, has achieved state-of-the-art results in some NLP tasks and has become the center of academic attention, gradually replacing the traditional methods. It also brings more insight and possibilities for further research in this field. In the thesis, we implement both traditional statistical methods and neural network methods on customers’ textual reviews of cell phones and carry out sentiment analysis at the sentence level. Subsequently, the performance of the different models is compared using quantitative metrics, and the best one is selected. We also discuss the strengths and weaknesses of the models and assess the reliability of the experiment and the models in different respects, for instance, the reasonableness of the assumptions and potential limitations. Our experiment is a classic multi-class problem based on an imbalanced dataset.
We found that the deep learning models outperform most of the traditional machine learning models in the experiment, and that the workable models perform similarly. The MaxEntropy model with BoW embedding is considered the best one because it outperforms the other models in multiple comparisons using different metrics. However, under some other metrics, other models outperform it, so the choice of the best model depends on the evaluation metric. We also discuss the limitations of the experiment and consider directions for further development.
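
The point that model rankings shift with the evaluation metric can be illustrated with a minimal sketch. The abstract does not name the metrics used in the thesis, so accuracy and macro-averaged F1 are assumptions here, and the labels and the two "models" are hypothetical toy data rather than results from the experiment.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced three-class data: 8 positive, 1 neutral, 1 negative.
y_true = ["pos"] * 8 + ["neu"] + ["neg"]

# Hypothetical model A always predicts the majority class.
model_a = ["pos"] * 10

# Hypothetical model B misses some positives but recovers the minority classes.
model_b = ["pos"] * 5 + ["neu", "neg", "neu"] + ["neu"] + ["neg"]

for name, pred in [("A", model_a), ("B", model_b)]:
    print(name, round(accuracy(y_true, pred), 3), round(macro_f1(y_true, pred), 3))
```

On this toy data, model A has the higher accuracy while model B has the higher macro-F1, which is the kind of reversal an imbalanced multi-class dataset can produce.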

Thesis advisor

Malo, Pekka
Erästö, Panu

Keywords

text mining, sentiment analysis, natural language processing, machine learning
