Customers’ review mining for cell phones based on sentiment analysis
School of Business
Master's thesis
Authors
Date
2019
Degree programme
Information and Service Management (ISM)
Language
en
Pages
61+7
Abstract
Natural Language Processing (NLP) is the subfield of AI that helps computers understand human language, and it is one of the most significant applications of AI in business. Research on NLP began in the 1950s and has made major breakthroughs in recent decades, and the field is expected to keep growing rapidly. The technology makes substantial contributions to numerous fields, such as machine translation and speech recognition, and it brings AI closer to everyday life. In this thesis, we focus on one NLP task, sentiment analysis, and attempt to use it to analyze and predict customers’ satisfaction with cell phones based on their textual reviews. Sentiment analysis is the contextual mining of text, and it aims to detect and predict customers’ viewpoints on products and services. The technique has developed significantly and is applied in many fields, such as political issues, marketing strategy, and media communication. In the past, sentiment analysis relied on traditional machine learning methods combined with feature engineering. However, the emerging technique of deep learning has achieved state-of-the-art results in some NLP tasks and has become the focus of academic attention, gradually replacing the traditional methods. The new method also brings more insight and possibilities for further research in this field. In the thesis, we implement both traditional statistical methods and neural network methods on customers’ textual reviews of cell phones, and carry out sentiment analysis at the sentence level. Subsequently, the performance of the different models is compared using quantitative metrics, and the optimal one is selected. We also discuss the strengths and weaknesses of the models in detail and assess the reliability of the experiment and the models in different respects, for instance, the reasonableness of the assumptions and the potential limitations. Our experiment is a classic multi-class problem based on an imbalanced dataset.
We found that the deep learning models outperform most of the traditional machine learning models in the experiment, and that the workable models perform similarly. The MaxEntropy model with BoW embedding is considered the best one because it beats the other models in multiple comparisons across different metrics. However, other models outperform it on some other metrics, so the choice of the best model depends on the metric used. We also discuss the limitations of the experiment and consider further development.
Thesis advisor
Malo, Pekka; Erästö, Panu
Keywords
text mining, sentiment analysis, natural language processing, machine learning