Attention-based RNN to server anomaly detection

School of Science (Perustieteiden korkeakoulu) | Master's thesis

Mcode

SCI3095

Language

en

Pages

61

Abstract

This thesis evaluates several machine learning classification algorithms and an Attention-based LSTM for classifying the root causes of failed tests. It also shows that the Attention-based LSTM performs better when unlabeled data is incorporated through the pseudo-label method. The development team and the testing environment at Nokia provide the testing logs that serve as the data source of this thesis. The error messages are extracted from the logs and represented as vectors using word2vec. These vectors are the features fed into the models. The thesis first trains six traditional machine learning algorithms on the labeled data: SVM, Random Forest, Extremely Randomized Trees, AdaBoost, Gradient Boosting, and XGBoost. The best of these models is then used to predict labels for the unlabeled data, and these predictions serve as pseudo labels. The Attention-based LSTM model is trained on both the labeled data and the pseudo-labeled data. In conclusion, Extremely Randomized Trees achieve the best results among all the models. The Attention-based LSTM did not outperform the traditional machine learning algorithms, but its performance improved with the help of unlabeled data. The proposed methodology contributes a way to make use of unlabeled data when the amount of labeled data is small. It also provides a baseline approach for classifying the root causes of failures in this specific test data.
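
The pseudo-label step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: a toy nearest-centroid classifier stands in for both the labeling model (Extremely Randomized Trees in the thesis) and the second model (the Attention-based LSTM), and the two-dimensional vectors, the label names, and all function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def fit_centroids(vectors, labels):
    """Train a toy nearest-centroid classifier: one mean vector per label."""
    groups = defaultdict(list)
    for vec, label in zip(vectors, labels):
        groups[label].append(vec)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*vecs))
        for label, vecs in groups.items()
    }


def predict(centroids, vectors):
    """Assign each vector the label of its closest centroid."""
    return [
        min(centroids, key=lambda lab: dist(centroids[lab], vec))
        for vec in vectors
    ]


def pseudo_label_training_set(labeled_x, labeled_y, unlabeled_x):
    """Label the unlabeled vectors with a model trained on the labeled data,
    then return the combined training set for a second model."""
    teacher = fit_centroids(labeled_x, labeled_y)
    pseudo_y = predict(teacher, unlabeled_x)
    return labeled_x + unlabeled_x, labeled_y + pseudo_y


# Hypothetical stand-ins for word2vec message vectors and root-cause labels.
labeled_x = [(0.0, 0.1), (0.2, 0.0), (1.0, 0.9), (0.9, 1.1)]
labeled_y = ["timeout", "timeout", "config", "config"]
unlabeled_x = [(0.1, 0.2), (1.1, 1.0), (0.8, 0.9)]

# The second model is trained on labeled plus pseudo-labeled data.
train_x, train_y = pseudo_label_training_set(labeled_x, labeled_y, unlabeled_x)
student = fit_centroids(train_x, train_y)
```

The sketch accepts every pseudo label as-is. Whether the thesis filters predictions, for example by confidence, is not stated in the abstract.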

Supervisor

Babbar, Rohit

Thesis advisor

Lv, Zhengwu
