Automatic Job Skill Taxonomy Generation For Recruitment Systems

School of Science (Perustieteiden korkeakoulu) | Master's thesis
Data Science
Degree programme
Master's Programme in ICT Innovation
The goal of this thesis is to improve job recommendation systems by automatically extracting skills from job descriptions. As technology develops rapidly, new skills are continuously in demand. This makes skill tagging of job descriptions harder, because a simple keyword match against a previously compiled skill list is not sufficient. A way of automatically populating the skill list is therefore needed to improve job search engines. This thesis addresses the problem with natural language processing and neural networks. Automatically detecting skills in an unstructured job description dataset is a complex problem: the method must be robust to the ambiguity of natural language and must adapt to words not seen in the historical data. The thesis approaches this by using recurrent neural network models to capture the context of skill words. Based on the captured context, the new system predicts whether a word in the given text is a skill. Long short-term memory (LSTM) and bidirectional LSTM models are used to capture long-term dependencies in a sentence and so identify the skills present in job descriptions. Various natural language processing techniques were used to improve the quality of the model's input features. Using the context both before and after the skill words gave the best results in identifying skills from textual data. The approach can be applied to capture skills from job ads, and it can be extended to extract skill features from resume data to improve job recommendation results in the future.
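As a rough illustration of the sequence-labeling framing described in the abstract, the sketch below turns a tokenized sentence into per-word examples that carry the context before and after each word, together with a binary skill label. It is not the thesis's implementation: the function names, the `<pad>` token, the window size, and the use of a small seed skill set to produce training labels are all assumptions made for this example. In a real setup, a recurrent model such as a bidirectional LSTM would consume examples like these rather than a fixed window.

```python
from typing import List, Set, Tuple

PAD = "<pad>"  # hypothetical padding token for positions outside the sentence

Example = Tuple[List[str], str, List[str]]


def context_windows(tokens: List[str], size: int = 2) -> List[Example]:
    """Return (left context, word, right context) for every token."""
    examples = []
    for i, word in enumerate(tokens):
        left = tokens[max(0, i - size):i]
        right = tokens[i + 1:i + 1 + size]
        # Pad both sides so every example has the same shape.
        left = [PAD] * (size - len(left)) + left
        right = right + [PAD] * (size - len(right))
        examples.append((left, word, right))
    return examples


def label_tokens(tokens: List[str], seed_skills: Set[str]) -> List[int]:
    """Assumed labeling scheme: 1 if the token is in a seed skill set, else 0."""
    return [1 if t.lower() in seed_skills else 0 for t in tokens]


tokens = "Experience with Python and Docker required".split()
labels = label_tokens(tokens, {"python", "docker"})
examples = context_windows(tokens, size=2)

for (left, word, right), label in zip(examples, labels):
    print(left, word, right, label)
```

The seed labels here only stand in for annotated training data. The point of learning from context, as the abstract argues, is that a trained model can then flag words that never appeared in any skill list.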
Jung, Alexander
Thesis advisor
Auvinen, Tapio
sequence labeling, job recommendation systems, deep learning, recurrent neural network, context learning, knowledge discovery