Untangling the Application of Text-mining Methods in Information Systems Domain
School of Business
Doctoral thesis (article-based)
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Authors
Date
2019
Language
en
Pages
63 + app. 115
Series
Aalto University publication series DOCTORAL DISSERTATIONS, 89/2019
Abstract
The advent of digitalization has brought a massive proliferation of unstructured data, producing vast repositories of textual data from various sources, such as Web sites, academic publications, news articles, blog posts, e-mail, corporate communication platforms, reports, and social media feeds. This proliferation, coupled with the upsurge in mobile and Web technologies alongside ever-improving connectivity, has led to various digital platforms and applications rapidly achieving mass-market penetration. With the production of textual and other forms of unstructured data certain to continue at unprecedented rates for the foreseeable future, this availability on a massive scale presents both opportunities and challenges that researchers and practitioners must address. The ability to utilize text data on a large scale not only provides better coverage in terms of sample size but also opens opportunities to build a deeper understanding of phenomena that otherwise are simply unobservable, "hidden in the noise." However, as the world races towards high-volume production, distribution, and consumption of digital text, information systems (IS) researchers are proving slow to start reaping the potential of analyzing textual data. There is an urgent need for methods and techniques that can meet the challenge of analyzing vast bodies of textual data. In an effort to demonstrate the potential application of text-mining methods in information systems research, the dissertation presents essays that address the use of large-scale text-based datasets in literature analysis and studies of system-specific behavioral outcomes. The first essay deals with identifying the research themes presented in a large body of publications on cloud computing, and the second essay demonstrates the machine-based classification of papers in leading information-systems journals.
Of the behavior-focused pieces, the third essay utilizes user-generated content to illustrate system-driven viewing outcomes in the context of binge watching of television shows, and the final essay examines a large volume of content connected with a business-to-business Web portal, reporting on a study of browsing-device-linked differences in interest in marketing material. In addition to the individual essays, the dissertation contributes to the scholarly discussion of text-mining research issues in three important ways. Firstly, it presents a conceptual framework that aids in revealing the fundamentals of text-mining research in terms of two dimensions: research objective and level of text analysis. Secondly, the four essays provide concrete demonstrations of various suitable applications of text mining. Finally, the dissertation examines the implications of the work, highlighting specific issues and challenges pertaining to text-mining research. The findings and implications of this work should benefit IS researchers and practitioners striving to exploit large volumes of textual data.
Description
Supervising professor
Malo, Pekka, Assoc. Prof., Aalto University, Department of Information and Service Economy, Finland
Thesis advisor
Rossi, Matti, Prof., Aalto University, Department of Information and Service Economy, Finland
Keywords
text mining, information systems, systematic review, topic models, text classification, word embedding, system-driven behaviour, social media
Parts
- [Publication 1]: Upreti, Bikesh; Asatiani, Aleksandre; Malo, Pekka. To reach the clouds: Application of topic models to the meta-review on cloud computing literature. In 49th Hawaii International Conference on System Sciences, 3979–3988, January 2016.
Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201510314815
DOI: 10.1109/HICSS.2016.493
- [Publication 2]: Salovaara, Antti; Upreti, Bikesh Raj; Nykänen, Jussi; Merikivi, Jani. Building theories on shaky foundations? The lack of falsification and correction of knowledge in IS research. Submitted to European Journal of Information Systems, 2019.
- [Publication 3]: Upreti, Bikesh Raj; Merikivi, Jani; Bragge, Johanna; Malo, Pekka. Analyzing the ways IT has changed our TV consumption: Binge watching and marathon watching. In International Conference on Information Systems, Seoul, South Korea, December 2017.
- [Publication 4]: Upreti, Bikesh. System-driven browsing outcomes in consuming B2B content: A case of B2B content marketing. Unpublished manuscript.
- Errata