Semantic Content Filtering and Sentiment Analysis for Financial News

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Wallenius, Jyrki, Prof., Aalto University, Finland
dc.contributor.advisor Korhonen, Pekka, Prof., Aalto University, Finland
dc.contributor.author Ahlgren, Oskar
dc.date.accessioned 2017-02-17T10:00:31Z
dc.date.available 2017-02-17T10:00:31Z
dc.date.issued 2016
dc.identifier.isbn 978-952-60-7096-4 (electronic)
dc.identifier.isbn 978-952-60-7097-1 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.description.abstract Today we seldom suffer from a lack of information; on the contrary, we often suffer from too much of it. As a consequence, important information may go unnoticed, which is harmful for individuals, companies, and the economy as a whole. To alleviate this situation, this dissertation develops tools for analyzing financial news. The thesis consists of an introductory part and six research essays, which cover three aspects of the problem. The first two essays cover data mining and document filtering. Essay 1 presents the Wiki-SR method, which uses Wikipedia to calculate the relatedness between two concepts and thereby enhances search queries by implicitly expanding them. The essay also introduces a framework that allows for multiple models in order to improve document modeling. Essay 2 presents a modified Wilks' lambda technique for finding the concepts that best describe a specific document. Although the proposed approach is lightweight, it is still very efficient. The second group of essays focuses on sentiment analysis. Essay 3 presents an approach that parses sentences and detects any words that might change the polarity of a sentiment-bearing word. This approach significantly improves the accuracy of the analysis, a result verified with our manually annotated sentiment corpus. A more advanced sentiment corpus was published in Essay 4. This new dual-layer corpus is annotated at both the document and sentence level. Because it also allows multiple sentiment-bearing entities in the same sentence, more advanced techniques can be developed with it. Both corpora are publicly available, and they alleviate the current lack of method evaluation sets in the financial domain. The last two essays put this research in context. Essay 5 surveys the research done in the field of sentiment analysis over the last decade. When the keywords given by authors and publishers are compared and the wording of titles and abstracts is analyzed, four distinct areas of interest emerge. Two of them relate to techniques used for sentiment analysis (sentiment classification and sentiment lexicon), and two are common domains of the analysis (reviews and social media). Essay 6 describes the steps needed for a computational approach to financial news analysis, as well as commonly used tools and resources. en
dc.format.extent 135
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 221/2016
dc.relation.haspart [Publication 1]: Pekka Malo, Pyry Siitari, Oskar Ahlgren, Jyrki Wallenius, and Pekka Korhonen. Semantic Content Filtering with Wikipedia and Ontologies. Proceedings of the IEEE International Conference on Data Mining Workshops (SADM 2010), December 2010, Sydney, Australia.
dc.relation.haspart [Publication 2]: Oskar Ahlgren, Pekka Malo, Ankur Sinha, Jyrki Wallenius, and Pekka Korhonen. A Dimensionality Reduction Approach to Semantic Document Classification. Proceedings of the 2nd Workshop on Semantic Personalized Information Management: Retrieval and Recommendation (SPIM 2011), in conjunction with the 10th International Semantic Web Conference (ISWC 2011), October 2011, Bonn, Germany.
dc.relation.haspart [Publication 3]: Pekka Malo, Ankur Sinha, Pyry Siitari, Oskar Ahlgren, and Iivari Lappalainen. Learning the Roles of Directional Expressions and Domain Concepts in Financial News Analysis. Proceedings of the IEEE International Conference on Data Mining Workshops (SENTIRE 2013), December 2013, Dallas, U.S.A.
dc.relation.haspart [Publication 4]: Pyry Takala, Pekka Malo, Ankur Sinha, and Oskar Ahlgren. Gold-standard for Topic-Specific Sentiments in Economic Texts. Proceedings of the 9th edition of the Language Resources and Evaluation Conference (LREC 2014), May 2014, Reykjavik, Iceland.
dc.relation.haspart [Publication 5]: Oskar Ahlgren. Research On Sentiment Analysis: The First Decade. Forthcoming. DOI: 10.1109/ICDMW.2016.0131
dc.relation.haspart [Publication 6]: Oskar Ahlgren, Bikesh Upreti, Pekka Malo, and Ankur Sinha. Knowledge-driven Approaches for Financial News Analysis. Unpublished.
dc.subject.other Economics en
dc.title Semantic Content Filtering and Sentiment Analysis for Financial News en
dc.type G5 Artikkeliväitöskirja fi
dc.contributor.school Kauppakorkeakoulu fi
dc.contributor.school School of Business en
dc.contributor.department Tieto- ja palvelutalouden laitos fi
dc.contributor.department Department of Information and Service Economy en
dc.subject.keyword data mining en
dc.subject.keyword document filtering en
dc.subject.keyword text analysis en
dc.subject.keyword sentiment detection en
dc.subject.keyword sentiment corpora en
dc.identifier.urn URN:ISBN:978-952-60-7096-4
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (article-based) en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.contributor.supervisor Malo, Pekka, Assoc. Prof., Aalto University, Department of Information and Service Economy, Finland
dc.opn Michalowski, Wojtek, Prof., Telfer School of Management, Canada
dc.subject.helecon tietämyksenhallinta
dc.subject.helecon uutiset
dc.subject.helecon viestintä
dc.subject.helecon tiedonlouhinta

