Data science for social good - Theory and applications in epidemics, polarization, and fair clustering

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Xiao, Han
dc.date.accessioned 2020-08-25T09:00:05Z
dc.date.available 2020-08-25T09:00:05Z
dc.date.issued 2020
dc.identifier.isbn 978-952-60-3990-9 (electronic)
dc.identifier.isbn 978-952-60-3989-3 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/46243
dc.description.abstract Technical innovations have transformed our lives fundamentally, in both positive and negative ways. In this thesis, we look at the negative side. We identify three problems to tackle, namely epidemics, online polarization, and bias in automatic decision-making processes, and address them with data-driven methods. Thanks to globalization, our world is more interconnected than before. While trade and exchange of ideas are happening at an unprecedented rate, the rapid spread of disease is happening globally, as evidenced by the COVID-19 pandemic. To contain epidemics effectively, it is crucial to identify as many infected persons as possible. In practice, however, it is almost impossible to obtain complete information on who is infected. We study this challenge in the context of social networks, where a disease spreads via network edges. Specifically, we assume only a subset of all infections is observed and we seek to infer who else is infected. Furthermore, we consider two different settings: (1) a temporal setting, in which infection times are also observed, and (2) a probabilistic setting, in which an infection probability is produced for each individual. Social-media platforms enable people to share and access information easily. Meanwhile, flawed designs in these platforms contribute to the formation of online polarization. As a result, people are unlikely to adopt new ideas that differ from their beliefs, which finally leads to a polarized society. To tackle online polarization, we argue that it is important to discover who is involved in the polarization. We consider a problem setting on social networks in which the interaction between two persons is either friendly or antagonistic. Furthermore, given some seed nodes that represent different sides of a polarized subgraph, we seek to find the polarized subgraph that is relevant to the seeds. 
Finding such structures can be used to understand the nature of polarization, and to mitigate the degree of polarization. Machine-learning algorithms allow the automation of many decision-making processes, for example, deciding whether to grant a loan to a loan applicant. However, unfair results that favor one demographic group (e.g., male) over another (e.g., female) have been observed. The unfair outcomes may further affect the well-being of the mistreated groups. In this thesis, we focus on the task of data clustering, which has applications in infrastructure design and online social media. We discuss potential fairness issues in existing clustering algorithms that are designed to be fair. As a result, we propose a new fair clustering formulation that captures a novel fairness notion. For all proposed problems, we study their complexity and design algorithms whose theoretical performance is analyzed. We evaluate the efficacy of all proposed algorithms in both synthetic and real-world settings. en
dc.format.extent 87 + app. 55
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 118/2020
dc.relation.haspart [Publication 1]: Xiao, Han; Rozenshtein, Polina; Tatti, Nikolaj; Gionis, Aristides. Reconstructing a cascade from temporal observations. In Proceedings of the 2018 SIAM International Conference on Data Mining, pages 666–674, May 2018. Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201902251793. DOI: 10.1137/1.9781611975321.75
dc.relation.haspart [Publication 2]: Xiao, Han; Aslay, Çigdem; Gionis, Aristides. Robust cascade reconstruction by Steiner tree sampling. In 2018 IEEE International Conference on Data Mining, pages 637–646, November 2018. Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201901141105. DOI: 10.1109/ICDM.2018.00079
dc.relation.haspart [Publication 3]: Xiao, Han; Ordozgoiti, Bruno; Gionis, Aristides. Searching for polarization in signed graphs: a local spectral approach. In The World Wide Web Conference, pages 362–372, April 2020. DOI: 10.1145/3366423.3380121
dc.relation.haspart [Publication 4]: Xiao, Han; Ordozgoiti, Bruno; Gionis, Aristides. A distance-based approach to fair clustering. Submitted for publication, July 2020
dc.subject.other Computer science en
dc.title Data science for social good - Theory and applications in epidemics, polarization, and fair clustering en
dc.type G5 Artikkeliväitöskirja fi
dc.contributor.school Perustieteiden korkeakoulu fi
dc.contributor.school School of Science en
dc.contributor.department Tietotekniikan laitos fi
dc.contributor.department Department of Computer Science en
dc.subject.keyword data mining en
dc.subject.keyword graph mining en
dc.subject.keyword social network analysis en
dc.subject.keyword epidemics en
dc.subject.keyword fairness en
dc.subject.keyword online polarization en
dc.subject.keyword algorithm design en
dc.subject.keyword approximation algorithm en
dc.identifier.urn URN:ISBN:978-952-60-3990-9
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (article-based) en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.contributor.supervisor Gionis, Aristides, Adj. Prof., Aalto University, Department of Computer Science, Finland
dc.opn Koutra, Danai, Asst. Prof., University of Michigan, USA
dc.contributor.lab Data Mining group en
dc.rev Tong, Hanghang, Assoc. Prof., University of Illinois, USA
dc.rev Orecchia, Lorenzo, Asst. Prof., University of Chicago, USA
dc.date.defence 2020-09-25
local.aalto.acrisexportstatus checked 2020-10-19_1211
local.aalto.infra Science-IT
local.aalto.formfolder 2020_08_25_klo_11_46
local.aalto.archive yes
