Community detection in complex networks: the role of node metadata

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Fortunato, Santo, Prof., Indiana University, USA
dc.contributor.advisor: Kivelä, Mikko, Assistant Prof., Aalto University, Department of Computer Science, Finland
dc.contributor.author: Hric, Darko
dc.contributor.department: Tietotekniikan laitos [fi]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.lab: Complex Systems Research Area [en]
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.school: School of Science [en]
dc.contributor.supervisor: Kaski, Kimmo, Prof., Aalto University, Department of Computer Science, Finland
dc.date.accessioned: 2017-10-13T09:02:53Z
dc.date.available: 2017-10-13T09:02:53Z
dc.date.defence: 2017-11-03
dc.date.issued: 2017
dc.description.abstract: Recently, it was recognized that problems lying between order and chaos require a new scientific language and new models to be developed. Network science has emerged as a promising interdisciplinary field studying the properties of all kinds of systems that emerge from the interactions of a large number of elements or constituents. A particularly interesting feature of complex networks is the presence of communities, or groups of nodes that have more connections among themselves than to the rest of the network. Communities provide insight into the structure of the whole system and the immediate environment of each node, such as circles of friends or functionally related genes, and they have also been shown to play a role in various processes on networks. For these reasons, numerous community detection algorithms have been proposed that take the network structure as input and return the communities to which the nodes belong. As the field of community detection matured, more scrutiny was applied to old and new algorithms. Researchers were no longer satisfied with good results on simple, almost toy examples; more proof was sought for the applicability of the algorithms in the real world. At the same time, larger and more complex network datasets were becoming available, in which the need to identify meso-scale structures was even greater. A straightforward way to test the algorithms is to compare their results with known node community assignments, which are taken to correspond to metadata labels on the nodes. In the first part of this dissertation, a large number of algorithms were tested on a large number of labeled networks from different domains. Weak correspondences between metadata and communities indicate that more care has to be taken when using metadata as community labels. The relationship between node metadata and communities is perhaps more complex than was earlier assumed, but this does not mean that it is absent.
The second part of this dissertation presents a novel approach for incorporating metadata into community detection without assuming their usefulness. This approach makes it possible to discriminate between metadata that are aligned with the community structure and those that are not. The third part of this dissertation proposes the use of the stochastic blockmodel for modeling the citation networks of journals. The model is able to capture the rich structures present in the data while being simple, intuitive and applicable to huge networks (millions of nodes and links). By splitting data spanning more than a hundred years into separate time windows, it was possible to track the evolution of science in time. Using the model presented in the second part of the dissertation, the usefulness of the classification of journals into subject categories as a predictor of citation flows was evaluated. [en]
dc.format.extent: 88 + app. 82
dc.format.mimetype: application/pdf [en]
dc.identifier.isbn: 978-952-60-7346-0 (electronic)
dc.identifier.isbn: 978-952-60-7347-7 (printed)
dc.identifier.issn: 1799-4942 (electronic)
dc.identifier.issn: 1799-4934 (printed)
dc.identifier.issn: 1799-4934 (ISSN-L)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/28111
dc.identifier.urn: URN:ISBN:978-952-60-7346-0
dc.language.iso: en [en]
dc.opn: Moreno, Yamir, Prof., University of Zaragoza, Spain
dc.publisher: Aalto University [en]
dc.publisher: Aalto-yliopisto [fi]
dc.relation.haspart: [Publication 1]: Darko Hric, Richard K. Darst, Santo Fortunato. Community detection in networks: Structural communities versus ground truth. Physical Review E, Volume 90, Issue 6, pages 062805, December 2014. DOI: 10.1103/PhysRevE.90.062805
dc.relation.haspart: [Publication 2]: Darko Hric, Tiago P. Peixoto, Santo Fortunato. Network Structure, Metadata, and the Prediction of Missing Nodes and Annotations. Physical Review X, Volume 6, Issue 3, pages 031038, September 2016. Fulltext at Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201610135072. DOI: 10.1103/PhysRevX.6.031038
dc.relation.haspart: [Publication 3]: Darko Hric, Kimmo Kaski, Mikko Kivelä. Stochastic Block Model Reveals the Map of Citation Patterns and Their Evolution in Time, submitted for peer review, May 2017.
dc.relation.ispartofseries: Aalto University publication series DOCTORAL DISSERTATIONS [en]
dc.relation.ispartofseries: 52/2017
dc.rev: Rosvall, Martin, Associate Prof., Umeå University, Sweden
dc.rev: Sinatra, Roberta, Assistant Prof., Central European University, Hungary
dc.subject.keyword: complex networks [en]
dc.subject.keyword: community detection [en]
dc.subject.keyword: citation networks [en]
dc.subject.other: Computer science [en]
dc.title: Community detection in complex networks: the role of node metadata [en]
dc.type: G5 Artikkeliväitöskirja [fi]
dc.type.dcmitype: text [en]
dc.type.ontasot: Doctoral dissertation (article-based) [en]
dc.type.ontasot: Väitöskirja (artikkeli) [fi]
local.aalto.archive: yes
local.aalto.formfolder: 2017_10_13_klo_09_32
Files
Original bundle
Name: isbn9789526073460.pdf
Size: 1.03 MB
Format: Adobe Portable Document Format