Sampling networks by nodal attributes

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Murase, Yohsuke
dc.contributor.author: Jo, Hang Hyun
dc.contributor.author: Török, János
dc.contributor.author: Kertész, János
dc.contributor.author: Kaski, Kimmo
dc.contributor.department: RIKEN
dc.contributor.department: Department of Computer Science
dc.contributor.department: Budapest University of Technology and Economics
dc.contributor.department: Kaski Kimmo group
dc.date.accessioned: 2019-06-20T13:16:49Z
dc.date.available: 2019-06-20T13:16:49Z
dc.date.issued: 2019-05-15
dc.description: | openaire: EC/H2020/654024/EU//SoBigData
dc.description.abstract: In a social network, individuals or nodes connect to other nodes by choosing one of the channels of communication at a time to re-establish the existing social links. Since available data sets are usually restricted to a limited number of channels or layers, these autonomous decision-making processes by the nodes constitute the sampling of a multiplex network, leading to just one (though very important) example of sampling bias caused by the behavior of the nodes. We develop a general setting to gain insight into and understand the class of network sampling models in which the probability of sampling a link in the original network depends on the attributes h of its adjacent nodes. Assuming that the nodal attributes are independently drawn from an arbitrary distribution ρ(h) and that the sampling probability r(h_i, h_j) for a link ij of nodal attributes h_i and h_j is also arbitrary, we derive exact analytic expressions of the sampled network for such network characteristics as the degree distribution, degree correlation, and clustering spectrum. The properties of the sampled network turn out to be sums of quantities for the original network topology weighted by the factors stemming from the sampling. Based on our analysis, we find that the sampled network may have sampling-induced network properties that are absent in the original network, which implies the potential risk of a naive generalization of the results of the sample to the entire original network. We also consider the case in which neighboring nodes have correlated attributes, to show how to generalize our formalism for such sampling bias, and we find good agreement between the analytic results and the numerical simulations. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 1-10
dc.format.mimetype: application/pdf
dc.identifier.citation: Murase, Y, Jo, H H, Török, J, Kertész, J & Kaski, K 2019, 'Sampling networks by nodal attributes', Physical Review E, vol. 99, no. 5, 052304, pp. 1-10. https://doi.org/10.1103/PhysRevE.99.052304 [en]
dc.identifier.doi: 10.1103/PhysRevE.99.052304
dc.identifier.issn: 2470-0045
dc.identifier.issn: 2470-0053
dc.identifier.other: PURE UUID: e1e1058b-7edb-40c7-9f10-fb61e182abed
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/e1e1058b-7edb-40c7-9f10-fb61e182abed
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85065821880&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/34202783/PhysRevE.99.052304.pdf
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/38880
dc.identifier.urn: URN:NBN:fi:aalto-201906203946
dc.language.iso: en [en]
dc.publisher: American Physical Society
dc.relation: info:eu-repo/grantAgreement/EC/H2020/654024/EU//SoBigData
dc.relation.ispartofseries: Physical Review E [en]
dc.relation.ispartofseries: Volume 99, issue 5 [en]
dc.rights: openAccess [en]
dc.title: Sampling networks by nodal attributes [en]
dc.type: A1 Original article in a scientific journal (Alkuperäisartikkeli tieteellisessä aikakauslehdessä) [fi]
dc.type.version: publishedVersion
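
The sampling model summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the Erdős–Rényi original network, the uniform attribute distribution ρ(h), the product form r(h_i, h_j) = h_i h_j, and all function names are assumptions chosen only for illustration. Each node draws an attribute independently, and each link of the original network is kept with a probability that depends on the attributes of its two end nodes.

```python
import random


def erdos_renyi_edges(n, p, rng):
    """Hypothetical 'original' network: each pair (i, j) is linked with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


def sample_by_attributes(edges, attributes, r, rng):
    """Keep each link (i, j) independently with probability r(h_i, h_j)."""
    return [
        (i, j)
        for (i, j) in edges
        if rng.random() < r(attributes[i], attributes[j])
    ]


def mean_degree(edges, n):
    """Average degree of an undirected network with n nodes."""
    return 2 * len(edges) / n


def product_rule(h_i, h_j):
    """Illustrative sampling probability (an assumption, not taken from the paper)."""
    return h_i * h_j


rng = random.Random(42)
n = 400
original_edges = erdos_renyi_edges(n, 0.05, rng)

# Nodal attributes drawn independently from rho(h); uniform on [0, 1) is an assumption.
attributes = [rng.random() for _ in range(n)]

sampled_edges = sample_by_attributes(original_edges, attributes, product_rule, rng)

print("mean degree, original:", mean_degree(original_edges, n))
print("mean degree, sampled: ", mean_degree(sampled_edges, n))
```

With these illustrative choices the expected fraction of retained links is the average of h_i h_j over independent uniform attributes, which is 1/4, so the sampled mean degree should come out at roughly a quarter of the original one. Nodes with small h lose most of their links, which is one simple way the sampled degree distribution can differ from the original.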