Citation:
Rozenshtein , P , Tatti , N & Gionis , A 2017 , The Network-Untangling Problem: From Interactions to Activity Timelines . in M Ceci , J Hollmén , L Todorovski , C Vens & S Džeroski (eds) , Machine Learning and Knowledge Discovery in Databases : European Conference, ECML PKDD 2017, Skopje, Macedonia, September 18–22, 2017, Proceedings, Part I . vol. 1 , Lecture Notes in Computer Science , vol. 10534 , Springer , Cham , pp. 701-716 , European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases , Skopje , Macedonia, The Former Yugoslav Republic of , 18/09/2017 . https://doi.org/10.1007/978-3-319-71249-9_42
|
Abstract:
In this paper we study a problem of determining when entities are active based on their interactions with each other. More formally, we consider a set of entities V and a sequence of time-stamped edges E among the entities. Each edge (u,v,t)∈E denotes an interaction between entities u and v that takes place at time t. We view this input as a temporal network. We then assume a simple activity model in which each entity is active during a short time interval. An interaction (u, v, t) can be explained if at least one of u or v are active at time t. Our goal is to reconstruct the activity intervals, for all entities in the network, so as to explain the observed interactions. This problem, which we refer to as the network-untangling problem, can be applied to discover timelines of events from complex interactions among entities. We provide two formulations for the network-untangling problem: (i) minimizing the total interval length over all entities, and (ii) minimizing the maximum interval length. We show thatthe sum problem is NP-hard, while, surprisingly, the max problem can be solved optimally in linear time, using a mapping to 2-SAT. For the sum problem we provide efficient and effective algorithms based on realistic assumptions. Furthermore, we complement our study with an evaluation on synthetic and real-world datasets, which demonstrates the validity of our concepts and the good performance of our algorithms.
|