Browsing by Author "Bonchi, Francesco"
- Finding events in temporal networks: segmentation meets densest subgraph discovery
A1 Original article in a scientific journal (2019-01-01)
Rozenshtein, Polina; Bonchi, Francesco; Gionis, Aristides; Sozio, Mauro; Tatti, Nikolaj
In this paper, we study the problem of discovering a timeline of events in a temporal network. We model events as dense subgraphs that occur within intervals of network activity. We formulate the event discovery task as an optimization problem, where we search for a partition of the network timeline into k non-overlapping intervals, such that the intervals span subgraphs with maximum total density. The output is a sequence of dense subgraphs along with corresponding time intervals, capturing the most interesting events during the network lifetime. A naïve solution to our optimization problem has polynomial but prohibitively high running time. We adapt recent work on dynamic densest subgraph discovery and approximate dynamic programming to design a fast approximation algorithm. Next, to ensure richer structure, we adjust the problem formulation to encourage coverage of a larger set of nodes. This problem is NP-hard; however, we show that on static graphs a simple greedy algorithm leads to an approximate solution due to submodularity. We extend this greedy approach to temporal networks, but we lose the approximation guarantee in the process. Finally, we demonstrate empirically that our algorithms recover solutions of good quality.
- Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
A1 Original article in a scientific journal (2020-04-01)
Nanni, Mirco; Andrienko, Gennady; Barabási, Albert László; Boldrini, Chiara; Bonchi, Francesco; Cattuto, Ciro; Chiaromonte, Francesca; Comandé, Giovanni; Conti, Marco; Coté, Mark; Dignum, Frank; Dignum, Virginia; Domingo-Ferrer, Josep; Ferragina, Paolo; Giannotti, Fosca; Guidotti, Riccardo; Helbing, Dirk; Kaski, Kimmo; Kertesz, Janos; Lehmann, Sune; Lepri, Bruno; Lukowicz, Paul; Matwin, Stan; Jiménez, David Megías; Monreale, Anna; Morik, Katharina; Oliver, Nuria; Passarella, Andrea; Passerini, Andrea; Pedreschi, Dino; Pentland, Alex; Pianesi, Fabio; Pratesi, Francesca; Rinzivillo, Salvatore; Ruggieri, Salvatore; Siebes, Arno; Torra, Vicenç; Trasarti, Roberto; Van Den Hoven, Jeroen; Vespignani, Alessandro
The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the “phase 2” of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large-scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens’ privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens’ “personal data stores”, to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy-preserving level of granularity.
This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion, and in turn this enables both contact tracing and the early detection of outbreak hotspots on a more fine-grained geographic scale. The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates (if and when they want and for specific aims) with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
- Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
A1 Original article in a scientific journal (2021-11)
Nanni, Mirco; Andrienko, Gennady; Barabási, Albert László; Boldrini, Chiara; Bonchi, Francesco; Cattuto, Ciro; Chiaromonte, Francesca; Comandé, Giovanni; Conti, Marco; Coté, Mark; Dignum, Frank; Dignum, Virginia; Domingo-Ferrer, Josep; Ferragina, Paolo; Giannotti, Fosca; Guidotti, Riccardo; Helbing, Dirk; Kaski, Kimmo; Kertesz, Janos; Lehmann, Sune; Lepri, Bruno; Lukowicz, Paul; Matwin, Stan; Jiménez, David Megías; Monreale, Anna; Morik, Katharina; Oliver, Nuria; Passarella, Andrea; Passerini, Andrea; Pedreschi, Dino; Pentland, Alex; Pianesi, Fabio; Pratesi, Francesca; Rinzivillo, Salvatore; Ruggieri, Salvatore; Siebes, Arno; Torra, Vicenç; Trasarti, Roberto; Hoven, Jeroen van den; Vespignani, Alessandro
The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the “phase 2” of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large-scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens’ privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens’ “personal data stores”, to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy-preserving level of granularity.
This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion, and in turn this enables both contact tracing and the early detection of outbreak hotspots on a more fine-grained geographic scale. The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates (if and when they want and for specific aims) with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
- Relevance of temporal cores for epidemic spread in temporal networks
A1 Original article in a scientific journal (2020-07-27)
Ciaperoni, Martino; Galimberti, Edoardo; Bonchi, Francesco; Cattuto, Ciro; Gullo, Francesco; Barrat, Alain
Temporal networks are widely used to represent a vast diversity of systems, including in particular social interactions, and the spreading processes unfolding on top of them. The identification of structures playing important roles in such processes remains largely an open question, despite recent progress in the case of static networks. Here, we consider as candidate structures the recently introduced concept of span-cores: the span-cores decompose a temporal network into subgraphs of controlled duration and increasing connectivity, generalizing the core decomposition of static graphs. To assess the relevance of such structures, we explore the effectiveness of strategies aimed either at containing or maximizing the impact of a spread, based respectively on removing span-cores of high cohesiveness or duration to decrease the epidemic risk, or on seeding the process from such structures. The effectiveness of such strategies is assessed on a variety of empirical data sets and compared to baselines that use only static information on the centrality of nodes and static concepts of coreness, as well as to a baseline based on a temporal centrality measure. Our results show that the most stable and cohesive temporal cores indeed play an important role in epidemic processes on temporal networks, and that their nodes are likely to include influential spreaders.
- Towards Memory-Efficient Training for Extremely Large Output Spaces: Learning with 670k Labels on a Single Commodity GPU
A4 Article in conference proceedings (2023)
Schultheis, Erik; Babbar, Rohit
In classification problems with large output spaces (up to millions of labels), the last layer can require an enormous amount of memory. Using sparse connectivity would drastically reduce the memory requirements, but as we show below, applied naïvely it can result in much diminished predictive performance. Fortunately, we found that this can be mitigated by introducing an intermediate layer of intermediate size. We further demonstrate that one can constrain the connectivity of the sparse layer to be of constant fan-in, in the sense that each output neuron will have the exact same number of incoming connections, which allows for more efficient implementations, especially on GPU hardware. The CUDA implementation of our approach is provided at https://github.com/xmc-aalto/ecml23-sparse.
- Weak Supervision and Clustering-Based Sample Selection for Clinical Named Entity Recognition
A4 Article in conference proceedings (2023)
Sun, Wei; Ji, Shaoxiong; Denti, Tuulia; Moen, Hans; Kerro, Oleg; Rannikko, Antti; Marttinen, Pekka; Koskinen, Miika
One of the central tasks of medical text analysis is to extract and structure meaningful information from plain-text clinical documents. Named Entity Recognition (NER) is a sub-task of information extraction that involves identifying predefined entities in unstructured free text. Notably, NER models require large amounts of human-labeled data to train, but human annotation is costly and laborious and often requires medical training. Here, we aim to overcome the shortage of manually annotated data by introducing a training scheme for NER models that uses an existing medical ontology to assign weak labels to entities and provides enhanced domain-specific model adaptation with in-domain continual pretraining. Due to limited human annotation resources, we develop a specific module to collect a test dataset from the data lake that is more representative than a random selection. To validate our framework, we invite clinicians to annotate the test set. In this way, we construct two Finnish medical NER datasets based on clinical records retrieved from a hospital’s data lake and evaluate the effectiveness of the proposed methods. The code is available at https://github.com/VRCMF/HAM-net.git.