Machine Learning for Networked Data

School of Science | Doctoral thesis (article-based) | Defence date: 2020-02-10
Pages
46 + app. 52
Aalto University publication series DOCTORAL DISSERTATIONS, 10/2020
The data arising in many important applications can be represented as networks. This network representation can be used to encode high-dimensional statistical relations in probabilistic graphical models (PGM). Network models also allow (deterministic) methods of discrete-time signal processing to be extended to networked data. This dissertation studies two fundamental problems arising in the processing of networked data. The first problem is semi-supervised learning: given the network structure and some labeled data points, one aims to learn a predictor for the labels of every data point. The second core problem is learning the network structure in a fully data-driven fashion. We approach this structure learning problem using a probabilistic model for the data, which results in a graphical model selection (GMS) problem.

By exploiting the underlying network structure of the data, it is possible to learn an accurate predictor from few labeled data points. This dissertation provides conditions on the available labels, relative to the network structure, under which accurate learning is possible by convex optimization methods. We apply the network Lasso, which is an instance of regularized risk minimization using the total variation as regularizer. The conditions are derived by characterizing the solutions of the network Lasso.

GMS methods learn a network structure based on the statistical relations between data points, which are modeled as random variables. A key challenge in the application of GMS methods is a precise understanding of the number of data points required for accurate GMS. This dissertation characterizes the required sample size for zero-mean Gaussian random processes.
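
As an illustration of the network Lasso idea described in the abstract (regularized risk minimization with the total variation as regularizer), the following minimal sketch evaluates such an objective on a toy graph. The squared-error loss, the chain graph, the edge weights, the labels, and all function names are assumptions made for this example only; they are not taken from the dissertation.

```python
import numpy as np

def total_variation(x, edges, weights):
    """Weighted total variation: sum over edges of W_ij * |x_i - x_j|."""
    return sum(w * abs(x[i] - x[j]) for (i, j), w in zip(edges, weights))

def network_lasso_objective(x, labels, edges, weights, lam):
    """Squared error on the labeled nodes plus lam times the total variation.

    The squared-error loss is an assumption of this sketch; other losses
    can be plugged in the same way.
    """
    fit = sum((x[i] - y) ** 2 for i, y in labels.items())
    return fit + lam * total_variation(x, edges, weights)

# Hypothetical toy data: a chain of 6 nodes forming two clusters
# (nodes 0-2 and nodes 3-5) joined by one weak edge.
# Only nodes 0 and 5 carry labels.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
weights = [1.0, 1.0, 0.1, 1.0, 1.0]
labels = {0: 1.0, 5: -1.0}

# Two candidate graph signals that both fit the two labels exactly.
piecewise_constant = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
linear_ramp = np.linspace(1.0, -1.0, 6)

obj_clustered = network_lasso_objective(
    piecewise_constant, labels, edges, weights, lam=1.0)
obj_ramp = network_lasso_objective(
    linear_ramp, labels, edges, weights, lam=1.0)

print(f"clustered signal: {obj_clustered:.2f}")
print(f"linear ramp:      {obj_ramp:.2f}")
```

Both candidate signals match the two labels, but the total variation term favors the signal that is constant within each cluster and changes only across the weakly weighted edge. This is the intuition behind learning from few labels when the label signal follows the cluster structure of the network.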
Supervising professor
Jung, Alexander, Prof., Aalto University, Department of Computer Science, Finland
Thesis advisor
Jung, Alexander, Prof., Aalto University, Department of Computer Science, Finland
Keywords
networked data, graphical model selection, semi-supervised learning
Other note
  • [Publication 1]: Nguyen Q. Tran, A. Jung. On the Sample Complexity of Graphical Model Selection for Non-Stationary Samples. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 6314-6317, Apr 2018.
  • [Publication 2]: Nguyen Tran, O. Abramenko, A. Jung. On the Sample Complexity of Graphical Model Selection from Non-Stationary Samples. IEEE Transactions on Signal Processing, 2020, volume 68, pages 17-32.
    DOI: 10.1109/TSP.2019.2956687
  • [Publication 3]: A. Jung, Nguyen Tran, A. Mara. When is Network Lasso Accurate? Frontiers in Applied Mathematics and Statistics, 2018, Volume 3, Article 28.
    DOI: 10.3389/fams.2017.00028
  • [Publication 4]: H. Ambos, N. Tran, A. Jung. Classifying Big Data Over Networks Via The Logistic Network Lasso. In Proceedings of the 52nd Asilomar Conference on Signals, Systems, and Computers, 2018, pages 855-858.
    DOI: 10.1109/ACSSC.2018.8645260
  • [Publication 5]: Alexander Jung, Nguyen Tran. Localized Linear Regression in Networked Data. IEEE Signal Processing Letters, 2019, volume 26, number 7, pages 1090-1094.
    DOI: 10.1109/LSP.2019.2918933