### Browsing by Author "Cui, Tianyu"

Now showing 1 - 11 of 11


**Deconfounded Representation Similarity for Comparison of Neural Networks** (Morgan Kaufmann Publishers, 2022)
Cui, Tianyu; Kumar, Yogesh; Marttinen, Pekka; Kaski, Samuel; Department of Computer Science; Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A.; Probabilistic Machine Learning; Professorship Kaski Samuel; Professorship Marttinen P.; Computer Science Professors; Computer Science - Artificial Intelligence and Machine Learning (AIML); Finnish Center for Artificial Intelligence, FCAI; Helsinki Institute for Information Technology (HIIT)

Similarity metrics such as representational similarity analysis (RSA) and centered kernel alignment (CKA) have been used to understand neural networks by comparing their layer-wise representations. However, these metrics are confounded by the population structure of data items in the input space, leading to inconsistent conclusions about the *functional* similarity between neural networks, such as spuriously high similarity of completely random neural networks and inconsistent domain relations in transfer learning. We introduce a simple and generally applicable fix that adjusts for the confounder with covariate-adjustment regression, which improves the ability of CKA and RSA to reveal functional similarity while retaining the intuitive invariance properties of the original similarity measures. We show that deconfounding the similarity metrics increases the resolution of detecting functionally similar neural networks across domains.
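As a concrete illustration of the quantities involved, linear CKA and one possible covariate-adjustment step can be sketched in a few lines of NumPy. The `deconfounded_cka` helper below is a hypothetical simplification (regressing each centered Gram matrix on the confounder's Gram matrix and comparing residuals), not the authors' exact procedure:

```python
import numpy as np

def center(G):
    # Double-center a Gram matrix (the centering used by CKA/HSIC).
    n = G.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ G @ H

def linear_cka(X, Y):
    # Linear CKA between two representation matrices (examples x features).
    Kx, Ky = center(X @ X.T), center(Y @ Y.T)
    return (Kx * Ky).sum() / (np.linalg.norm(Kx) * np.linalg.norm(Ky))

def deconfounded_cka(X, Y, C):
    # Hypothetical covariate adjustment: regress each centered Gram matrix on
    # the confounder's Gram matrix and compare the residuals instead.
    Kx, Ky, Kc = center(X @ X.T), center(Y @ Y.T), center(C @ C.T)
    kc = Kc.ravel()
    rx = Kx.ravel() - (Kx.ravel() @ kc) / (kc @ kc) * kc
    ry = Ky.ravel() - (Ky.ravel() @ kc) / (kc @ kc) * kc
    return (rx @ ry) / (np.linalg.norm(rx) * np.linalg.norm(ry))

rng = np.random.default_rng(0)
X, Y, C = (rng.normal(size=(50, 10)) for _ in range(3))
```

Identical representations give CKA of exactly 1; the deconfounded score can be negative, since the residuals are compared by an (uncentered) correlation.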
Moreover, in real-world applications, deconfounding improves the consistency between CKA and domain similarity in transfer learning, and increases the correlation between CKA and model out-of-distribution accuracy similarity.

**Deep Bayesian Experimental Design for Drug Discovery** (2024-09-20)
Masood, Muhammad Arslan; Cui, Tianyu; Kaski, Samuel; Department of Computer Science; Clevert, Djork-Arné; Wand, Michael; Schmidhuber, Jürgen; Malinovská, Kristína; Tetko, Igor V.; Probabilistic Machine Learning; Professorship Kaski Samuel; Computer Science Professors; Computer Science - Artificial Intelligence and Machine Learning (AIML); Finnish Center for Artificial Intelligence, FCAI; Helsinki Institute for Information Technology (HIIT)

In drug discovery, prioritizing compounds for testing is an important task. Active learning can assist in this endeavor by prioritizing molecules for label acquisition based on their estimated potential to enhance in-silico models. However, in specialized cases like toxicity modeling, limited dataset sizes can hinder the effective training of modern neural networks for representation learning and active learning. In this study, we leverage a transformer-based BERT model pretrained on millions of SMILES strings to perform active learning. Additionally, we explore different acquisition functions to assess their compatibility with the pretrained BERT model. Our results demonstrate that pretrained models enhance active learning outcomes. Furthermore, we observe that active learning selects a higher proportion of positive compounds than random acquisition, an important advantage, especially when dealing with imbalanced toxicity datasets. Through a comparative analysis, we find that both the BALD and EPIG acquisition functions outperform random acquisition, with EPIG exhibiting slightly superior performance over BALD.
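The BALD score mentioned above has a compact form: it is the mutual information between the model parameters and the label, estimated as the entropy of the mean prediction minus the mean entropy over stochastic forward passes. A minimal NumPy sketch (array shapes and the toy probabilities are illustrative, not tied to any particular model):

```python
import numpy as np

def entropy(p, eps=1e-12):
    # Shannon entropy of categorical distributions along the last axis.
    return -(p * np.log(p + eps)).sum(axis=-1)

def bald(probs):
    # probs: (n_mc_samples, n_candidates, n_classes) predictive probabilities,
    # e.g. from stochastic forward passes of a Bayesian or MC-dropout model.
    # BALD = H[mean prediction] - mean H[per-sample prediction].
    return entropy(probs.mean(axis=0)) - entropy(probs).mean(axis=0)

# Candidate 0: all passes agree; candidate 1: passes disagree strongly.
probs = np.empty((8, 2, 2))
probs[:, 0] = [0.95, 0.05]
probs[:, 1, 0] = np.linspace(0.05, 0.95, 8)
probs[:, 1, 1] = 1.0 - probs[:, 1, 0]
scores = bald(probs)
```

The disagreeing candidate gets the higher score, which is exactly the behavior an acquisition function exploits when choosing which compound to label next.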
In summary, our study highlights the effectiveness of active learning in conjunction with pretrained models to tackle the problem of data scarcity.

**Gene–gene interaction detection with deep learning** (Nature Publishing Group, 2022-11-12)
Cui, Tianyu; El Mekkaoui, Khaoula; Reinvall, Jaakko; Havulinna, Aki S.; Marttinen, Pekka; Kaski, Samuel; Department of Computer Science; Probabilistic Machine Learning; Professorship Kaski Samuel; Computer Science Professors; Computer Science - Artificial Intelligence and Machine Learning (AIML); Professorship Marttinen P.; Finnish Center for Artificial Intelligence, FCAI; Helsinki Institute for Information Technology (HIIT)

The extent to which genetic interactions affect observed phenotypes is generally unknown, because current interaction detection approaches consider only simple interactions between the top SNPs of genes. We introduce an open-source framework for increasing the power of interaction detection by considering all SNPs within a selected set of genes and complex interactions between them, beyond the currently considered multiplicative relationships. In brief, the relation between SNPs and a phenotype is captured by a neural network, and interactions are quantified by Shapley scores between hidden nodes, which are gene representations that optimally combine information from the corresponding SNPs.
Additionally, we design a permutation procedure tailored for neural networks to assess the significance of interactions. It outperformed existing alternatives on simulated datasets with complex interactions, and in a cholesterol study on the UK Biobank it detected nine interactions that replicated on an independent FINRISK dataset.

**Handling missing values with hybrid approaches in supervised setting** (2022-03-21)
Qian, Zhiheng; Cui, Tianyu; Perustieteiden korkeakoulu; Marttinen, Pekka

Missing data has become an increasingly important issue for training deep neural networks, especially in the case of large-scale datasets. It can be categorized into three groups: data missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Several state-of-the-art (SOTA) data imputation algorithms have been proposed to improve the handling of incomplete datasets. However, the effect of the label distribution in supervised learning has been overlooked. Recent studies show that it is important to learn the label distribution conditionally on the missing data, which leads to substantial performance gains. The aim of this thesis is to implement a hybrid approach that combines a generative deep latent variable model (DLVM) and a discriminative model to impute MAR data with importance-weighted variational inference, including three training strategies that outperform zero imputation. Furthermore, we introduce the label distribution into the hybrid model, which consists of a DLVM and a convolutional neural network (CNN), in the context of MNAR image data.
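The three missingness mechanisms named here differ only in what the missingness probability is allowed to depend on, which a short simulation makes concrete (the column choices and the 0.5/0.1 rates below are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 2))

# MCAR: missingness is independent of all data values.
mcar = rng.random(X.shape) < 0.3

# MAR: missingness of column 1 depends only on the observed column 0.
mar = np.zeros(X.shape, dtype=bool)
mar[:, 1] = rng.random(len(X)) < np.where(X[:, 0] > 0, 0.5, 0.1)

# MNAR: missingness of column 1 depends on the (unobserved) value itself.
mnar = np.zeros(X.shape, dtype=bool)
mnar[:, 1] = rng.random(len(X)) < np.where(X[:, 1] > 0, 0.5, 0.1)
```

MNAR is the hardest case precisely because the dependence is on a value that is never observed, so it cannot be adjusted away using the observed columns alone.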
The experiments show that the joint model achieves strong prediction accuracy and imputation results on the MNIST dataset.

**Incorporating functional summary information in Bayesian neural networks using a Dirichlet process likelihood approach** (JMLR, 2023)
Raj, Vishnu; Cui, Tianyu; Heinonen, Markus; Marttinen, Pekka; Department of Computer Science; Ruiz, Francisco; Dy, Jennifer; van de Meent, Jan-Willem; Professorship Marttinen P.; Probabilistic Machine Learning; Professorship Kaski Samuel; Professorship Lähdesmäki Harri; Computer Science Professors; Computer Science - Artificial Intelligence and Machine Learning (AIML)

Bayesian neural networks (BNNs) can account for both aleatoric and epistemic uncertainty. However, in BNNs the priors are often specified over the weights, which rarely reflects true prior knowledge in large and complex neural network architectures. We present a simple approach to incorporating prior knowledge in BNNs based on external summary information about the predicted classification probabilities for a given dataset. The available summary information is incorporated as augmented data and modeled with a Dirichlet process, and we derive the corresponding Summary Evidence Lower BOund. The approach is founded on Bayesian principles, and all hyperparameters have a proper probabilistic interpretation. We show how the method can inform the model about task difficulty and class imbalance.
Extensive experiments show that, with negligible computational overhead, our method parallels and in many cases outperforms popular alternatives in accuracy, uncertainty calibration, and robustness against corruptions, with both balanced and imbalanced data.

**Informative Bayesian Neural Network Priors for Weak Signals** (International Society for Bayesian Analysis, 2021)
Cui, Tianyu; Havulinna, Aki S.; Marttinen, Pekka; Kaski, Samuel; Department of Computer Science; Probabilistic Machine Learning; Professorship Kaski Samuel; Computer Science Professors; Computer Science - Artificial Intelligence and Machine Learning (AIML); Professorship Marttinen P.; Finnish Center for Artificial Intelligence, FCAI; Helsinki Institute for Information Technology (HIIT); Finnish Institute for Health and Welfare (THL)

Encoding domain knowledge into the prior over the high-dimensional weight space of a neural network is challenging but essential in applications with limited data and weak signals. Two types of domain knowledge are commonly available in scientific applications: (1) feature sparsity (the fraction of features deemed relevant); and (2) the signal-to-noise ratio, quantified, for instance, as the proportion of variance explained. We show how to encode both types of domain knowledge into the widely used Gaussian scale mixture priors with Automatic Relevance Determination. Specifically, we propose a new joint prior over the local (i.e., feature-specific) scale parameters that encodes knowledge about feature sparsity, and a Stein gradient optimization to tune the hyperparameters so that the distribution induced on the model's proportion of variance explained matches the prior distribution.
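The first type of knowledge, feature sparsity, can be encoded by making each feature's local scale small with high probability and large otherwise. The spike/slab scales and the 10% relevance fraction below are hypothetical choices for illustration, not the paper's joint prior:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_sparse_gsm_weights(n_features, p_relevant=0.1, slab=1.0,
                              spike=0.01, n_draws=20000):
    # Gaussian scale mixture with feature-specific (ARD-style) scales: a
    # feature is "relevant" with probability p_relevant and then gets the
    # large slab scale; otherwise it gets the tiny spike scale.
    relevant = rng.random((n_draws, n_features)) < p_relevant
    scales = np.where(relevant, slab, spike)
    return scales * rng.normal(size=(n_draws, n_features))

W = sample_sparse_gsm_weights(50)
# Under this prior, roughly p_relevant of the weights are non-negligible.
frac_large = (np.abs(W) > 0.1).mean()
```

Tuning `slab`, `spike`, and the global scale against a target proportion-of-variance-explained distribution is the part the paper handles with Stein gradient optimization; the sketch only shows the sparsity-encoding half.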
We show empirically that the new prior improves prediction accuracy compared to existing neural network priors on publicly available datasets and in a genetics application where signals are weak and sparse, often outperforming even computationally intensive cross-validation for hyperparameter tuning.

**Interaction detection by deep neural nets with application in genomics** (2019-09-01)
Marjakangas, Joona; Cui, Tianyu; Perustieteiden korkeakoulu; Hyvönen, Eero

**Interaction Detection with Probabilistic Deep Learning for Genetics** (Aalto University, 2023)
Cui, Tianyu; Marttinen, Pekka, Prof., Aalto University, Department of Computer Science, Finland; Tietotekniikan laitos; Department of Computer Science; Probabilistic Machine Learning; Perustieteiden korkeakoulu; School of Science; Kaski, Samuel, Prof., Aalto University, Department of Computer Science, Finland

Deep learning is an important machine learning tool in genetics because of its ability to model nonlinear relations between genotypes and phenotypes, such as genetic interactions, without any assumptions about the forms of the relations. However, current deep learning approaches are restricted in genetics applications by (i) the lack of well-calibrated uncertainty estimates for the model and (ii) the limited individual-level data accessible for model training. This thesis aims to design principled approaches that tackle these shortcomings in two relevant statistical genetics applications: gene-gene interaction detection and genotype-phenotype prediction. First, we focus on interaction detection with deep learning. Using Bayesian principles, we provide calibrated uncertainty estimates for interactions detected by deep learning, which are used to control the statistical errors of detected interactions, e.g., the false positive rate and false negative rate.
In genetic interaction detection applications, we design a novel neural network architecture that increases the power of detecting complex gene-gene interactions: it learns gene representations that aggregate information from all SNPs (single-nucleotide polymorphisms) of the genes being analyzed, and considers complex interaction forms between them beyond the currently considered multiplicative interactions. Moreover, we propose a new permutation procedure that gives calibrated null distributions of genetic interactions from the neural network. Second, we study deep learning models in the low-data regime. We improve deep learning prediction by incorporating domain knowledge with informative priors. Specifically, we design informative Gaussian scale mixture priors that explicitly encode prior beliefs about feature sparsity and the data signal-to-noise ratio into deep learning models, improving their accuracy on regression tasks, such as genotype-phenotype prediction, especially when only a small training set is available. Moreover, we study how to better understand the working mechanisms of low-data deep learning models that share knowledge across multiple similar domains, as in transfer learning, using representation similarity. We find that current representation similarities of deep learning models on multiple domains give counter-intuitive conclusions about their functional similarities due to the confounding effect of the input data structure. Therefore, we introduce a deconfounding step to adjust for the confounder, which improves the consistency of representation similarities with respect to the functional similarities of the models.

**Learning Global Pairwise Interactions with Bayesian Neural Networks** (IOS Press, 2020-08-24)
Cui, Tianyu; Marttinen, Pekka; Kaski, Samuel; Department of Computer Science; De Giacomo, Giuseppe; Catala, Alejandro; Dilkina, Bistra; Milano, Michela; Barro, Senen; Bugarin, Alberto; Lang, Jerome; Probabilistic Machine Learning; Professorship Kaski Samuel; Professorship Marttinen P.; Centre of Excellence in Computational Inference, COIN; Finnish Center for Artificial Intelligence, FCAI; Helsinki Institute for Information Technology (HIIT)

Estimating global pairwise interaction effects, i.e., the difference between the joint effect and the sum of the marginal effects of two input features, with properly quantified uncertainty, is centrally important in science applications. We propose a non-parametric probabilistic method for detecting interaction effects of unknown form. First, the relationship between the features and the output is modelled using a Bayesian neural network, capable of representing complex interactions and principled uncertainty. Second, interaction effects and their uncertainty are estimated from the trained model. For the second step, we propose an intuitive global interaction measure, the Bayesian Group Expected Hessian (GEH), which aggregates information about local interactions as captured by the Hessian. GEH provides a natural trade-off between type I and type II errors and, moreover, comes with theoretical guarantees ensuring that the estimated interaction effects and their uncertainty can be improved by training a more accurate BNN. The method empirically outperforms available non-probabilistic alternatives on simulated and real-world data.
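The intuition behind Hessian-based interaction measures is that a non-zero mixed second derivative of the model output with respect to two features signals an interaction between them, while purely additive functions have a zero cross-partial everywhere. A finite-difference sketch on toy functions (aggregating absolute cross-partials over points, which is only in the spirit of GEH, not the Bayesian measure itself):

```python
import numpy as np

def cross_partial(f, x, i, j, h=1e-4):
    # Central finite-difference estimate of d^2 f / (dx_i dx_j) at x.
    def shifted(di, dj):
        y = x.copy()
        y[i] += di * h
        y[j] += dj * h
        return f(y)
    return (shifted(1, 1) - shifted(1, -1)
            - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)

def mean_abs_interaction(f, points, i, j):
    # Aggregate local interaction strength over a set of data points.
    return float(np.mean([abs(cross_partial(f, p, i, j)) for p in points]))

interacting = lambda x: x[0] * x[1] + x[0] + x[1] ** 2  # x0:x1 interaction
additive = lambda x: np.sin(x[0]) + x[1] ** 2           # no interaction

rng = np.random.default_rng(4)
pts = rng.normal(size=(20, 2))
```

For `interacting` the cross-partial is identically 1, so the aggregate score is near 1; for `additive` it vanishes up to floating-point noise.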
Finally, we demonstrate its ability to detect interpretable interactions between higher-level features (at deeper layers of the neural network).

**Learning to share in multi-task deep learning** (2021-09-10)
Nguyen, Long; Cui, Tianyu; Perustieteiden korkeakoulu; Käpylä, Maarit

**Uncertainty in Recurrent Neural Network with Dropout** (2020-08-18)
Nguyen, Kha; Cui, Tianyu; Ajanki, Antti; Perustieteiden korkeakoulu; Marttinen, Pekka

Recurrent neural networks are a powerful tool for processing temporal data. However, assessing prediction uncertainty from recurrent models has proven challenging. This thesis evaluates the validity of uncertainty estimates obtained from recurrent models using dropout. Traditional neural network training optimises the data likelihood; to obtain model and predictive uncertainty, we instead need to approximate the model posterior. The model posterior is usually intractable, so we employ dropout-based approaches, in the form of variational Bayesian Monte Carlo estimation, to approximate the learning objective. This technique is applied to the existing recurrent neural network benchmark MIMIC-III. The thesis shows that Monte Carlo dropout applied to recurrent neural networks can give performance comparable to current state-of-the-art methods, together with meaningful uncertainty estimates for predictions.
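The Monte Carlo dropout idea used in the last item can be sketched in a few lines: keep dropout active at prediction time and read predictive uncertainty off the spread of repeated stochastic passes. The sketch below uses a toy feed-forward layer for brevity; in the recurrent setting the same dropout mask is typically reused across time steps, but the mean/std readout is identical:

```python
import numpy as np

rng = np.random.default_rng(5)

def dropout_pass(x, W1, W2, p_drop):
    # One stochastic forward pass with dropout left ON (inverted dropout).
    h = np.maximum(x @ W1, 0.0)
    if p_drop > 0.0:
        h *= (rng.random(h.shape) >= p_drop) / (1.0 - p_drop)
    return h @ W2

def mc_dropout_predict(x, W1, W2, p_drop=0.5, n_samples=200):
    # Predictive mean and model uncertainty from repeated stochastic passes.
    samples = np.stack([dropout_pass(x, W1, W2, p_drop)
                        for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)

W1 = rng.normal(size=(3, 16)) / np.sqrt(3)
W2 = rng.normal(size=(16, 1)) / np.sqrt(16)
x = rng.normal(size=(1, 3))
mean, std = mc_dropout_predict(x, W1, W2)
```

With `p_drop=0` every pass is identical and the reported uncertainty collapses to zero, which is exactly the failure mode of a standard deterministic network.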