Browsing by Author "Marttinen, Pekka, Prof., Aalto University, Department of Computer Science, Finland"
Item: Interaction Detection with Probabilistic Deep Learning for Genetics (Aalto University, 2023)
Cui, Tianyu; Marttinen, Pekka, Prof., Aalto University, Department of Computer Science, Finland; Department of Computer Science (Tietotekniikan laitos); Probabilistic Machine Learning; School of Science (Perustieteiden korkeakoulu); Kaski, Samuel, Prof., Aalto University, Department of Computer Science, Finland

Deep learning is an important machine learning tool in genetics because it can model nonlinear relations between genotypes and phenotypes, such as genetic interactions, without assumptions about the form of those relations. However, current deep learning approaches are limited in genetics applications by (i) the lack of well-calibrated uncertainty estimates about the model and (ii) the limited amount of accessible individual-level data for model training. This thesis designs principled approaches to address these shortcomings in two statistical genetics applications: gene-gene interaction detection and genotype-phenotype prediction.

First, we focus on interaction detection with deep learning. Using Bayesian principles, we provide calibrated uncertainty estimates for interactions detected by neural networks, which are used to control the statistical errors, e.g., the false positive and false negative rates, of the detected interactions. For genetic interaction detection, we design a novel neural network architecture that increases the power to detect complex gene-gene interactions: it learns gene representations that aggregate information from all SNPs (single-nucleotide polymorphisms) of the genes being analyzed, and it considers interaction forms between them beyond the multiplicative interactions considered in current approaches. Moreover, we propose a new permutation procedure that yields calibrated null distributions of genetic interactions from the neural network.

Second, we study deep learning models in the low-data regime. We improve deep learning prediction by incorporating domain knowledge through informative priors. Specifically, we design informative Gaussian scale mixture priors that explicitly encode prior beliefs about feature sparsity and the signal-to-noise ratio of the data into deep learning models, which improves their accuracy on regression tasks, such as genotype-phenotype prediction, especially when only a small training set is available. Moreover, we use representation similarity to better understand the working mechanisms of low-data deep learning models that share knowledge across multiple similar domains, as in transfer learning. We find that current representation similarity measures, applied to deep learning models on multiple domains, yield counter-intuitive conclusions about the models' functional similarity because of the confounding effect of the input data structure. Therefore, we introduce a deconfounding step that adjusts for this confounder and improves the consistency of representation similarity with respect to the functional similarity of the models.
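To make the idea of an informative prior concrete, the following is a minimal sketch of one way a two-component Gaussian scale mixture prior could encode assumed feature sparsity and signal-to-noise ratio for the first layer of a neural network. The parameterization (slab variance tied to the assumed signal-to-noise ratio), the function names, and the NumPy-only sampling are illustrative assumptions, not the exact construction used in the thesis.

# Hedged sketch: a two-component Gaussian scale mixture prior over first-layer
# weights, parameterized by assumed feature sparsity and signal-to-noise ratio.
# Names and the SNR-to-variance convention are illustrative, not from the thesis.
import numpy as np

def gsm_prior_params(n_features, sparsity=0.9, snr=1.0, noise_var=1.0):
    """Map prior beliefs to mixture parameters.

    sparsity : assumed fraction of irrelevant input features (weights near zero)
    snr      : assumed signal-to-noise ratio of the data
    Returns mixture weights and the two component variances.
    """
    pi = np.array([sparsity, 1.0 - sparsity])      # P(irrelevant), P(relevant)
    # "Slab" variance chosen so the expected signal variance contributed by the
    # relevant features matches snr * noise_var (one simple convention).
    n_relevant = max(1, int(round((1.0 - sparsity) * n_features)))
    slab_var = snr * noise_var / n_relevant
    spike_var = 1e-6 * slab_var                    # near-zero "spike" component
    return pi, np.array([spike_var, slab_var])

def sample_prior_weights(n_features, n_hidden, rng, **beliefs):
    """Draw first-layer weights from the Gaussian scale mixture prior."""
    pi, variances = gsm_prior_params(n_features, **beliefs)
    # Each input feature is treated as relevant or irrelevant as a whole,
    # so all outgoing weights of a feature share the same scale.
    comp = rng.choice(2, size=n_features, p=pi)
    scales = np.sqrt(variances[comp])[:, None]
    return rng.standard_normal((n_features, n_hidden)) * scales

rng = np.random.default_rng(0)
W1 = sample_prior_weights(n_features=100, n_hidden=32, rng=rng,
                          sparsity=0.95, snr=2.0)
print(W1.shape, np.mean(np.abs(W1) > 0.01))       # most rows effectively pruned

In a Bayesian neural network this density would serve as the prior in posterior inference rather than a one-off sampler; the sketch only shows how sparsity and signal-to-noise beliefs translate into mixture parameters.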
Item: Natural Language Processing for Healthcare: Text Representation, Multitask Learning, and Applications (Aalto University, 2023)
Ji, Shaoxiong; Marttinen, Pekka, Prof., Aalto University, Department of Computer Science, Finland; Department of Computer Science (Tietotekniikan laitos); School of Science (Perustieteiden korkeakoulu); Marttinen, Pekka, Prof., Aalto University, Department of Computer Science, Finland

The emergence of deep learning algorithms in natural language processing has boosted the development of intelligent medical information systems.

First, this dissertation explores effective text encoding for clinical text. We propose a dilated convolutional attention network that captures complex medical patterns in long clinical notes by increasing the receptive field exponentially with the dilation size. Furthermore, we use embedding injection and gated information propagation in the medical note encoding module for better representation learning of lengthy clinical text. To capture the interaction between notes and codes, we explicitly model the underlying dependency between them and use textual descriptions of medical codes as external knowledge. We also adopt contextualized graph embeddings to learn contextual information and causal relationships between text mentions, such as drugs taken and adverse reactions. In addition, we conduct an empirical analysis of the effectiveness of transfer learning with language-model pretraining for clinical text encoding and medical code prediction. We develop a hierarchical encoding model that equips pretrained language models with the capacity to encode long clinical notes, and we further study the effect of pretraining in different domains and with different strategies. A comprehensive quantitative analysis shows that hierarchical encoding can, to some extent, capture interactions between distant words.

Second, this dissertation investigates the multitask learning paradigm and its applications to healthcare. Multitask learning, motivated by the way humans draw on previous tasks when learning a new one, makes full use of the information contained in each task and shares information between related tasks through common parameters. We adopt multitask learning for medical code prediction and demonstrate the benefits of leveraging multiple coding schemes. We design a recalibrated aggregation module that generates higher-quality, less noisy clinical document features in the shared modules of multitask networks. Finally, we use the task context to improve multitask learning for healthcare: a domain-adaptive pretrained model, pretrained directly on the target domain of clinical applications, learns shared representation modules, while hypernetwork-guided multitask heads provide task-specific predictors. Task embeddings, serving as task context, are used to generate the task-specific parameters with hypernetworks. Experiments show that the proposed hypernetwork-guided multitask learning method achieves better predictive performance and that semantic task information improves the generalizability of the task-conditioned multitask model.
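To illustrate the hypernetwork-guided multitask heads described above, here is a minimal PyTorch sketch in which a learned task embedding is mapped by a small hypernetwork to the weights and bias of that task's output layer, applied on top of a shared document representation. The class name TaskHypernetHead, the layer sizes, and the single-linear-head design are illustrative assumptions rather than the dissertation's exact architecture.

# Hedged sketch of a hypernetwork-conditioned multitask head. Shapes and names
# are illustrative assumptions, not the dissertation's exact model.
import torch
import torch.nn as nn

class TaskHypernetHead(nn.Module):
    """Generates a per-task classification layer from a task embedding."""
    def __init__(self, n_tasks, hidden_dim, label_dims, task_emb_dim=32):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.label_dims = label_dims                 # number of labels per task
        self._max_labels = max(label_dims)
        self.task_emb = nn.Embedding(n_tasks, task_emb_dim)   # task context
        # Hypernetwork: task embedding -> flattened (weight, bias) of the head.
        self.hyper = nn.Sequential(
            nn.Linear(task_emb_dim, 128),
            nn.ReLU(),
            nn.Linear(128, self._max_labels * (hidden_dim + 1)),
        )

    def forward(self, doc_repr, task_id):
        # doc_repr: (batch, hidden_dim) output of the shared document encoder.
        n_labels = self.label_dims[task_id]
        emb = self.task_emb(torch.tensor(task_id, device=doc_repr.device))
        params = self.hyper(emb)
        w = params[: self._max_labels * self.hidden_dim].view(self._max_labels,
                                                              self.hidden_dim)
        b = params[self._max_labels * self.hidden_dim :]
        return doc_repr @ w[:n_labels].t() + b[:n_labels]   # task-specific logits

# Usage with a placeholder shared-encoder output:
head = TaskHypernetHead(n_tasks=2, hidden_dim=768, label_dims=[50, 20])
doc_repr = torch.randn(4, 768)               # batch of 4 encoded documents
logits_task0 = head(doc_repr, task_id=0)     # (4, 50), e.g. one coding scheme
logits_task1 = head(doc_repr, task_id=1)     # (4, 20), e.g. another scheme

In this sketch the shared encoder is stood in for by a random tensor; in practice doc_repr would come from the domain-adaptive pretrained language model, and the task embeddings and hypernetwork would be trained jointly with the shared encoder across tasks.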