Browsing by Author "Huusari, Riikka"
Item Applying machine learning to metabolite identification: Comparing Robust Loss Functions (2022-05-06) Pham, Binh; Huusari, Riikka; Perustieteiden korkeakoulu; Käpylä, Maarit

Item Comparing interpretable machine learning methods used to phenotype bacterial antimicrobial resistance (2022-05-08) Merisaari, Taija; Huusari, Riikka; Sähkötekniikan korkeakoulu; Turunen, Markus

Item Connecting Secondary Metabolites and Biosynthetic Gene Clusters (2021-08-23) Oksanen, Minna; Huusari, Riikka; Perustieteiden korkeakoulu; Rousu, Juho
Connecting secondary metabolites and biosynthetic gene clusters has largely been performed using rule-based approaches which require prior knowledge and chemical understanding about the metabolites and biosynthetic gene clusters (BGCs). This work explores a scenario in which the two could be connected to each other using machine learning methods. Machine learning is considered as it has generalization ability to unseen data. The results show that there is some potential in machine learning methods when using a candidate set for the prediction task where a BGC/metabolite is directly predicted from a metabolite/BGC.

Item Cross-view kernel transfer (Elsevier Limited, 2022-09) Huusari, Riikka; Capponi, Cécile; Villoutreix, Paul; Kadri, Hachem; Department of Computer Science; Professorship Rousu Juho; Aix-Marseille Université
We consider the kernel completion problem with the presence of multiple views in the data. In this context the data samples can be fully missing in some views, creating missing columns and rows to the kernel matrices that are calculated individually for each view. We propose to solve the problem of completing the kernel matrices with Cross-View Kernel Transfer (CVKT) procedure, in which the features of the other views are transformed to represent the view under consideration.
The transformations are learned with kernel alignment to the known part of the kernel matrix, allowing for finding generalizable structures in the kernel matrix under completion. Its missing values can then be predicted with the data available in other views. We illustrate the benefits of our approach with simulated data, multivariate digits dataset and multi-view dataset on gesture classification, as well as with real biological datasets from studies of pattern formation in early Drosophila melanogaster embryogenesis.

Item Entangled Kernels - Beyond Separability (MICROTOME PUBL, 2021-01) Huusari, Riikka; Kadri, Hachem; Professorship Rousu Juho; Aix-Marseille Université; Department of Computer Science
We consider the problem of operator-valued kernel learning and investigate the possibility of going beyond the well-known separable kernels. Borrowing tools and concepts from the field of quantum computing, such as partial trace and entanglement, we propose a new view on operator-valued kernels and define a general family of kernels that encompasses previously known operator-valued kernels, including separable and transformable kernels. Within this framework, we introduce another novel class of operator-valued kernels called entangled kernels that are not separable. We propose an efficient two-step algorithm for this framework, where the entangled kernel is learned based on a novel extension of kernel alignment to operator-valued kernels.
We illustrate our algorithm with an application to supervised dimensionality reduction, and demonstrate its effectiveness with both artificial and real data for multi-output regression.

Item Learning interpretable predictive biomarkers from multi-omics data (2023-10-09) Paunio, Ellimari; Huusari, Riikka; Pusa, Taneli; Perustieteiden korkeakoulu; Rousu, Juho
Advancements in technologies that generate large-scale omics data and the development of machine learning methods to analyze this data provide new opportunities for the field of medicine, such as improved prevention, diagnosis and treatment of diseases through the application of multivariate biomarkers. Moreover, multivariate biomarkers offer opportunities for precision medicine where treatments can be tailored to the needs of individual patients. Multivariate biomarker discovery, which involves the prediction of clinical outcomes reproducibly using a small set of biomarkers, has emerged as a promising approach. However, from a machine learning perspective, the integration of multi-omics data to discover multi-omics biomarkers remains challenging. In addition, interpretability and explainability are key issues in the translation of models into clinical practice. A recently proposed group of kernel methods called sparse pre-image kernel machines has embedded feature selection and offers improved interpretability compared to traditional kernel methods. Another benefit for learning multi-omics biomarkers is that sparse pre-image kernel machines can be extended to multi-view learning. This thesis explores the application of sparse pre-image kernel machines to multivariate biomarker discovery using a multi-omics coronavirus disease 2019 data set. To study whether the stability of feature selection can be improved, this thesis couples a method known as stability selection with sparse pre-image kernel machines.
The stability of feature selection and model performance with the selected features are compared to two baseline methods, random forest and logistic regression. This thesis considers two types of feature selection pipelines for sparse pre-image kernel machines, where the first is a general grid search approach to select a level of regularization, and thus features. In the second pipeline, sparse pre-image kernel machines are combined with stability selection. Results show that stability selection improves the stability of the learned features significantly. In addition, the proposed multi-view approach learns a more balanced set of features compared to other methods in terms of learning features from both views. The findings of this thesis provide insights into the potential application of sparse pre-image kernel machines for the discovery of multi-omics biomarkers in complex diseases.

Item Molecular data representation for structured prediction (2022-04-15) Tran, Duong; Huusari, Riikka; Perustieteiden korkeakoulu; Käpylä, Maarit

Item Partial Trace Regression and Low-Rank Kraus Decomposition (MLRP, 2020) Kadri, Hachem; Ayache, Stéphane; Huusari, Riikka; Rakotomamonjy, Alain; Ralaivola, Liva; Department of Computer Science; Professorship Rousu Juho; Aix-Marseille Université; Université de Rouen; Criteo AI Lab
The trace regression model, a direct extension of the well-studied linear regression model, allows one to map matrices to real-valued outputs. We here introduce an even more general model, namely the partial-trace regression model, a family of linear mappings from matrix-valued inputs to matrix-valued outputs; this model subsumes the trace regression model and thus the linear regression model.
Borrowing tools from quantum information theory, where partial trace operators have been extensively studied, we propose a framework for learning partial trace regression models from data by taking advantage of the so-called low-rank Kraus representation of completely positive maps. We show the relevance of our framework with synthetic and real-world experiments conducted for both i) matrix-to-matrix regression and ii) positive semidefinite matrix completion, two tasks which can be formulated as partial trace regression problems.

Item A study of matrix-valued data in supervised machine learning problems (2023-05-19) Nguyen, Khac; Huusari, Riikka; Perustieteiden korkeakoulu; Korpi-Lagg, Maarit