Browsing by Author "Cichonska, Anna"

Item Computational-experimental approach to drug-target interaction mapping: A case study on kinase inhibitors (2017-08-01)
Cichonska, Anna; Ravikumar, Balaguru; Parri, Elina; Timonen, Sanna; Pahikkala, Tapio; Airola, Antti; Wennerberg, Krister; Rousu, Juho; Aittokallio, Tero; Department of Computer Science; Professorship Rousu Juho; Helsinki Institute for Information Technology (HIIT); University of Helsinki; University of Turku
Due to the relatively high costs and labor required for experimental profiling of the full target space of chemical compounds, various machine learning models have been proposed as cost-effective means to advance this process in terms of predicting the most potent compound-target interactions for subsequent verification. However, most of the model predictions lack direct experimental validation in the laboratory, making their practical benefits for drug discovery or repurposing applications largely unknown. Here, we therefore introduce and carefully test a systematic computational-experimental framework for the prediction and pre-clinical verification of drug-target interactions using a well-established kernel-based regression algorithm as the prediction model. To evaluate its performance, we first predicted unmeasured binding affinities in a large-scale kinase inhibitor profiling study, and then experimentally tested 100 compound-kinase pairs. The relatively high correlation of 0.77 (p < 0.0001) between the predicted and measured bioactivities supports the potential of the model for filling the experimental gaps in existing compound-target interaction maps. Further, we subjected the model to a more challenging task of predicting target interactions for a new candidate drug compound that lacks prior binding profile information. As a specific case study, we used tivozanib, an investigational VEGF receptor inhibitor with a currently unknown off-target profile.
Among 7 kinases with high predicted affinity, we experimentally validated 4 new off-targets of tivozanib, namely the Src-family kinases FRK and FYN A, the non-receptor tyrosine kinase ABL1, and the serine/threonine kinase SLK. Our subsequent experimental validation protocol effectively avoids any possible information leakage between the training and validation data, and therefore enables rigorous model validation for practical applications. These results demonstrate that the kernel-based modeling approach offers practical benefits for probing novel insights into the mode of action of investigational compounds, and for the identification of new target selectivities for drug repurposing applications.

Item Discovery of Mycobacterium tuberculosis gene expression biomarkers for drug therapy response (2015-10-19)
Kuittinen, Iitu; Cichonska, Anna; Sähkötekniikan korkeakoulu; Rousu, Juho
Tuberculosis (TB), caused by Mycobacterium tuberculosis (M.tb) in humans, remains a major worldwide medical challenge, affecting approximately one-third of the world’s population and killing 1.5 million people every year. New drug targets and biomarkers for treatment response are urgently needed to accelerate the development of new therapeutics and to help identify hard-to-treat patients. Until now, the possibility of using M.tb gene expression signatures as such biomarkers has not been systematically explored. This thesis models disease manifestations and treatment responses during the first 2 weeks of the standard drug therapy using whole-genome gene expression data of sputum M.tb. M.tb bacilli in sputum are dominated by a population that demonstrates phenotypic tolerance to the currently used drugs, thus making the adaptation mechanisms of this population critical to determine.
For modeling purposes, advanced machine learning methods are used, including Principal Component Analysis, stability selection, Support Vector Machine, lasso regression, and time-course differential expression analysis based on Gaussian Processes. The main contributions of this work are as follows: 1) it provides evidence of strong associations between M.tb transcriptional patterns and patient treatment response; 2) it identifies specific mRNA signatures that predict disease severity (e.g., chest X-ray score) and early treatment success (e.g., time to positivity and TB status at week 8) with success rates of at least 89%; 3) it indicates the existence of two main patterns of mycobacterial gene expression response to early treatment; and 4) it establishes a computational framework that is well-suited for mining gene expression signatures predictive of a clinical outcome. Although the identified biomarkers require further validation, they serve as potential targets for future drug and biomarker development. Overall, the results propose a novel biomarker discovery strategy.

Item Kernel-based machine learning approaches to drug-protein interaction prediction (2016-12-22)
Julkunen, Heli; Cichonska, Anna; Sähkötekniikan korkeakoulu; Turunen, Markus

Item Learning with multiple pairwise kernels for drug bioactivity prediction (2018-07-01)
Cichonska, Anna; Pahikkala, Tapio; Szedmak, Sandor; Julkunen, Heli; Airola, Antti; Heinonen, Markus; Aittokallio, Tero; Rousu, Juho; Department of Computer Science; Professorship Rousu Juho; Helsinki Institute for Information Technology (HIIT); Professorship Lähdesmäki Harri; Centre of Excellence in Molecular Systems Immunology and Physiology Research Group, SyMMys; University of Turku; Aalto University
Motivation: Many inference problems in bioinformatics, including drug bioactivity prediction, can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g.
drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and especially multiple kernel learning (MKL) offers promising benefits as it enables integrating various types of complex biomedical information sources in the form of kernels, along with learning their importance for the prediction task. However, the immense size of pairwise kernel spaces remains a major bottleneck, making the existing MKL algorithms computationally infeasible even for a small number of input pairs. Results: We introduce pairwiseMKL, the first method for time- and memory-efficient learning with multiple pairwise kernels. pairwiseMKL first determines the mixture weights of the input pairwise kernels, and then learns the pairwise prediction function. Both steps are performed efficiently without explicit computation of the massive pairwise matrices, therefore making the method applicable to solving large pairwise learning problems. We demonstrate the performance of pairwiseMKL in two related tasks of quantitative drug bioactivity prediction using up to 167 995 bioactivity measurements and 3120 pairwise kernels: (i) prediction of anticancer efficacy of drug compounds across a large panel of cancer cell lines; and (ii) prediction of target profiles of anticancer compounds across their kinome-wide target spaces.
We show that pairwiseMKL provides accurate predictions using sparse solutions in terms of selected kernels, and therefore it also automatically identifies data sources relevant for the prediction problem.

Item Leveraging multi-way interactions for systematic prediction of pre-clinical drug combination effects (Nature Publishing Group, 2020-12-01)
Julkunen, Heli; Cichonska, Anna; Gautam, Prson; Szedmak, Sandor; Douat, Jane; Pahikkala, Tapio; Aittokallio, Tero; Rousu, Juho; Department of Computer Science; Professorship Rousu Juho; Helsinki Institute for Information Technology (HIIT); Computer Science - Computational Life Sciences (CSLife); University of Helsinki; Department of Computer Science; University of Turku; Aalto University
We present comboFM, a machine learning framework for predicting the responses of drug combinations in pre-clinical studies, such as those based on cell lines or patient-derived cells. comboFM models the cell context-specific drug interactions through higher-order tensors, and efficiently learns latent factors of the tensor using powerful factorization machines. The approach enables comboFM to leverage information from previous experiments performed on similar drugs and cells when predicting responses of new combinations in so far untested cells; thereby, it achieves highly accurate predictions despite sparsely populated data tensors. We demonstrate high predictive performance of comboFM in various prediction scenarios using data from cancer cell line pharmacogenomic screens. Subsequent experimental validation of a set of previously untested drug combinations further supports the practical and robust applicability of comboFM. For instance, we confirm a novel synergy between anaplastic lymphoma kinase (ALK) inhibitor crizotinib and proteasome inhibitor bortezomib in lymphoma cells.
Overall, our results demonstrate that comboFM provides an effective means for systematic pre-screening of drug combinations to support precision oncology applications.

Item metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis (2016-07-01)
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J.; Soininen, Pasi; Lehtimaki, Terho; Raitakari, Olli; Jarvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti; Helsinki Institute for Information Technology HIIT; Department of Computer Science; Professorship Rousu Juho; Helsinki Institute for Information Technology (HIIT); Centre of Excellence in Computational Inference, COIN
Motivation: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. Results: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows excellent agreement with the pooled individual-level analysis of original data.
Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies.

Item Modeling drug combination effects via latent tensor reconstruction (OXFORD UNIV PRESS INC, 2021-07-01)
Wang, Tianduanyi; Szedmak, Sandor; Wang, Haishan; Aittokallio, Tero; Pahikkala, Tapio; Cichonska, Anna; Rousu, Juho; Department of Computer Science; Helsinki Institute for Information Technology (HIIT); Professorship Rousu Juho; Computer Science Professors; Computer Science - Large-scale Computing and Data Analysis (LSCA); Computer Science - Artificial Intelligence and Machine Learning (AIML); Computer Science - Computational Life Sciences (CSLife); Department of Computer Science; Aalto University; University of Turku
Motivation: Combination therapies have emerged as a powerful treatment modality to overcome drug resistance and improve treatment efficacy. However, the number of possible drug combinations increases very rapidly with the number of individual drugs in consideration, which makes the comprehensive experimental screening infeasible in practice. Machine-learning models offer time- and cost-efficient means to aid this process by prioritizing the most effective drug combinations for further pre-clinical and clinical validation. However, the complexity of the underlying interaction patterns across multiple drug doses and in different cellular contexts poses challenges to the predictive modeling of drug combination effects. Results: We introduce comboLTR, a highly time-efficient method for learning complex, non-linear target functions for describing the responses of therapeutic agent combinations in various doses and cancer cell-contexts. The method is based on a polynomial regression via powerful latent tensor reconstruction.
It uses a combination of recommender system-style features indexing the data tensor of response values in different contexts, and chemical and multi-omics features as inputs. We demonstrate that comboLTR outperforms state-of-the-art methods in terms of predictive performance and running time, and produces highly accurate results even in the challenging and practical inference scenario where full dose-response matrices are predicted for completely new drug combinations with no available combination and monotherapy response measurements in any training cell line.

Item Predicting Drug-Target Interactions (2016-12-18)
Vuola, Aarno; Cichonska, Anna; Tietotekniikan laitos; Perustieteiden korkeakoulu; Kannala, Juho

Item Predictive Modeling of Anticancer Efficacy of Drug Combinations Using Factorization Machines (2019-06-17)
Julkunen, Heli; Cichonska, Anna; Perustieteiden korkeakoulu; Rousu, Juho
Co-administration of drugs is a widely used strategy in cancer treatment to prevent drug resistance and improve the therapeutic efficacy while reducing the toxicity and side effects of the treatment. Despite their effectiveness, new combination therapies have been slow to emerge, as selecting and testing potential drug combinations against various cancer cell lines remains time- and cost-inefficient. In recent years, machine learning methods have emerged as powerful means to aid the drug development process. However, the underlying dose-response matrix structure of drug combination data and the complexity of drug interaction patterns observed across various dose pairs pose challenges to accurate modeling of drug combination effects. In this thesis, we present a novel machine learning framework for predicting the therapeutic efficacy of drug combinations in human cancer cell lines using factorization machines, a recent model class designed for efficient modeling of higher-order feature interactions.
We base our work on the observation that the underlying dose-response data can be compiled into a higher-order tensor indexed by drugs, drug concentrations and cell lines. The drug combination responses can then be modeled as an interaction between these different domains. We tested the model using the publicly available NCI-ALMANAC dataset on pairwise drug combinations screened in various concentration pairs across the NCI-60 panel of human cancer cell lines. The proposed method showed high predictive accuracy not only in filling in missing entries in otherwise known dose-response matrices, but also in a more challenging and practical setting of extending the predictions to new drug combinations not observed in the training space. The obtained results demonstrate that the framework provides promising means for systematic pre-screening of drug combinations for their therapeutic potential, thus holding promise to support precision medicine efforts.

Item Profiling persistent tubercule bacilli from patient sputa during therapy predicts early drug efficacy (2016-04-07)
Honeyborne, Isobella; McHugh, Timothy D.; Kuittinen, Iitu; Cichonska, Anna; Evangelopoulos, Dimitrios; Ronacher, Katharina; van Helden, Paul D.; Gillespie, Stephen H.; Fernandez-Reyes, Delmiro; Walzl, Gerhard; Rousu, Juho; Butcher, Philip D.; Waddell, Simon J.; Department of Computer Science; Professorship Rousu Juho; University College London; Stellenbosch University; University of St Andrews; St. George's University of London; University of Sussex
Background: New treatment options are needed to maintain and improve therapy for tuberculosis, which caused the death of 1.5 million people in 2013 despite potential for an 86% treatment success rate. A greater understanding of Mycobacterium tuberculosis (M.tb) bacilli that persist through drug therapy will aid drug development programs. Predictive biomarkers for treatment efficacy are also a research priority.
Methods and Results: Genome-wide transcriptional profiling was used to map the mRNA signatures of M.tb from the sputa of 15 patients before and 3, 7 and 14 days after the start of standard regimen drug treatment. The mRNA profiles of bacilli through the first 2 weeks of therapy reflected drug activity at 3 days, with transcriptional signatures at days 7 and 14 consistent with reduced M.tb metabolic activity similar to the profile of pre-chemotherapy bacilli. These results suggest that a pre-existing drug-tolerant M.tb population dominates sputum before and after early drug treatment, and that the mRNA signature at day 3 marks the killing of a drug-sensitive sub-population of bacilli. Modelling patient indices of disease severity with bacterial gene expression patterns demonstrated that both microbiological and clinical parameters were reflected in the divergent M.tb responses and provided evidence that factors such as bacterial load and disease pathology influence the host-pathogen interplay and the phenotypic state of bacilli. Transcriptional signatures were also defined that predicted measures of early treatment success (rate of decline in bacterial load over 3 days, TB test positivity at 2 months, and bacterial load at 2 months). Conclusions: This study defines the transcriptional signature of M.tb bacilli that have been expectorated in sputum after two weeks of drug therapy, characterizing the phenotypic state of bacilli that persist through treatment. We demonstrate that variability in clinical manifestations of disease is detectable in bacterial sputa signatures, and that the changing M.tb mRNA profiles 0-2 weeks into chemotherapy predict the efficacy of treatment 6 weeks later.
These observations advocate assaying dynamic bacterial phenotypes through drug therapy as biomarkers for treatment success.

Item Tensor Decomposition in Multiple Kernel Learning (2017-08-28)
Nguyen, Van; Szedmak, Sandor; Cichonska, Anna; Perustieteiden korkeakoulu; Rousu, Juho
Modern data processing and analytic tasks often deal with high-dimensional matrices or tensors; for example, environmental sensors monitor (time, location, temperature, light) data. For large-scale tensors, efficient data representation plays a major role in reducing computational time and finding patterns. The thesis first studies fundamental matrix and tensor decomposition algorithms and their applications, in connection with the Tensor Train decomposition algorithm. The second objective is applying the tensor perspective to Multiple Kernel Learning problems, where the stacking of kernels can be seen as a tensor. Decomposing this kind of tensor leads to an efficient factorization approach for finding the best linear combination of kernels through similarity alignment. Interestingly, thanks to the symmetry of the kernel matrix, a novel decomposition algorithm for multiple kernels is derived for reducing the computational complexity. In terms of applications, this new approach allows the manipulation of large-scale multiple kernel problems. For example, with P kernels and n samples, it reduces the memory complexity from O(P^2n^2) to O(P^2r^2 + 2rn), where r < n is the number of low-rank components. This compression is also valuable in pairwise multiple kernel learning problems, which model the relations among pairs of objects and whose complexity is on the double scale. This study proposes AlignF_TT, a kernel alignment algorithm based on the novel decomposition algorithm for the tensor of kernels.
Regarding predictive performance, the proposed algorithm achieves an improvement on 18 artificially constructed datasets and comparable performance on 13 real-world datasets in comparison with other multiple kernel learning algorithms. It also reveals that a small number of low-rank components is sufficient for approximating the tensor of kernels.
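
As a rough illustration of the last abstract's idea that stacked kernels form a tensor and can be scored by similarity alignment, the sketch below computes plain centered kernel alignment between each slice of a (P, n, n) kernel stack and a target kernel. It is not AlignF_TT and uses no Tensor Train decomposition. The function names, the toy data, and the choice of linear kernels are all assumptions made for this example.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K1, K2):
    """Centered alignment: normalized Frobenius inner product of centered kernels."""
    K1c, K2c = center_kernel(K1), center_kernel(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

# Hypothetical toy data: one feature view that tracks the labels, one pure noise.
rng = np.random.default_rng(0)
n = 30
y = rng.standard_normal(n)
X_signal = y[:, None] + 0.1 * rng.standard_normal((n, 3))
X_noise = rng.standard_normal((n, 3))

# Stack P = 2 linear kernels into a tensor of shape (P, n, n).
kernels = np.stack([X_signal @ X_signal.T, X_noise @ X_noise.T])
target = np.outer(y, y)  # ideal target kernel y y^T

# One alignment score per kernel slice; higher means closer to the target.
scores = np.array([alignment(K, target) for K in kernels])
print(scores)
```

In this toy setup the kernel built from the label-tracking features should score well above the noise kernel. Alignment-based multiple kernel learning methods use scores of this kind when weighting the kernels in a linear combination.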