Machine Learning for Systems Pharmacology

School of Science | Doctoral thesis (article-based) | Defence date: 2018-10-05
108 + app. 90
Aalto University publication series DOCTORAL DISSERTATIONS, 168/2018
Systems pharmacology aims to transform large-scale heterogeneous clinical and biological data into actionable therapeutic strategies. This thesis develops practical machine learning frameworks that contribute to different aspects of systems pharmacology, including the determination of therapeutic drug targets for complex diseases through genome-wide association studies (GWAS) and the elucidation of molecular and phenotypic drug responses.

GWAS have identified thousands of associations between genetic variants (genotype) and disease traits (phenotype), many of which can be used to prioritise the corresponding gene products as potential drug targets. Further discoveries will likely be unveiled with larger experimental sample sizes and by analysing multiple variants and disease traits together (i.e. multivariate analysis), instead of the standard practice of testing the association between each variant and trait separately (i.e. univariate analysis). However, current challenges of GWAS include the modest sample sizes of separate study cohorts and restricted access to the individual-level data across cohorts needed for multivariate meta-analysis.

Machine learning methods provide a cost-effective and complementary approach to experimental drug bioactivity profiling, including the elucidation of both direct interaction partners and overall phenotypic responses of drugs. Kernel-based methods in particular have recently received significant attention in pharmacology, offering, among other advantages, the ability to model nonlinearities between chemical and genomic features and drug bioactivity profiles.

The main contributions of this thesis are as follows. We developed metaCCA, a framework for the multivariate meta-analysis of GWAS that extends canonical correlation analysis to the setting where individual-level genotype and phenotype data are not available. metaCCA is the first summary statistics-based method that allows testing for associations between multiple genetic variants and multiple traits. It holds great potential to identify novel multivariate signals from already published univariate results of individual study cohorts. Further, we demonstrated that a kernel regression model offers practical benefits for gaining novel insights into the mode of action of new drug candidates. Importantly, we predicted and experimentally validated four novel off-targets of the investigational drug tivozanib. Motivated by these results, we extended the model to take advantage of various chemical and genomic information sources simultaneously. In particular, we developed pairwiseMKL, the first time- and memory-efficient method for learning with multiple pairwise kernels constructed from various data sources. pairwiseMKL is well suited for predictive modelling of both molecular and phenotypic drug response profiles. Finally, we systematically examined transcriptional signatures of Mycobacterium tuberculosis extracted from patients before and during drug therapy, and we demonstrated their power in modelling early treatment efficacy.
Supervising professor
Rousu, Juho, Prof., Aalto University, Department of Computer Science, Finland
Thesis advisor
Aittokallio, Tero, Prof., Institute for Molecular Medicine Finland and University of Turku, Finland
Pirinen, Matti, Dr., Institute for Molecular Medicine Finland, Finland
Keywords
systems pharmacology, machine learning, genome-wide association studies, protein target, drug-target interaction, cancer cell line, kernel, pairwise learning
  • [Publication 1]: Anna Cichonska, Juho Rousu, Pekka Marttinen, Antti J. Kangas, Pasi Soininen, Terho Lehtimäki, Olli T. Raitakari, Marjo-Riitta Järvelin, Veikko Salomaa, Mika Ala-Korpela, Samuli Ripatti, Matti Pirinen. metaCCA: Summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics, 32, 13, 1981-1989, February 2016.
    DOI: 10.1093/bioinformatics/btw052
  • [Publication 2]: Isobella Honeyborne, Timothy D. McHugh, Iitu Kuittinen, Anna Cichonska, Dimitrios Evangelopoulos, Katharina Ronacher, Paul D. van Helden, Stephen H. Gillespie, Delmiro Fernandez-Reyes, Gerhard Walzl, Juho Rousu, Philip D. Butcher, Simon J. Waddell. Profiling persistent tubercule bacilli from patient sputa during therapy predicts early drug efficacy. BMC Medicine, 14, 1, 68, April 2016.
    DOI: 10.1186/s12916-016-0609-3
  • [Publication 3]: Anna Cichonska, Juho Rousu, Tero Aittokallio. Identification of drug candidates and repurposing opportunities through compound–target interaction networks. Expert Opinion on Drug Discovery, 10, 12, 1333-1345, October 2015.
    DOI: 10.1517/17460441.2015.1096926
  • [Publication 4]: Anna Cichonska, Balaguru Ravikumar, Elina Parri, Sanna Timonen, Tapio Pahikkala, Antti Airola, Krister Wennerberg, Juho Rousu, Tero Aittokallio. Computational-experimental approach to drug–target interaction mapping: A case study on kinase inhibitors. PLoS Computational Biology, 13, 8, e1005678, August 2017.
    DOI: 10.1371/journal.pcbi.1005678
  • [Publication 5]: Anna Cichonska, Tapio Pahikkala, Sandor Szedmak, Heli Julkunen, Antti Airola, Markus Heinonen, Tero Aittokallio, Juho Rousu. Learning with multiple pairwise kernels for drug bioactivity prediction. Bioinformatics, 34, 13, i509-i518, June 2018.
    DOI: 10.1093/bioinformatics/bty277