Bayesian multi-view models for data-driven drug response analysis

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Khan, Suleiman Ali
dc.contributor.department: Tietotekniikan laitos [fi]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.lab: Statistical Machine Learning and Bioinformatics Group [en]
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.school: School of Science [en]
dc.contributor.supervisor: Kaski, Samuel, Prof., Aalto University, Department of Computer Science, Finland
dc.date.accessioned: 2015-08-20T09:01:12Z
dc.date.available: 2015-08-20T09:01:12Z
dc.date.defence: 2015-09-07
dc.date.issued: 2015
dc.description.abstract [en]: A central challenge faced by biological and medical research is to understand the impact of chemical entities on living cells. Identifying the relationships between the chemical structures and their cellular responses is valuable for improving drug design and targeted therapies. The chemical structures and their detailed molecular responses need to be combined through a systematic analysis to learn the complex dependencies, which can then assist in improving understanding of the molecular mechanisms of drugs as well as predictions on the effects of unknown molecules. Moreover, with emerging drug-response data sets being profiled over several disease types and phenotypic details, it is pertinent to develop advanced computational methods that can be used to study multiple sets of data together. In this thesis, a novel multi-disciplinary challenge is undertaken for computationally analyzing interactions between multiple biological responses and chemical properties of drugs, while simultaneously advancing the computational methods to better learn these interactions. Specifically, multi-view dependency modeling of paired data sets is formulated as a means of systematically studying the drug-response relationships. First, the systematic analysis of drug structures and their genome-wide responses is presented as a multi-set dependency modeling problem and established methods are adopted to test the novel hypothesis. Several novel extensions of the drug-response analysis are then presented that explore responses measured over multiple disease types and multiple levels of phenotypic detail, uncovering novel biological insights of potential impact. These analyses are made possible by novel advancements in multi-view methods. Specifically, the first Bayesian tensor canonical correlation analysis and its extensions are introduced to capture the underlying multi-way structure and applied in analyzing novel toxicogenomic interactions.
The results illustrate that modeling the precise multi-view and multi-way formulation of the data is valuable for discovering interpretable latent components as well as for the prediction of unseen responses of drugs. Therefore, the original contribution to knowledge in this dissertation is two-fold: first, the data-driven identification of relationships between structural properties of drugs and their genome-wide responses in cells and, second, novel advancements of multi-view methods that find dependencies between paired data sets. Open source implementations of the new methods have been released to facilitate further research.
dc.format.extent: 79 + app. 99
dc.format.mimetype: application/pdf [en]
dc.identifier.isbn: 978-952-60-6310-2 (electronic)
dc.identifier.isbn: 978-952-60-6309-6 (printed)
dc.identifier.issn: 1799-4942 (electronic)
dc.identifier.issn: 1799-4934 (printed)
dc.identifier.issn: 1799-4934 (ISSN-L)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/17490
dc.identifier.urn: URN:ISBN:978-952-60-6310-2
dc.language.iso: en [en]
dc.opn: Mostafavi, Sara, Asst. Prof., University of British Columbia, Canada
dc.publisher: Aalto University [en]
dc.publisher: Aalto-yliopisto [fi]
dc.relation.haspart: [Publication 1]: Suleiman A Khan, Ali Faisal, John P Mpindi, Juuso A Parkkinen, Tuomo Kalliokoski, Antti Poso, Olli P Kallioniemi, Krister Wennerberg and Samuel Kaski. Comprehensive data-driven analysis of the impact of chemoinformatic structure on the genome-wide biological response profiles of cancer cells to 1159 drugs. BMC Bioinformatics, 13:112, 2012. DOI:10.1186/1471-2105-13-112
dc.relation.haspart: [Publication 2]: Seppo Virtanen, Arto Klami, Suleiman A Khan and Samuel Kaski. Bayesian Group Factor Analysis. In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics AISTATS, JMLR W&CP, 22:1269–1277, 2012.
dc.relation.haspart: [Publication 3]: Suleiman A Khan, Seppo Virtanen, Olli P Kallioniemi, Krister Wennerberg, Antti Poso and Samuel Kaski. Identification of structural features in chemicals associated with cancer drug response: A systematic data-driven analysis. In Proceedings of the Thirteenth European Conference on Computational Biology ECCB, Bioinformatics, 30:i497–i504, 2014. DOI:10.1093/bioinformatics/btu456
dc.relation.haspart: [Publication 4]: Mehmet Gonen, Suleiman A Khan and Samuel Kaski. Kernelized Bayesian Matrix Factorization. In Proceedings of the Twenty-Ninth International Conference on Machine Learning ICML, JMLR W&CP, 28:864–872, 2012.
dc.relation.haspart: [Publication 5]: Suleiman A Khan and Samuel Kaski. Bayesian Multi-View Tensor Factorization. In Proceedings of the Seventh European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases ECML PKDD, editors T. Calders et al., Springer-Verlag Berlin Heidelberg, 8724:656–671, 2014.
dc.relation.haspart: [Publication 6]: Suleiman A Khan, Eemeli Leppaaho and Samuel Kaski. Multi-Tensor Factorization. Submitted to a journal, 23 pages, 2015.
dc.relation.ispartofseries: Aalto University publication series DOCTORAL DISSERTATIONS [en]
dc.relation.ispartofseries: 105/2015
dc.rev: Hautaniemi, Sampsa, Prof., University of Helsinki, Finland
dc.rev: Claassen, Manfred, Prof., Institute of Molecular Systems Biology, ETH Zurich, Switzerland
dc.subject.keyword: Bayesian modeling [en]
dc.subject.keyword: machine learning [en]
dc.subject.keyword: multi-view learning [en]
dc.subject.keyword: computational biology [en]
dc.subject.keyword: bioinformatics [en]
dc.subject.keyword: toxicogenomics [en]
dc.subject.keyword: latent variable models [en]
dc.subject.keyword: Bayesian tensor CCA [en]
dc.subject.other: Biotechnology [en]
dc.subject.other: Computer science [en]
dc.title: Bayesian multi-view models for data-driven drug response analysis [en]
dc.type: G5 Artikkeliväitöskirja [fi]
dc.type.dcmitype: text [en]
dc.type.ontasot: Doctoral dissertation (article-based) [en]
dc.type.ontasot: Väitöskirja (artikkeli) [fi]
local.aalto.archive: yes
local.aalto.digiauth: ask
local.aalto.digifolder: Aalto_64549
local.aalto.formfolder: 2015_08_19_klo_14_46
Files
Original bundle
Name: isbn9789526063102.pdf
Size: 3.33 MB
Format: Adobe Portable Document Format
Name: article1.pdf
Size: 2.97 MB
Format: Adobe Portable Document Format
Description: publisher's version
Name: article2.pdf
Size: 392.89 KB
Format: Adobe Portable Document Format
Description: publisher's version
Name: article3.pdf
Size: 72.65 MB
Format: Adobe Portable Document Format
Description: publisher's version
Name: article4.pdf
Size: 392.1 KB
Format: Adobe Portable Document Format
Description: publisher's version