Retrieval of Gene Expression Measurements with Probabilistic Models


dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Peltonen, Jaakko, Dr., Aalto University, Department of Information and Computer Science, Finland
dc.contributor.author Faisal, Ali
dc.date.accessioned 2014-07-16T09:00:13Z
dc.date.available 2014-07-16T09:00:13Z
dc.date.issued 2014
dc.identifier.isbn 978-952-60-5781-1 (electronic)
dc.identifier.isbn 978-952-60-5780-4 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/13628
dc.description.abstract A crucial problem in current biological and medical research is how to utilize the diverse body of existing biological knowledge and heterogeneous measurement data to gain insights into new data. As datasets continue to be deposited in public repositories, it is becoming important to develop search engines that can efficiently integrate existing data and, given a new study, search for relevant earlier studies. This search task is encountered in several biological applications, including cancer genomics, pharmacokinetics, personalized medicine and meta-analysis of functional genomics. Most existing search engines rely on classical keyword- or annotation-based retrieval, which is limited to discovering known information and requires careful downstream annotation of the data. Data-driven, model-based methods that retrieve studies based on similarities in the actual measurement data have greater potential for uncovering novel biological insights. In particular, probabilistic modeling provides promising model-based tools due to its ability to encode prior knowledge, represent uncertainty in model parameters and handle noise associated with the data. By introducing latent variables, it is further possible to capture relationships among data features in the form of meaningful biological components underlying the data. This thesis adapts existing probabilistic models and develops new ones for retrieval of relevant measurement data in three different cases of background repositories. The first case is a background collection of data samples where each sample is represented by a single data type. The second case is a collection of multimodal data samples where each sample is represented by more than one data type. The third case is a background collection of datasets where each dataset, in turn, is a collection of multiple samples.
In all three setups, the proposed models are evaluated quantitatively, and case studies demonstrate that the models facilitate interpretable retrieval of relevant data, rigorous integration of diverse information sources and learning of latent components from partly related dataset collections. en
dc.format.extent 99 + app. 156
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 108/2014
dc.relation.haspart [Publication 1]: José Caldas, Nils Gehlenborg, Ali Faisal, Alvis Brazma and Samuel Kaski. Probabilistic retrieval and visualization of biologically relevant microarray experiments. Bioinformatics, 25(12):i145–i153, 2009. doi:10.1093/bioinformatics/btp215.
dc.relation.haspart [Publication 2]: Ali Faisal, Frank Dondelinger, Dirk Husmeier, Colin M. Beale. Inferring species interaction networks from species abundance data: A comparative evaluation of various statistical and machine learning methods. Ecological Informatics, 5(6):451–464, 2010. doi:10.1016/j.ecoinf.2010.06.005.
dc.relation.haspart [Publication 3]: José Caldas, Nils Gehlenborg, Eeva Kettunen, Ali Faisal, Mikko Rönty, Andrew G. Nicholson, Sakari Knuutila, Alvis Brazma and Samuel Kaski. Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma. Bioinformatics, 28(2):246–253, 2012. doi:10.1093/bioinformatics/btr634.
dc.relation.haspart [Publication 4]: Suleiman A Khan, Ali Faisal, John P. Mpindi, Juuso A. Parkkinen, Tuomo Kalliokoski, Antti Poso, Olli P. Kallioniemi, Krister Wennerberg and Samuel Kaski. Comprehensive data-driven analysis of the impact of chemoinformatic structure on the genome-wide biological response profiles of cancer cells to 1159 drugs. BMC Bioinformatics, 13:112, 2012. doi:10.1186/1471-2105-13-112.
dc.relation.haspart [Publication 5]: Riku Louhimo, Viljami Aittomaki*, Ali Faisal*, Marko Laakso*, Ping Chen, Kristian Ovaska, Erkka Valo, Leo Lahti, Vladimir Rogojin, Samuel Kaski and Sampsa Hautaniemi. Systematic use of computational methods allows stratification of treatment responders in glioblastoma multiforme. Systems Biomedicine, 1(2):130–136, 2013. doi:10.4161/sysb.28904.
dc.relation.haspart [Publication 6]: Ali Faisal, Jussi Gillberg, Gayle Leen and Jaakko Peltonen. Transfer Learning using a Nonparametric Sparse Topic Model. Neurocomputing, 112:124–137, 2013. doi:10.1016/j.neucom.2012.12.038.
dc.relation.haspart [Publication 7]: Ali Faisal, Jaakko Peltonen, Elisabeth Georgii, Johan Rung and Samuel Kaski. Toward computational cumulative biology by combining models of biological datasets. Submitted to a journal, 6 pages, 2013.
dc.subject.other Computer science en
dc.title Retrieval of Gene Expression Measurements with Probabilistic Models en
dc.type G5 Artikkeliväitöskirja fi
dc.contributor.school Perustieteiden korkeakoulu fi
dc.contributor.school School of Science en
dc.contributor.department Tietojenkäsittelytieteen laitos fi
dc.contributor.department Department of Information and Computer Science en
dc.subject.keyword machine learning en
dc.subject.keyword bioinformatics en
dc.subject.keyword probabilistic modeling en
dc.subject.keyword information retrieval en
dc.subject.keyword Bayesian generative models en
dc.identifier.urn URN:ISBN:978-952-60-5781-1
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (article-based) en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.contributor.supervisor Kaski, Samuel, Prof., Aalto University, Department of Information and Computer Science, Finland
dc.opn Mamitsuka, Hiroshi, Prof., Kyoto University, Gokasho, Japan
dc.date.dateaccepted 2014-06-27
dc.rev Autio, Reija, Dr., Tampere University of Technology, Finland
dc.rev Saez-Rodriguez, Julio, Dr., European Bioinformatics Institute, Cambridge, United Kingdom
dc.date.defence 2014-08-15

