Data integration, pathway analysis and mining for systems biology

Faculty of Information and Natural Sciences | Doctoral thesis (article-based)
Online book (1336 KB, 62 pp.)
VTT publications, 732
Post-genomic molecular biology relies on high-throughput experimental techniques and is therefore a data-rich field. The goal of this thesis is to develop bioinformatics methods that use publicly available data to produce knowledge and to aid the mining of newly generated data. As an example of knowledge or hypothesis generation, consider function prediction for biological molecules. Assigning protein function is a non-trivial task because the same protein may be involved in different biological processes, depending on the state of the biological system and on protein localisation. The function of a gene or gene product may be provided as a textual description in a gene or protein annotation database, but such descriptions do not convey the context in which the function applies. We therefore need ways to represent this meaning formally. Here we apply a data integration approach to provide a rich representation that enables context-sensitive mining of biological data in terms of integrated networks and conceptual spaces. Context-sensitive gene function annotation follows naturally from this framework as a particular application.

Publicly available knowledge can also be used to aid the mining of new experimental data. We developed an integrative bioinformatics method that uses publicly available knowledge of protein-protein interactions, metabolic networks and transcriptional regulatory networks to analyse transcriptomics data and predict altered biological processes. We applied this method to a study of the dynamic response of Saccharomyces cerevisiae to oxidative stress. The method revealed dynamically altered biological functions in response to oxidative stress, which were validated by comprehensive in vivo metabolomics experiments.

The results presented in this thesis indicate that integrating heterogeneous biological data facilitates advanced mining of the data. The methods can be applied to gain insight into the functions of genes, gene products and other molecules, and to offer functional interpretation of transcriptomics and metabolomics experiments.
Supervising professor
Kaski, Kimmo, Prof.
Thesis advisor
Oresic, Matej, Research Prof., VTT
Keywords
systems biology, high-throughput data, data integration, data mining, visualisation, bioinformatics, conceptual spaces, network topology
Other note
  • [Publication 1]: Peddinti V. Gopalacharyulu, Erno Lindfors, Catherine Bounsaythip, Teemu Kivioja, Laxman Yetukuri, Jaakko Hollmén, and Matej Orešič. Data integration and visualization system for enabling conceptual biology. Bioinformatics, 21 Suppl 1:i177-i185, Jun 2005. © 2005 by authors.
  • [Publication 2]: Peddinti V. Gopalacharyulu, Erno Lindfors, Jarkko Miettinen, Catherine Bounsaythip, and Matej Orešič. An integrative approach for biological data mining and visualisation. Int. J. Data Mining and Bioinformatics, 2(1):54-77, Jan 2008. © 2008 Inderscience Enterprises. By permission.
  • [Publication 3]: Peddinti V. Gopalacharyulu, Erno Lindfors, Catherine Bounsaythip, and Matej Orešič. Context dependent visualization of protein function. In Juho Rousu, Samuel Kaski, and Esko Ukkonen, editors, Probabilistic Modeling and Machine Learning in Structural and Systems Biology, pages 26-31, Tuusula, Finland, Jun 2006. © 2006 by authors.
  • [Publication 4]: Catherine Bounsaythip, Erno Lindfors, Peddinti V. Gopalacharyulu, Jaakko Hollmén, and Matej Orešič. Network-based representation of biological data for enabling context-based mining. In Catherine Bounsaythip, Jaakko Hollmén, Samuel Kaski, and Matej Orešič, editors, Proceedings of KRBIO'05, International Symposium on Knowledge Representation in Bioinformatics, pages 1-6, Espoo, Finland, Jun 2005. Helsinki University of Technology, Laboratory of Computer and Information Science. © 2005 by authors.
  • [Publication 5]: Peddinti V. Gopalacharyulu, Vidya R. Velagapudi, Erno Lindfors, Eran Halperin, and Matej Orešič. Dynamic network topology changes in functional modules predict responses to oxidative stress in yeast. Mol. BioSyst., 5:276-287, 2009.