Biclustering Methods: Biological Relevance and Application in Gene Expression Analysis

dc.contributor Aalto-yliopisto fi
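dc.contributor Aalto University en
dc.contributor.author Oghabian, Ali
dc.contributor.author Kilpinen, Sami
dc.contributor.author Hautaniemi, Sampsa
dc.contributor.author Czeizler, Elena
dc.date.accessioned 2017-05-11T06:53:34Z
dc.date.available 2017-05-11T06:53:34Z
dc.date.issued 2014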
dc.identifier.citation Oghabian, A, Kilpinen, S, Hautaniemi, S & Czeizler, E 2014, 'Biclustering Methods: Biological Relevance and Application in Gene Expression Analysis', PLOS ONE, vol 9, no. 3, e90801, pp. 1-10. DOI: 10.1371/journal.pone.0090801 en
dc.identifier.issn 1932-6203
dc.identifier.other PURE UUID: 364bae59-d7eb-4c10-aa65-53d6875f2e3f
dc.identifier.other PURE ITEMURL:
dc.identifier.other PURE FILEURL:
dc.description.abstract DNA microarray technologies are used extensively to profile the expression levels of thousands of genes under various conditions, yielding extremely large data matrices. Analyzing this information and extracting biologically relevant knowledge is therefore a considerable challenge. A classical approach to this challenge is to use clustering (also known as one-way clustering) methods, in which genes (or, respectively, samples) are grouped together based on the similarity of their expression profiles across the set of all samples (or, respectively, genes). An alternative approach is to develop biclustering methods that identify local patterns in the data. These methods extract subgroups of genes that are co-expressed across only a subset of samples and that may have important biological or medical implications. In this study, we evaluate 13 biclustering and 2 clustering (k-means and hierarchical) methods. We use several approaches to compare their performance on two real gene expression data sets. For this purpose, we apply four evaluation measures in our analysis: (1) we examine how well the considered (bi)clustering methods differentiate various sample types; (2) we evaluate how well the groups of genes discovered by the (bi)clustering methods are annotated with similar Gene Ontology categories; (3) we evaluate the capability of the methods to differentiate genes that are known to be specific to the particular sample types we study; and (4) we compare the running time of the algorithms. We conclude that, as long as the samples are well defined and annotated, the contamination of the samples is limited, and the samples are well replicated, biclustering methods such as Plaid and SAMBA are useful for discovering relevant subsets of genes and samples. en
dc.format.extent 1-10
dc.format.mimetype application/pdf
dc.language.iso en en
dc.relation.ispartofseries PLOS ONE en
dc.relation.ispartofseries Volume 9, issue 3 en
dc.rights openAccess en
dc.title Biclustering Methods: Biological Relevance and Application in Gene Expression Analysis en
dc.type A1 Original article in a scientific journal en
dc.description.version Peer reviewed en
dc.contributor.department Tietotekniikan laitos
dc.subject.keyword biclustering methods
dc.subject.keyword Gene expression analysis
dc.subject.keyword gene-based benchmarks
dc.subject.keyword performance comparison methods
dc.subject.keyword quality evaluation benchmarks
dc.subject.keyword running time
dc.subject.keyword sample differentiation
dc.subject.keyword sample-based benchmarks
dc.subject.keyword unsupervised machine learning
dc.identifier.urn URN:NBN:fi:aalto-201705113857
dc.identifier.doi 10.1371/journal.pone.0090801
dc.type.version publishedVersion
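
The abstract describes a bicluster as a subgroup of genes that are co-expressed across only a subset of samples. As an illustration only, the sketch below scores such a local pattern with the mean squared residue introduced by Cheng and Church, one commonly used coherence score. This is an assumption made for the example: the record does not say that the study uses this score, and the toy matrix, the chosen rows and columns, and all function and variable names are hypothetical.

```python
from statistics import mean


def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix picked out by rows and cols.

    A value near 0 means the selected genes follow the same additive
    pattern across the selected samples.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_means = [mean(r) for r in sub]
    col_means = [mean(c) for c in zip(*sub)]
    overall = mean(row_means)  # all rows have equal length, so this is the overall mean
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(len(rows))
        for j in range(len(cols))
    )
    return total / (len(rows) * len(cols))


# Hypothetical toy data: 4 genes (rows) x 4 samples (columns).
expression = [
    [1.0, 2.0, 9.0, 0.0],
    [2.0, 3.0, 1.0, 7.0],
    [4.0, 5.0, 6.0, 2.0],
    [8.0, 0.0, 3.0, 5.0],
]

# Genes 0-2 rise by the same amount from sample 0 to sample 1,
# so they form a coherent local pattern on those two samples only.
bicluster_rows = [0, 1, 2]
bicluster_cols = [0, 1]

local_score = mean_squared_residue(expression, bicluster_rows, bicluster_cols)
global_score = mean_squared_residue(expression, range(4), range(4))

print("residue of the candidate bicluster:", local_score)
print("residue of the full matrix:", global_score)
```

The candidate bicluster has a residue of 0 up to floating-point rounding, while the full matrix does not. A method that compares genes across all samples, as one-way clustering does, also takes samples 2 and 3 into account, where these three genes do not share a pattern.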
