[dipl] Perustieteiden korkeakoulu / SCI
Permanent URI for this collection: https://aaltodoc.aalto.fi/handle/123456789/21
Browsing [dipl] Perustieteiden korkeakoulu / SCI by Subject "0-1 data"
- Mixture modelling of multiresolution 0-1 data
School of Science | Master's thesis (2010)
Adhikari, Prem Raj
Biological systems are complex, and measurements in biology are made with high-throughput, high-resolution techniques, often resulting in data in multiple resolutions. Furthermore, ISCN [1] has defined five different resolutions of the chromosome band. Currently available standard algorithms can handle data in only one resolution at a time. Hence, the data must be transformed to a common resolution before they can be fed to the algorithm. Furthermore, comparing the results of an algorithm on data in different resolutions can produce interesting results that aid in determining a suitable resolution for the data. In addition, experiments in different resolutions can help determine the appropriate resolution for computational methods. In this thesis, one method for upsampling and three different methods for downsampling 0-1 data are proposed and implemented, and experiments are performed on different resolutions. The suitability of the proposed methods is validated, and the results are compared across different resolutions. The proposed methods produce plausible results, showing that the significant patterns in the data are retained in the transformed resolution. Thereafter, mixture models are trained on the original data and the results are analyzed. However, machine learning methods such as mixture models require large amounts of data to produce plausible results. Therefore, the major aim of the data transformation procedure was the integration of databases. Hence, two datasets available in two different resolutions were integrated after transforming them to a single resolution, and mixture models were trained on them. The trained models can be used to classify cancers and cluster the data. The results on the integrated data showed significant improvements compared with the data in the original resolution.
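The abstract does not spell out the resampling methods, so the following is only an illustrative sketch of what moving 0-1 data between resolutions can look like, not the thesis's implementation. It assumes, hypothetically, that every coarse region maps to exactly `factor` fine regions (real chromosome bands need not divide this evenly), and the function names `upsample`, `downsample_or`, and `downsample_majority` are invented for this example.

```python
from typing import List


def upsample(coarse: List[int], factor: int) -> List[int]:
    """Copy each coarse 0-1 value to its `factor` fine-resolution children."""
    return [bit for bit in coarse for _ in range(factor)]


def _groups(fine: List[int], factor: int) -> List[List[int]]:
    """Split a fine-resolution vector into consecutive groups of `factor` values."""
    if factor <= 0 or len(fine) % factor != 0:
        raise ValueError("length of the vector must be a positive multiple of factor")
    return [fine[i:i + factor] for i in range(0, len(fine), factor)]


def downsample_or(fine: List[int], factor: int) -> List[int]:
    """Assumed rule: a coarse region is 1 if any of its fine regions is 1."""
    return [1 if any(group) else 0 for group in _groups(fine, factor)]


def downsample_majority(fine: List[int], factor: int) -> List[int]:
    """Assumed rule: a coarse region is 1 if more than half of its fine regions are 1.

    Ties resolve to 0 in this sketch.
    """
    return [1 if 2 * sum(group) > len(group) else 0 for group in _groups(fine, factor)]


if __name__ == "__main__":
    fine_sample = [1, 0, 0, 0, 1, 1]
    print(downsample_or(fine_sample, 2))
    print(downsample_majority(fine_sample, 2))
    print(upsample([1, 0], 3))
```

Under these assumptions, downsampling an upsampled vector with the same factor returns the original vector, which is one simple way to check that a transformation pair preserves the patterns in the data.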