Data exploration with self-organizing maps in environmental informatics and bioinformatics


dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Kolehmainen, Mikko T.
dc.date.accessioned 2012-02-13T12:37:49Z
dc.date.available 2012-02-13T12:37:49Z
dc.date.issued 2004-02-27
dc.identifier.isbn 951-27-0000-X
dc.identifier.issn 1235-0486
dc.description.abstract The aim of this thesis was to evaluate the usability of self-organizing maps and some other methods of computational intelligence in analysing and modelling problems of environmental informatics and bioinformatics. The concepts of environmental informatics, bioinformatics, computational intelligence and data mining are first defined. There follows an introduction to the data processing chain of knowledge discovery and the methods used in this thesis, namely linear regression, self-organizing maps (SOM), Sammon's mapping, U-matrix representation, fuzzy logic, c-means and fuzzy c-means clustering, multi-layer perceptron (MLP), and regularization and Bayesian techniques. The challenges posed by environmental processes and bioprocesses are then identified, including missing data problems, complex lagged dependencies among variables, non-linear chaotic dynamics, ill-defined inverse problems, and large search space in optimization tasks. The works included in this thesis are then evaluated and discussed. The results show that the combination of SOM and Sammon's mapping has great potential in data exploration, and can be used to reveal important features of the measurement techniques (e.g. separability of compounds), reveal new information about already studied phenomena, speed up research work, act as a hypothesis generator for traditional research, and supply clear and intuitive visualization of the environmental phenomenon studied. The results of regression studies show, as expected, that the MLP network yields better estimates in predicting future values of airborne pollutant concentration of NO2 compared with SOM based regression or the least squares approach using periodic components. Additionally, the use of local MLP models is shown to be slightly better for estimating future values of episodes compared with one MLP model only. 
However, it can be concluded in general that the architectural issues tested cannot by themselves solve model performance problems. Finally, recommendations for future work are laid out. Firstly, the data exploration solution should be enhanced with methods from signal processing to enable the handling of measurements with different time scales and lagged multivariate time series. The main suggestion, however, is to create an integrated environment for testing different hybrid schemes of computational intelligence for better time-series forecasting in environmental informatics and bioinformatics. en
dc.format.extent 73, [60]
dc.format.mimetype application/pdf
dc.language.iso en en
dc.publisher Helsinki University of Technology en
dc.publisher Teknillinen korkeakoulu fi
dc.relation.ispartofseries Kuopio University publications. C, Natural and environmental sciences en
dc.relation.ispartofseries Kuopion yliopiston julkaisuja. C, Luonnontieteet ja ympäristötieteet fi
dc.relation.ispartofseries 167 en
dc.relation.haspart Kolehmainen M., Martikainen H., Hiltunen T. and Ruuskanen J., 2000. Forecasting air quality parameters using hybrid neural network modelling. Environmental Monitoring and Assessment 65, number 1-2, pages 277-286.
dc.relation.haspart Kolehmainen M., Martikainen H. and Ruuskanen J., 2001. Neural networks and periodic components used in air quality forecasting. Atmospheric Environment 35, number 5, pages 815-825.
dc.relation.haspart Kolehmainen M., Rissanen E., Raatikainen O. and Ruuskanen J., 2001. Monitoring odorous sulfur emissions using self-organizing maps for handling ion mobility spectrometry data. Journal of Air and Waste Management 51, pages 966-971.
dc.relation.haspart Kolehmainen M., Rönkkö P. and Raatikainen O., 2003. Monitoring of yeast fermentation by ion mobility spectrometry measurement and data visualisation with Self-Organizing Maps. Analytica Chimica Acta 484, number 1, pages 93-100.
dc.relation.haspart Niska H., Hiltunen T., Kolehmainen M. and Ruuskanen J., 2003. Hybrid models for forecasting air pollution episodes. International Conference on Artificial Neural Networks and Genetic Algorithms (ICANNGA'03). University Technical Institute of Roanne, France, 23-25 April 2003. Wien, Springer-Verlag, pages 80-84.
dc.relation.haspart Törönen P., Kolehmainen M., Wong G. and Castrén E., 1999. Analysis of gene expression data using self-organizing maps. Federation of European Biochemical Societies (FEBS) Letters 451, number 2, pages 142-146.
dc.relation.haspart Valkonen V.-P., Kolehmainen M., Lakka H.-M. and Salonen J., 2002. Insulin resistance syndrome revisited: application of self-organizing maps. International Journal of Epidemiology 31, number 4, pages 864-871.
dc.subject.other Computer science en
dc.subject.other Biotechnology en
dc.subject.other Environmental science en
dc.title Data exploration with self-organizing maps in environmental informatics and bioinformatics en
dc.type G5 Artikkeliväitöskirja fi
dc.description.version reviewed en
dc.contributor.department Department of Computer Science and Engineering en
dc.contributor.department Tietotekniikan osasto fi
dc.subject.keyword environmental science computing en
dc.subject.keyword biology computing en
dc.subject.keyword data analysis en
dc.subject.keyword data mining en
dc.subject.keyword knowledge acquisition en
dc.subject.keyword self-organising feature maps en
dc.subject.keyword neural nets en
dc.identifier.urn urn:nbn:fi:tkk-003356
dc.type.dcmitype text en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.type.ontasot Doctoral dissertation (article-based) en
dc.contributor.lab Laboratory of Computer and Information Science en
dc.contributor.lab Informaatiotekniikan laboratorio fi
