Learning Methods for Variable Selection and Time Series Prediction

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Lendasse, Amaury, Dr., Aalto University, Department of Information and Computer Science, Finland
dc.contributor.advisor Pouzols, Federico Montesino, Dr., University of Helsinki, Finland
dc.contributor.author Sovilj, Dušan
dc.date.accessioned 2014-10-02T09:00:13Z
dc.date.available 2014-10-02T09:00:13Z
dc.date.issued 2014
dc.identifier.isbn 978-952-60-5857-3 (electronic)
dc.identifier.isbn 978-952-60-5856-6 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/14058
dc.description.abstract In recent years, machine learning methods have become increasingly popular for modelling many different phenomena: financial markets, spatio-temporal data sets, pattern recognition, speech and image processing, recommender systems and many others. This strong interest comes from the success of these methods in practice and from the increasingly easy acquisition, storage and access of data. In this thesis, two general problems in machine learning are discussed and several solutions are offered. The first problem is variable selection, an approach to automatically select the most relevant features in the data. Two key components of variable selection are the search criterion and the search algorithm. The thesis focuses on the Delta test as a search criterion, while several solutions are offered for the search algorithm, such as the Genetic Algorithm and Tabu Search. Furthermore, the selection procedure is extended to the more general cases of scaling and projection, as well as their combination. Finally, some of the proposed solutions have been developed for parallel architectures, which enables the whole variable selection procedure to be applied to data sets with a large number of features. The second problem tackled in the thesis is time series prediction, which arises in many fields of science and industry. In simple terms, time series prediction involves estimating future values of a series of measurements of a phenomenon of interest. The number of estimated values can be small, giving short-term prediction, or run to several hundred, giving long-term prediction. Two models have been developed for this particular task. One is based on a recently popular neural network type called Extreme Learning Machine, while the other is a combination of Generative Topographic Mapping and Relevance Learning modified for regression tasks.
Finally, the above problems are tackled together for real-world time series from a biological domain. Inference on biological time series is difficult because of the very small number of available samples, the irregular sampling frequency and the spatial coverage of the areas of interest. Nevertheless, more stable model parameter estimation is possible with the combined use of global climate indicators and regional measurements in the form of a multifactor approach. en
dc.format.extent 114 + app. 108
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 138/2014
dc.relation.haspart [Publication 1]: Dušan Sovilj, Antti Sorjamaa, Qi Yu, Yoan Miche, Eric Séverin. OPELM and OP-KNN in Long-Term Prediction of Time Series using Projected Input Data. Neurocomputing, 73(10–12):1976–1986, June 2010. DOI: 10.1016/j.neucom.2009.11.033
dc.relation.haspart [Publication 2]: Fernando Mateo, Dušan Sovilj, Rafael Gadea. Approximate k-NN Delta Test Minimization Method using Genetic Algorithms: Application to Time Series. Neurocomputing, 73(10–12):2017–2029, June 2010. DOI: 10.1016/j.neucom.2009.11.032
dc.relation.haspart [Publication 3]: Karin Junker, Dušan Sovilj, Ingrid Kröncke, Joachim Dippner. Climate induced changes in benthic macrofauna – A non-linear model approach. Journal of Marine Systems, 96–97:90–94, August 2012. DOI: 10.1016/j.jmarsys.2012.02.005
dc.relation.haspart [Publication 4]: Dušan Sovilj. Multistart Strategy Using Delta Test for Variable Selection. In International Conference on Artificial Neural Networks (ICANN 2011, Part II), pages 413–420, Lecture Notes in Computer Science volume 6792. Espoo, Finland, June 2011. DOI: 10.1007/978-3-642-21738-8_53
dc.relation.haspart [Publication 5]: Andrej Gisbrecht, Dušan Sovilj, Barbara Hammer, and Amaury Lendasse. Relevance learning for time series inspection. In European Symposium on Artificial Neural Networks (ESANN 2012), pages 489–494, Computational Intelligence and Machine Learning. Bruges, Belgium, April 2012.
dc.relation.haspart [Publication 6]: Dušan Sovilj, Amaury Lendasse, Olli Simula. Extending Extreme Learning Machine with Combination Layer. In International Work-Conference on Artificial Neural Networks, pages 417–426, Lecture Notes in Computer Science volume 7902. Tenerife, Spain, June 2013.
dc.relation.haspart [Publication 7]: Alberto Guillén, Mark van Heeswijk, Dušan Sovilj, M. G. Arenas, Héctor Pomares, and Ignacio Rojas. Variable Selection in a GPU Cluster using Delta Test. In International Work-Conference on Artificial Neural Networks, pages 393–400, Lecture Notes in Computer Science volume 6691. Málaga, Spain, June 2011. DOI: 10.1007/978-3-642-21501-8_49
dc.relation.haspart [Publication 8]: Alberto Guillén, Dušan Sovilj, Mark van Heeswijk, Luis Javier Herrera, Amaury Lendasse, Héctor Pomares, and Ignacio Rojas. Evolutive Approaches for Variable Selection Using a Non-parametric Noise Estimator. Parallel Architectures & Bioinspired Algorithms, Studies in Computational Intelligence volume 415, pages 243–266, August 2012. DOI: 10.1007/978-3-642-28789-3_11
dc.subject.other Computer science en
dc.title Learning Methods for Variable Selection and Time Series Prediction en
dc.type G5 Artikkeliväitöskirja fi
dc.description.version Peer reviewed en
dc.contributor.school Perustieteiden korkeakoulu fi
dc.contributor.school School of Science en
dc.contributor.department Tietojenkäsittelytieteen laitos fi
dc.contributor.department Department of Information and Computer Science en
dc.subject.keyword variable selection/scaling/projection en
dc.subject.keyword time series prediction en
dc.subject.keyword environmental modelling en
dc.subject.keyword model structure selection en
dc.identifier.urn URN:ISBN:978-952-60-5857-3
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (article-based) en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.contributor.supervisor Karhunen, Juha, Prof., Aalto University, Department of Information and Computer Science, Finland
dc.opn Kärkkäinen, Tommi, Prof., University of Jyväskylä, Finland
dc.rev Verleysen, Michel, Prof., Université catholique de Louvain, Belgium
dc.rev Neumann, Klaus, Dr., Bielefeld University, Germany
dc.date.defence 2014-10-31
dc.type.version Final published version en
