Foundations and Advances in Deep Learning

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Raiko, Tapani, Prof., Aalto University, Department of Information and Computer Science, Finland
dc.contributor.advisor Ilin, Alexander, Dr., Aalto University, Department of Information and Computer Science, Finland
dc.contributor.author Cho, Kyunghyun
dc.date.accessioned 2014-03-12T10:00:10Z
dc.date.available 2014-03-12T10:00:10Z
dc.date.issued 2014
dc.identifier.isbn 978-952-60-5575-6 (electronic)
dc.identifier.isbn 978-952-60-5574-9 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/12729
dc.description.abstract Deep neural networks have recently become popular under the name of deep learning, owing to their success in challenging machine learning tasks. Although this popularity is mainly due to recent successes, the history of neural networks goes as far back as 1958, when Rosenblatt presented the perceptron learning algorithm. Since then, various kinds of artificial neural networks have been proposed, including Hopfield networks, self-organizing maps, neural principal component analysis, Boltzmann machines, multi-layer perceptrons, radial-basis function networks, autoencoders, sigmoid belief networks, support vector machines and deep belief networks. The first part of this thesis investigates shallow and deep neural networks in search of principles that explain why deep neural networks work so well across a range of applications. The thesis starts from some of the earlier ideas and models in the field of artificial neural networks and arrives at autoencoders and Boltzmann machines, the two most widely studied neural networks today. The author discusses in detail how these various neural networks relate to each other and how the principles behind them form a foundation for autoencoders and Boltzmann machines. The second part is a collection of ten recent publications by the author. These publications mainly concern learning and inference algorithms for Boltzmann machines and autoencoders, with particular emphasis on Boltzmann machines, which are known to be difficult to train. Across several publications, the author and co-authors devise and propose a new set of learning algorithms, including the enhanced gradient, an adaptive learning rate and parallel tempering. These algorithms are further applied to restricted Boltzmann machines with Gaussian visible units. In addition to these algorithms for restricted Boltzmann machines, the author proposes a two-stage pretraining algorithm that initializes the parameters of a deep Boltzmann machine to match the variational posterior distribution of a similarly structured deep autoencoder. Finally, deep neural networks are applied to image denoising and speech recognition. en
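
(For orientation, the following is a minimal sketch of plain contrastive-divergence (CD-1) learning for a binary restricted Boltzmann machine, written in NumPy. Layer sizes, the learning rate and all variable names are illustrative assumptions, not the thesis's own method; the enhanced gradient, adaptive learning rate and parallel tempering of Publications 1-3 modify precisely the gradient estimate and step size computed below.)

    # Minimal CD-1 sketch for a binary-binary restricted Boltzmann machine.
    # All hyperparameters (n_visible, n_hidden, learning_rate) are assumed
    # for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)
    n_visible, n_hidden = 784, 256       # e.g. an MNIST-sized visible layer
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b = np.zeros(n_visible)              # visible biases
    c = np.zeros(n_hidden)               # hidden biases
    learning_rate = 0.05

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_gradient(v0):
        """Approximate log-likelihood gradient from one Gibbs step."""
        h0 = sigmoid(v0 @ W + c)                          # positive phase
        h0_sample = (rng.random(h0.shape) < h0).astype(v0.dtype)
        v1 = sigmoid(h0_sample @ W.T + b)                 # reconstruction
        h1 = sigmoid(v1 @ W + c)                          # negative phase
        n = v0.shape[0]
        grad_W = (v0.T @ h0 - v1.T @ h1) / n              # data term - model term
        return grad_W, (v0 - v1).mean(axis=0), (h0 - h1).mean(axis=0)

    # One gradient-ascent step on a random binary mini-batch.
    v0 = (rng.random((32, n_visible)) < 0.5).astype(float)
    grad_W, grad_b, grad_c = cd1_gradient(v0)
    W += learning_rate * grad_W
    b += learning_rate * grad_b
    c += learning_rate * grad_c

This plain CD update is sensitive to the learning rate and to the quality of the negative-phase samples, which is the difficulty the publications listed below address: parallel tempering improves negative-phase sampling, while the enhanced gradient and adaptive learning rate make the update itself less sensitive to parameterization and step size.
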
dc.format.extent 277
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 21/2014
dc.relation.haspart [Publication 1]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Enhanced Gradient for Training Restricted Boltzmann Machines. Neural Computation, Volume 25, Issue 3, Pages 805–831, March 2013.
dc.relation.haspart [Publication 2]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Enhanced Gradient and Adaptive Learning Rate for Training Restricted Boltzmann Machines. In Proceedings of the 28th International Conference on Machine Learning (ICML 2011), Pages 105–112, June 2011.
dc.relation.haspart [Publication 3]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Parallel Tempering is Efficient for Learning Restricted Boltzmann Machines. In Proceedings of the 2010 International Joint Conference on Neural Networks (IJCNN 2010), Pages 1–8, July 2010.
dc.relation.haspart [Publication 4]: Kyunghyun Cho, Alexander Ilin and Tapani Raiko. Tikhonov-Type Regularization for Restricted Boltzmann Machines. In Proceedings of the 22nd International Conference on Artificial Neural Networks (ICANN 2012), Pages 81–88, September 2012.
dc.relation.haspart [Publication 5]: Kyunghyun Cho, Alexander Ilin and Tapani Raiko. Improved Learning of Gaussian-Bernoulli Restricted Boltzmann Machines. In Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN 2011), Pages 10–17, June 2011.
dc.relation.haspart [Publication 6]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Gaussian-Bernoulli Deep Boltzmann Machines. In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN 2013), August 2013.
dc.relation.haspart [Publication 7]: Kyunghyun Cho, Tapani Raiko, Alexander Ilin and Juha Karhunen. A Two-Stage Pretraining Algorithm for Deep Boltzmann Machines. In Proceedings of the 23rd International Conference on Artificial Neural Networks (ICANN 2013), Pages 106–113, September 2013.
dc.relation.haspart [Publication 8]: Kyunghyun Cho. Simple Sparsification Improves Sparse Denoising Autoencoders in Denoising Highly Corrupted Images. In Proceedings of the 30th International Conference on Machine Learning (ICML 2013), Pages 432–440, June 2013.
dc.relation.haspart [Publication 9]: Kyunghyun Cho. Boltzmann Machines for Image Denoising. In Proceedings of the 23rd International Conference on Artificial Neural Networks (ICANN 2013), Pages 611–618, September 2013.
dc.relation.haspart [Publication 10]: Sami Keronen, Kyunghyun Cho, Tapani Raiko, Alexander Ilin and Kalle Palomäki. Gaussian-Bernoulli Restricted Boltzmann Machines and Automatic Feature Extraction for Noise Robust Missing Data Mask Estimation. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), Pages 6729–6733, May 2013.
dc.subject.other Computer science en
dc.title Foundations and Advances in Deep Learning en
dc.type G5 Artikkeliväitöskirja fi
dc.contributor.school Perustieteiden korkeakoulu fi
dc.contributor.school School of Science en
dc.contributor.department Tietojenkäsittelytieteen laitos fi
dc.contributor.department Department of Information and Computer Science en
dc.subject.keyword deep learning en
dc.subject.keyword neural networks en
dc.subject.keyword multilayer perceptron en
dc.subject.keyword probabilistic model en
dc.subject.keyword restricted Boltzmann machine en
dc.subject.keyword deep Boltzmann machine en
dc.subject.keyword denoising autoencoder en
dc.identifier.urn URN:ISBN:978-952-60-5575-6
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (article-based) en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.contributor.supervisor Karhunen, Juha, Prof., Aalto University, Department of Information and Computer Science, Finland
dc.opn de Freitas, Nando, Prof., University of Oxford, United Kingdom
dc.date.dateaccepted 2014-01-07
dc.contributor.lab Deep Learning and Bayesian Modeling en
dc.contributor.lab Syvä oppiminen ja bayesiläinen mallintaminen fi
dc.rev Larochelle, Hugo, Prof., University of Sherbrooke, Canada
dc.rev Bergstra, James, Dr., University of Waterloo, Canada
dc.date.defence 2014-03-21

