Foundations and Advances in Deep Learning

School of Science | Doctoral thesis (article-based) | Defence date: 2014-03-21
Date
2014
Language
en
Pages
277
Series
Aalto University publication series DOCTORAL DISSERTATIONS, 21/2014
Abstract
Deep neural networks have recently become popular under the name of deep learning, owing to their success in challenging machine learning tasks. Although this popularity stems mainly from recent successes, the history of neural networks goes back as far as 1958, when Rosenblatt presented the perceptron learning algorithm. Since then, various kinds of artificial neural networks have been proposed, including Hopfield networks, self-organizing maps, neural principal component analysis, Boltzmann machines, multi-layer perceptrons, radial-basis function networks, autoencoders, sigmoid belief networks, support vector machines and deep belief networks. The first part of this thesis investigates shallow and deep neural networks in search of principles that explain why deep neural networks work so well across a range of applications. The thesis starts from some of the earlier ideas and models in the field of artificial neural networks and arrives at autoencoders and Boltzmann machines, two of the most widely studied neural networks today. The author thoroughly discusses how these various neural networks are related to each other and how the principles behind them form a foundation for autoencoders and Boltzmann machines. The second part is a collection of ten recent publications by the author. These publications focus mainly on learning and inference algorithms for Boltzmann machines and autoencoders. Boltzmann machines in particular, which are known to be difficult to train, have been the main focus. Across several publications, the author and co-authors have devised a new set of learning algorithms that includes the enhanced gradient, an adaptive learning rate and parallel tempering. These algorithms are further applied to a restricted Boltzmann machine with Gaussian visible units. In addition to these algorithms for restricted Boltzmann machines, the author proposed a two-stage pretraining algorithm that initializes the parameters of a deep Boltzmann machine to match the variational posterior distribution of a similarly structured deep autoencoder. Finally, deep neural networks are applied to image denoising and speech recognition.
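
To make the sampling idea behind these learning algorithms concrete, the sketch below shows how parallel tempering can supply the negative-phase samples when training a binary restricted Boltzmann machine. This is a minimal illustration under stated assumptions, not code from the thesis: the layer sizes, the number of chains and the temperature ladder are arbitrary choices made here for brevity.

    import numpy as np

    rng = np.random.default_rng(0)
    n_vis, n_hid, n_temps = 16, 8, 5
    W = 0.01 * rng.standard_normal((n_vis, n_hid))
    b, c = np.zeros(n_vis), np.zeros(n_hid)                 # visible / hidden biases
    betas = np.linspace(0.5, 1.0, n_temps)                  # inverse-temperature ladder
    V = rng.integers(0, 2, (n_temps, n_vis)).astype(float)  # one chain per temperature

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gibbs_step(v, beta):
        """One Gibbs sweep of a binary RBM with the energy scaled by beta."""
        h = (rng.random(n_hid) < sigmoid(beta * (v @ W + c))).astype(float)
        return (rng.random(n_vis) < sigmoid(beta * (h @ W.T + b))).astype(float)

    def log_p_unnorm(v, beta):
        """Unnormalised log-probability of v at temperature 1/beta,
        with the hidden units summed out analytically."""
        return beta * (v @ b) + np.logaddexp(0.0, beta * (v @ W + c)).sum()

    def pt_sample(V):
        """Gibbs-update every chain, then propose Metropolis swaps of neighbours."""
        for k in range(n_temps):
            V[k] = gibbs_step(V[k], betas[k])
        for k in range(n_temps - 1):
            log_acc = (log_p_unnorm(V[k + 1], betas[k]) + log_p_unnorm(V[k], betas[k + 1])
                       - log_p_unnorm(V[k], betas[k]) - log_p_unnorm(V[k + 1], betas[k + 1]))
            if np.log(rng.random()) < log_acc:              # accept the swap
                V[[k, k + 1]] = V[[k + 1, k]]
        return V[-1]   # the beta = 1 chain supplies the negative-phase sample

The swap test compares the unnormalised log-probabilities of two neighbouring chains' states under each other's temperatures, so the fast-mixing high-temperature chains can pass well-mixed states down to the beta = 1 chain whose samples drive the gradient update; this improved mixing is the property that parallel-tempering learning of restricted Boltzmann machines relies on.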
Supervising professor
Karhunen, Juha, Prof., Aalto University, Department of Information and Computer Science, Finland
Thesis advisor
Raiko, Tapani, Prof., Aalto University, Department of Information and Computer Science, Finland
Ilin, Alexander, Dr., Aalto University, Department of Information and Computer Science, Finland
Keywords
deep learning, neural networks, multilayer perceptron, probabilistic model, restricted Boltzmann machine, deep Boltzmann machine, denoising autoencoder
Other note
Parts
  • [Publication 1]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Enhanced Gradient for Training Restricted Boltzmann Machines. Neural Computation, Volume 25, Issue 3, Pages 805–831, March 2013.
  • [Publication 2]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Enhanced Gradient and Adaptive Learning Rate for Training Restricted Boltzmann Machines. In Proceedings of the 28th International Conference on Machine Learning (ICML 2011), Pages 105–112, June 2011.
  • [Publication 3]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Parallel Tempering is Efficient for Learning Restricted Boltzmann Machines. In Proceedings of the 2010 International Joint Conference on Neural Networks (IJCNN 2010), Pages 1–8, July 2010.
  • [Publication 4]: Kyunghyun Cho, Alexander Ilin and Tapani Raiko. Tikhonov-Type Regularization for Restricted Boltzmann Machines. In Proceedings of the 22nd International Conference on Artificial Neural Networks (ICANN 2012), Pages 81–88, September 2012.
  • [Publication 5]: Kyunghyun Cho, Alexander Ilin and Tapani Raiko. Improved Learning of Gaussian-Bernoulli Restricted Boltzmann Machines. In Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN 2011), Pages 10–17, June 2011.
  • [Publication 6]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Gaussian-Bernoulli Deep Boltzmann Machines. In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN 2013), August 2013.
  • [Publication 7]: Kyunghyun Cho, Tapani Raiko, Alexander Ilin and Juha Karhunen. A Two-Stage Pretraining Algorithm for Deep Boltzmann Machines. In Proceedings of the 23rd International Conference on Artificial Neural Networks (ICANN 2013), Pages 106–113, September 2013.
  • [Publication 8]: Kyunghyun Cho. Simple Sparsification Improves Sparse Denoising Autoencoders in Denoising Highly Corrupted Images. In Proceedings of the 30th International Conference on Machine Learning (ICML 2013), Pages 432–440, June 2013.
  • [Publication 9]: Kyunghyun Cho. Boltzmann Machines for Image Denoising. In Proceedings of the 23rd International Conference on Artificial Neural Networks (ICANN 2013), Pages 611–618, September 2013.
  • [Publication 10]: Sami Keronen, Kyunghyun Cho, Tapani Raiko, Alexander Ilin and Kalle Palomäki. Gaussian-Bernoulli Restricted Boltzmann Machines and Automatic Feature Extraction for Noise Robust Missing Data Mask Estimation. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), Pages 6729–6733, May 2013.