Foundations and Advances in Deep Learning

dc.contributor: Aalto-yliopisto (fi)
dc.contributor: Aalto University (en)
dc.contributor.advisor: Raiko, Tapani, Prof., Aalto University, Department of Information and Computer Science, Finland
dc.contributor.advisor: Ilin, Alexander, Dr., Aalto University, Department of Information and Computer Science, Finland
dc.contributor.author: Cho, Kyunghyun
dc.contributor.department: Tietojenkäsittelytieteen laitos (fi)
dc.contributor.department: Department of Information and Computer Science (en)
dc.contributor.lab: Deep Learning and Bayesian Modeling (en)
dc.contributor.lab: Syvä oppiminen ja bayesiläinen mallintaminen (fi)
dc.contributor.school: Perustieteiden korkeakoulu (fi)
dc.contributor.school: School of Science (en)
dc.contributor.supervisor: Karhunen, Juha, Prof., Aalto University, Department of Information and Computer Science, Finland
dc.date.accessioned: 2014-03-12T10:00:10Z
dc.date.available: 2014-03-12T10:00:10Z
dc.date.dateaccepted: 2014-01-07
dc.date.defence: 2014-03-21
dc.date.issued: 2014
dc.description.abstract: Deep neural networks have recently become popular under the name of deep learning, owing to their success in challenging machine learning tasks. Although this popularity stems from recent successes, the history of neural networks reaches as far back as 1958, when Rosenblatt presented the perceptron learning algorithm. Since then, many kinds of artificial neural networks have been proposed, including Hopfield networks, self-organizing maps, neural principal component analysis, Boltzmann machines, multi-layer perceptrons, radial-basis function networks, autoencoders, sigmoid belief networks, support vector machines and deep belief networks. The first part of this thesis investigates shallow and deep neural networks in search of principles that explain why deep neural networks work so well across a range of applications. The thesis starts from some of the earlier ideas and models in the field of artificial neural networks and arrives at autoencoders and Boltzmann machines, the two most widely studied neural networks today. The author thoroughly discusses how these various neural networks relate to each other and how the principles behind them form a foundation for autoencoders and Boltzmann machines. The second part is a collection of ten recent publications by the author. These publications mainly concern learning and inference algorithms for Boltzmann machines and autoencoders. In particular, Boltzmann machines, which are notoriously difficult to train, receive the most attention. Across several publications, the author and co-authors devised a set of new learning algorithms that includes the enhanced gradient, an adaptive learning rate and parallel tempering. These algorithms were further applied to restricted Boltzmann machines with Gaussian visible units. In addition to these algorithms for restricted Boltzmann machines, the author proposed a two-stage pretraining algorithm that initializes the parameters of a deep Boltzmann machine to match the variational posterior distribution of a similarly structured deep autoencoder. Finally, deep neural networks are applied to image denoising and speech recognition. (en)
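As background for the learning algorithms summarized in the abstract above, the following is a minimal sketch of the baseline those publications improve on: a binary restricted Boltzmann machine trained with one-step contrastive divergence (CD-1). This is an illustrative NumPy sketch under assumed names (BinaryRBM, cd1_step); it implements plain CD-1, not the enhanced gradient or adaptive learning rate of Publications 1 and 2.

```python
# Minimal sketch, assuming NumPy only: binary RBM trained with CD-1.
# Illustrative names; not the thesis's enhanced-gradient algorithm.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BinaryRBM:
    def __init__(self, n_visible, n_hidden):
        # Small random weights and zero biases, as is conventional.
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible bias
        self.c = np.zeros(n_hidden)    # hidden bias

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.05):
        # Positive phase: hidden activations driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: a single Gibbs step (reconstruction).
        pv1 = self.visible_probs(h0)
        v1 = (rng.random(pv1.shape) < pv1).astype(float)
        ph1 = self.hidden_probs(v1)
        # Approximate log-likelihood gradient: data stats minus model stats.
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)
        # Reconstruction error is only a rough progress proxy.
        return np.mean((v0 - pv1) ** 2)

# Toy usage: fit 100 random 16-dimensional binary patterns.
data = (rng.random((100, 16)) < 0.3).astype(float)
rbm = BinaryRBM(n_visible=16, n_hidden=8)
for epoch in range(200):
    err = rbm.cd1_step(data)
print(f"final reconstruction error: {err:.4f}")
```

The negative-phase statistics here come from one Gibbs step started at the data; Publication 3 instead draws them with parallel tempering, for which a generic sketch follows the publication list below.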
dc.format.extent: 277
dc.format.mimetype: application/pdf (en)
dc.identifier.isbn: 978-952-60-5575-6 (electronic)
dc.identifier.isbn: 978-952-60-5574-9 (printed)
dc.identifier.issn: 1799-4942 (electronic)
dc.identifier.issn: 1799-4934 (printed)
dc.identifier.issn: 1799-4934 (ISSN-L)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/12729
dc.identifier.urn: URN:ISBN:978-952-60-5575-6
dc.language.iso: en (en)
dc.opn: de Freitas, Nando, Prof., University of Oxford, United Kingdom
dc.publisher: Aalto University (en)
dc.publisher: Aalto-yliopisto (fi)
dc.relation.haspart: [Publication 1]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Enhanced Gradient for Training Restricted Boltzmann Machines. Neural Computation, Volume 25, Issue 3, Pages 805–831, March 2013.
dc.relation.haspart: [Publication 2]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Enhanced Gradient and Adaptive Learning Rate for Training Restricted Boltzmann Machines. In Proceedings of the 28th International Conference on Machine Learning (ICML 2011), Pages 105–112, June 2011.
dc.relation.haspart: [Publication 3]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Parallel Tempering is Efficient for Learning Restricted Boltzmann Machines. In Proceedings of the 2010 International Joint Conference on Neural Networks (IJCNN 2010), Pages 1–8, July 2010.
dc.relation.haspart: [Publication 4]: Kyunghyun Cho, Alexander Ilin and Tapani Raiko. Tikhonov-Type Regularization for Restricted Boltzmann Machines. In Proceedings of the 22nd International Conference on Artificial Neural Networks (ICANN 2012), Pages 81–88, September 2012.
dc.relation.haspart: [Publication 5]: Kyunghyun Cho, Alexander Ilin and Tapani Raiko. Improved Learning of Gaussian-Bernoulli Restricted Boltzmann Machines. In Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN 2011), Pages 10–17, June 2011.
dc.relation.haspart: [Publication 6]: Kyunghyun Cho, Tapani Raiko and Alexander Ilin. Gaussian-Bernoulli Deep Boltzmann Machines. In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN 2013), August 2013.
dc.relation.haspart: [Publication 7]: Kyunghyun Cho, Tapani Raiko, Alexander Ilin and Juha Karhunen. A Two-Stage Pretraining Algorithm for Deep Boltzmann Machines. In Proceedings of the 23rd International Conference on Artificial Neural Networks (ICANN 2013), Pages 106–113, September 2013.
dc.relation.haspart: [Publication 8]: Kyunghyun Cho. Simple Sparsification Improves Sparse Denoising Autoencoders in Denoising Highly Corrupted Images. In Proceedings of the 30th International Conference on Machine Learning (ICML 2013), Pages 432–440, June 2013.
dc.relation.haspart: [Publication 9]: Kyunghyun Cho. Boltzmann Machines for Image Denoising. In Proceedings of the 23rd International Conference on Artificial Neural Networks (ICANN 2013), Pages 611–618, September 2013.
dc.relation.haspart: [Publication 10]: Sami Keronen, Kyunghyun Cho, Tapani Raiko, Alexander Ilin and Kalle Palomäki. Gaussian-Bernoulli Restricted Boltzmann Machines and Automatic Feature Extraction for Noise Robust Missing Data Mask Estimation. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), Pages 6729–6733, May 2013.
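As noted after the abstract, here is a generic sketch of parallel tempering for drawing the negative-phase (model) samples of a binary RBM, in the spirit of Publication 3 but not its exact scheme. The temperature ladder, swap schedule and function names are illustrative assumptions; the rbm argument is any object with the W, b, c attributes of the earlier BinaryRBM sketch.

```python
# Generic parallel-tempering sampler for a binary RBM; illustrative names.
# Assumes an object `rbm` with weight matrix W (n_visible x n_hidden) and
# bias vectors b (visible), c (hidden), e.g. the BinaryRBM sketched earlier.
import numpy as np

rng = np.random.default_rng(1)

def energy(rbm, v, h):
    # RBM energy: E(v, h) = -v'Wh - b'v - c'h.
    return -(v @ rbm.W) @ h - rbm.b @ v - rbm.c @ h

def gibbs_step(rbm, v, beta):
    # One sweep of tempered Gibbs sampling: p_beta(v, h) ~ exp(-beta * E(v, h)).
    ph = 1.0 / (1.0 + np.exp(-beta * (v @ rbm.W + rbm.c)))
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = 1.0 / (1.0 + np.exp(-beta * (h @ rbm.W.T + rbm.b)))
    v = (rng.random(pv.shape) < pv).astype(float)
    return v, h

def parallel_tempering(rbm, n_chains=10, n_steps=100):
    # Inverse temperatures from 0 (uniform, fast mixing) to 1 (target model).
    betas = np.linspace(0.0, 1.0, n_chains)
    vs = [(rng.random(rbm.b.shape) < 0.5).astype(float) for _ in range(n_chains)]
    hs = [np.zeros(rbm.c.shape) for _ in range(n_chains)]
    for _ in range(n_steps):
        # Gibbs-update every chain at its own temperature.
        for k in range(n_chains):
            vs[k], hs[k] = gibbs_step(rbm, vs[k], betas[k])
        # Metropolis swap proposals between adjacent temperatures:
        # accept with prob min(1, exp((beta_k - beta_{k+1}) * (E_k - E_{k+1}))).
        for k in range(n_chains - 1):
            de = energy(rbm, vs[k], hs[k]) - energy(rbm, vs[k + 1], hs[k + 1])
            if np.log(rng.random()) < (betas[k] - betas[k + 1]) * de:
                vs[k], vs[k + 1] = vs[k + 1], vs[k]
                hs[k], hs[k + 1] = hs[k + 1], hs[k]
    # The beta = 1 chain supplies the model samples for the negative phase.
    return vs[-1]
```

Chains at small beta sample from nearly uniform distributions and mix quickly, while the beta = 1 chain targets the actual model; the Metropolis swaps let well-mixed states percolate to the target chain, which is why this sampler helps for RBMs whose plain Gibbs chains mix poorly.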
dc.relation.ispartofseries: Aalto University publication series DOCTORAL DISSERTATIONS (en)
dc.relation.ispartofseries: 21/2014
dc.rev: Larochelle, Hugo, Prof., University of Sherbrooke, Canada
dc.rev: Bergstra, James, Dr., University of Waterloo, Canada
dc.subject.keyword: deep learning (en)
dc.subject.keyword: neural networks (en)
dc.subject.keyword: multilayer perceptron (en)
dc.subject.keyword: probabilistic model (en)
dc.subject.keyword: restricted Boltzmann machine (en)
dc.subject.keyword: deep Boltzmann machine (en)
dc.subject.keyword: denoising autoencoder (en)
dc.subject.other: Computer science (en)
dc.title: Foundations and Advances in Deep Learning (en)
dc.type: G5 Artikkeliväitöskirja (fi)
dc.type.dcmitype: text (en)
dc.type.ontasot: Doctoral dissertation (article-based) (en)
dc.type.ontasot: Väitöskirja (artikkeli) (fi)
local.aalto.digiauth: ask
local.aalto.digifolder: Aalto_67723
Files
Original bundle: isbn9789526055756.pdf (Adobe Portable Document Format, 897.1 KB)