Algorithms for Data-Efficient Training of Deep Neural Networks

dc.contributor: Aalto University [en]
dc.contributor.advisor: Bengio, Yoshua, Prof., Mila - Quebec AI Institute (Mila - Institut québécois d'intelligence artificielle), Canada
dc.contributor.advisor: Raiko, Tapani, Prof., Aalto University, Department of Computer Science, Finland
dc.contributor.advisor: Karhunen, Juha, Prof., Aalto University, Department of Computer Science, Finland
dc.contributor.author: Verma, Vikas
dc.contributor.department: Tietotekniikan laitos [fi]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.lab: Deep Learning [en]
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.school: School of Science [en]
dc.contributor.supervisor: Kannala, Juho, Prof., Aalto University, Department of Computer Science, Finland
dc.description.abstract: Deep Neural Networks ("deep learning") have become a ubiquitous choice of algorithm for Machine Learning applications. These systems often achieve human-level or even super-human performance across a variety of tasks such as computer vision, natural language processing, speech recognition, reinforcement learning, generative modeling and healthcare. This success can be attributed to their ability to learn complex representations directly from raw input data, eliminating hand-crafted feature extraction from the pipeline. However, there is a caveat: due to the extremely large number of trainable parameters in Deep Neural Networks, their generalization ability depends heavily on the availability of a large amount of labeled data. In many machine learning applications, gathering a large amount of labeled data is not feasible due to privacy, cost, time or expertise constraints. Examples of such applications are abundant in healthcare; for instance, predicting the effect of a medicine on a new patient when the medicine has previously been administered to only a few patients. This thesis addresses the problem of improving the generalization ability of Deep Neural Networks using a limited amount of labeled data. More specifically, it explores a class of methods that directly incorporate into the learning algorithm an inductive bias about how Deep Neural Networks should "behave" in between training samples (both in the input space and in the hidden space). Across the publications included in this thesis, the author demonstrates that such methods can outperform conventional baselines and achieve state-of-the-art performance in supervised, unsupervised, semi-supervised, adversarial-training and graph-based learning settings.
In addition to these algorithms, the author proposes a mutual-information-based method for learning representations for "graph-level" tasks in an unsupervised and semi-supervised manner. Finally, the author proposes a method to improve the generalization of ResNets based on the iterative-inference view. [en]
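The interpolation-based inductive bias described in the abstract can be illustrated with the input-space case popularized as Mixup: the network is trained on convex combinations of pairs of examples and their labels, which encourages linear behavior between training samples. The sketch below is illustrative only, assuming one-hot labels and NumPy arrays; it is not the thesis code.

```python
import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=None):
    """Return interpolated inputs and labels for one batch.

    x: inputs of shape (batch, ...); y: one-hot labels (batch, classes).
    A single mixing coefficient lam ~ Beta(alpha, alpha) is drawn and each
    example is blended with a randomly chosen partner from the same batch.
    """
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)          # interpolation strength in [0, 1]
    perm = rng.permutation(len(x))        # random pairing within the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix

# Example: two 2-D points with one-hot labels.
x = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([[1.0, 0.0], [0.0, 1.0]])
x_mix, y_mix = mixup_batch(x, y)
```

The hidden-space variant studied in Publication 1 (Manifold Mixup) applies the same convex combination to intermediate-layer activations rather than raw inputs.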
dc.format.extent: 97 + app. 131
dc.identifier.isbn: 978-952-64-0160-7 (electronic)
dc.identifier.isbn: 978-952-64-0159-1 (printed)
dc.identifier.issn: 1799-4942 (electronic)
dc.identifier.issn: 1799-4934 (printed)
dc.identifier.issn: 1799-4934 (ISSN-L)
dc.opn: Prof. José Miguel Hernández-Lobato, University of Cambridge, United Kingdom
dc.publisher: Aalto University [en]
dc.relation.haspart: [Publication 1]: Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, David Lopez-Paz, Yoshua Bengio. Manifold Mixup: Better Representations by Interpolating Hidden States. In Proceedings of the 36th International Conference on Machine Learning (ICML 2019), Long Beach, California, USA, volume 97, pages: 6438–6447, 2019. Full text in Acris/Aaltodoc:
dc.relation.haspart: [Publication 2]: Christopher Beckham, Sina Honari, Vikas Verma, Alex Lamb, Farnoosh Ghadiri, R Devon Hjelm, Yoshua Bengio. On Adversarial Mixup Resynthesis. In 2019 Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, pages: 4346–4357, 2019.
dc.relation.haspart: [Publication 3]: Vikas Verma, Alex Lamb, Juho Kannala, Yoshua Bengio. Interpolated Adversarial Training: Achieving Robust Neural Networks Without Sacrificing Too Much Accuracy. In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security (AISec'19), London, United Kingdom, pages: 95–103, 2019. Full text in Acris/Aaltodoc: DOI: 10.1145/3338501.3357369
dc.relation.haspart: [Publication 4]: Vikas Verma, Alex Lamb, Juho Kannala, Yoshua Bengio, David Lopez-Paz. Interpolation Consistency Training for Semi-Supervised Learning. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), Macao, China, pages: 3635–3641, 2019. DOI: 10.24963/ijcai.2019/504
dc.relation.haspart: [Publication 5]: Vikas Verma, Meng Qu, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang. GraphMix: Improved Training of Graph Neural Networks for Semi-Supervised Learning. Submitted for review, January 2020, 8 pages
dc.relation.haspart: [Publication 6]: Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, Jian Tang. InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization. In Eighth International Conference on Learning Representations (ICLR 2020, spotlight), Addis Ababa, Ethiopia, 2020
dc.relation.haspart: [Publication 7]: Stanislaw Jastrzebski, Devansh Arpit, Nicolas Ballas, Vikas Verma, Tong Che, Yoshua Bengio. Residual Connections Encourage Iterative Inference. In 6th International Conference on Learning Representations (ICLR 2018), Vancouver, Canada, 2018
dc.relation.ispartofseries: Aalto University publication series DOCTORAL DISSERTATIONS [en]
dc.rev: Prof. José Miguel Hernández-Lobato, University of Cambridge, United Kingdom; Prof. Ben Glocker, Imperial College London, United Kingdom
dc.subject.keyword: deep neural networks [en]
dc.subject.keyword: machine learning [en]
dc.subject.other: Computer science [en]
dc.title: Algorithms for Data-Efficient Training of Deep Neural Networks [en]
dc.type: G5 Artikkeliväitöskirja [fi]
dc.type.ontasot: Doctoral dissertation (article-based) [en]
dc.type.ontasot: Väitöskirja (artikkeli) [fi]
local.aalto.acrisexportstatus: checked 2020-12-28_2033