Towards Memory-Efficient Training for Extremely Large Output Spaces : Learning with 670k Labels on a Single Commodity GPU
Access rights
openAccess
Conference article in proceedings
This publication is imported from Aalto University research portal.
View publication in the Research portal
View/Open full text file from the Research portal
Other link related to publication
Author
Schultheis, E; Babbar, R
Date
2023
Language
en
Pages
16
Series
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) ; Volume 14171 LNAI
Abstract
In classification problems with large output spaces (up to millions of labels), the last layer can require an enormous amount of memory. Using sparse connectivity would drastically reduce the memory requirements, but as we show below, applied naïvely it can result in much diminished predictive performance. Fortunately, we found that this can be mitigated by introducing an intermediate layer of intermediate size. We further demonstrate that one can constrain the connectivity of the sparse layer to be of constant fan-in, in the sense that each output neuron will have the exact same number of incoming connections, which allows for more efficient implementations, especially on GPU hardware. The CUDA implementation of our approach is provided at https://github.com/xmc-aalto/ecml23-sparse.
Description
Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
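The constant fan-in constraint mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' CUDA implementation; it is a minimal pure-Python illustration, assuming a layout in which each output neuron stores exactly `fan_in` input indices and `fan_in` weights. The function names, the input width of 768, and the fan-in of 32 are hypothetical values chosen for illustration and are not taken from the paper. The intermediate layer described in the abstract is not modelled here.

```python
import random


def make_fixed_fan_in_layer(n_in, n_out, fan_in, seed=0):
    """Build a sparse layer where every output has exactly `fan_in` inputs.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # indices[j] holds the distinct input positions feeding output j.
    indices = [rng.sample(range(n_in), fan_in) for _ in range(n_out)]
    # weights[j][k] is the weight of the connection indices[j][k] -> j.
    scale = 1.0 / fan_in ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
               for _ in range(n_out)]
    return indices, weights


def forward(indices, weights, x):
    """Compute y[j] = sum_k weights[j][k] * x[indices[j][k]]."""
    return [sum(w * x[i] for i, w in zip(idx, ws))
            for idx, ws in zip(indices, weights)]


def parameter_counts(n_in, n_out, fan_in):
    """Stored values: dense weight matrix vs. fixed fan-in layout."""
    dense = n_in * n_out
    # One weight plus one input index per connection.
    sparse = 2 * n_out * fan_in
    return dense, sparse


indices, weights = make_fixed_fan_in_layer(n_in=16, n_out=5, fan_in=4)
x = [1.0] * 16
y = forward(indices, weights, x)
print(len(y))  # one score per output neuron

# Assumed sizes: 768 inputs, 670,000 labels, 32 connections per label.
print(parameter_counts(n_in=768, n_out=670_000, fan_in=32))
```

Because every row has the same length, indices and weights fit in two rectangular arrays of shape (number of outputs) by (fan-in), with no per-row offsets. This regular layout is one plausible reason such a constraint suits GPU hardware. Under the assumed sizes, the dense layer stores 514,560,000 values and the fixed fan-in layout stores 42,880,000.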
Citation
Schultheis, E & Babbar, R 2023, Towards Memory-Efficient Training for Extremely Large Output Spaces : Learning with 670k Labels on a Single Commodity GPU. in D Koutra, C Plant, M Gomez Rodriguez, E Baralis & F Bonchi (eds), Machine Learning and Knowledge Discovery in Databases : Research Track - European Conference, ECML PKDD 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14171 LNAI, Springer, pp. 689-704, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Turin, Italy, 18/09/2023. https://doi.org/10.1007/978-3-031-43418-1_41