Towards Memory-Efficient Training for Extremely Large Output Spaces : Learning with 670k Labels on a Single Commodity GPU

Access rights

openAccess
publishedVersion

A4 Article in conference proceedings

Language

en

Pages

16

Series

Machine Learning and Knowledge Discovery in Databases: Research Track - European Conference, ECML PKDD 2023, Proceedings, pp. 689-704, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) ; Volume 14171 LNAI

Abstract

In classification problems with large output spaces (up to millions of labels), the last layer can require an enormous amount of memory. Using sparse connectivity would drastically reduce the memory requirements, but as we show below, applied naïvely it can result in much diminished predictive performance. Fortunately, we found that this can be mitigated by introducing an intermediate layer of intermediate size. We further demonstrate that one can constrain the connectivity of the sparse layer to be of constant fan-in, in the sense that each output neuron will have the exact same number of incoming connections, which allows for more efficient implementations, especially on GPU hardware. The CUDA implementation of our approach is provided at https://github.com/xmc-aalto/ecml23-sparse.
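
The constant fan-in constraint described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' CUDA implementation. The function names, the random choice of connections, the weight initialisation, and the hidden size and fan-in used in the usage note are all assumptions made for illustration. The sketch shows why a fixed fan-in is convenient: the connectivity fits into two rectangular arrays of shape (number of outputs, fan-in), one for input indices and one for weights, so no per-row offsets are needed.

```python
import random


def make_fixed_fan_in_layer(n_in, n_out, fan_in, seed=0):
    """Hypothetical helper: give every output neuron exactly `fan_in` distinct inputs."""
    rng = random.Random(seed)
    # indices[o] lists the input units feeding output o (no repeats within a row).
    indices = [rng.sample(range(n_in), fan_in) for _ in range(n_out)]
    # Assumed initialisation: Gaussian weights scaled by 1/sqrt(fan_in).
    scale = fan_in ** -0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)] for _ in range(n_out)]
    return indices, weights


def fixed_fan_in_forward(x, indices, weights):
    """Compute one score per output from a single input vector x of length n_in."""
    return [
        sum(w * x[i] for i, w in zip(idx_row, w_row))
        for idx_row, w_row in zip(indices, weights)
    ]


def to_dense(indices, weights, n_in):
    """Expand the sparse layer into an equivalent dense (n_out x n_in) matrix."""
    dense = [[0.0] * n_in for _ in indices]
    for row, idx_row, w_row in zip(dense, indices, weights):
        for i, w in zip(idx_row, w_row):
            row[i] = w
    return dense


def stored_entries(n_in, n_out, fan_in):
    """Count stored numbers: dense weights versus sparse weights plus indices."""
    dense = n_in * n_out
    sparse = 2 * n_out * fan_in  # one weight and one index per connection
    return dense, sparse
```

As a usage example under assumed sizes, `stored_entries(512, 670_000, 32)` compares a dense last layer for 670k labels and a hypothetical 512-unit input against a sparse layer with an assumed fan-in of 32. The count ignores gradients and optimizer state, which scale the same way.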

Description

Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Citation

Schultheis, E & Babbar, R 2023, Towards Memory-Efficient Training for Extremely Large Output Spaces : Learning with 670k Labels on a Single Commodity GPU. in D Koutra, C Plant, M Gomez Rodriguez, E Baralis & F Bonchi (eds), Machine Learning and Knowledge Discovery in Databases : Research Track - European Conference, ECML PKDD 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14171 LNAI, Springer, pp. 689-704, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Turin, Italy, 18/09/2023. https://doi.org/10.1007/978-3-031-43418-1_41