Speeding-up one-versus-all training for extreme classification via mean-separating initialization

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Schultheis, Erik [en_US]
dc.contributor.author: Babbar, Rohit [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.groupauthor: Professorship Babbar Rohit [en]
dc.contributor.groupauthor: Computer Science Professors [en]
dc.contributor.groupauthor: Computer Science - Artificial Intelligence and Machine Learning (AIML) - Research area [en]
dc.date.accessioned: 2022-11-09T08:02:43Z
dc.date.available: 2022-11-09T08:02:43Z
dc.date.issued: 2022-11 [en_US]
dc.description.abstract: In this paper, we show that a simple, data-dependent way of setting the initial vector can be used to substantially speed up the training of linear one-versus-all classifiers in extreme multi-label classification (XMC). We discuss the problem of choosing the initial weights from the perspective of three goals. We want to start in a region of weight space (a) with low loss value, (b) that is favourable for second-order optimization, and (c) where the conjugate-gradient (CG) calculations can be performed quickly. For margin losses, such an initialization is achieved by selecting the initial vector such that it separates the mean of all positive (relevant for a label) instances from the mean of all negatives. Both quantities can be calculated quickly for the highly imbalanced binary problems occurring in XMC. We demonstrate a training speedup of up to 5× on the Amazon-670K dataset with 670,000 labels. This comes in part from the reduced number of iterations that need to be performed due to starting closer to the solution, and in part from an implicit negative-mining effect that allows easy negatives to be ignored in the CG step. Because of the convex nature of the optimization problem, the speedup is achieved without any degradation in classification accuracy. The implementation can be found at https://github.com/xmc-aalto/dismecpp. [en]
dc.description.version: Peer reviewed [en]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Schultheis, E & Babbar, R 2022, 'Speeding-up one-versus-all training for extreme classification via mean-separating initialization', Machine Learning, vol. 111, no. 11, pp. 3953-3976. https://doi.org/10.1007/s10994-022-06228-2 [en]
dc.identifier.doi: 10.1007/s10994-022-06228-2 [en_US]
dc.identifier.issn: 1573-0565
dc.identifier.other: PURE UUID: 9ab2daf7-6046-472d-b987-28feff91b375 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/9ab2daf7-6046-472d-b987-28feff91b375 [en_US]
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/91141091/Speeding_up_one_versus_all_training_for_extreme_classification_via_mean_separating_initialization.pdf
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/117680
dc.identifier.urn: URN:NBN:fi:aalto-202211096451
dc.language.iso: en [en]
dc.publisher: Springer
dc.relation.ispartofseries: Machine Learning [en]
dc.relation.ispartofseries: Volume 111, issue 11, pp. 3953-3976 [en]
dc.rights: openAccess [en]
dc.subject.keyword: Large-scale multi-label classification [en_US]
dc.subject.keyword: Linear classification [en_US]
dc.subject.keyword: 2nd order optimization [en_US]
dc.subject.keyword: Class imbalance [en_US]
dc.subject.keyword: Weight initialization [en_US]
dc.title: Speeding-up one-versus-all training for extreme classification via mean-separating initialization [en]
dc.type: A4 Artikkeli konferenssijulkaisussa (A4 Article in a conference publication) [fi]
dc.type.version: publishedVersion

Files

Original bundle

Name: Speeding_up_one_versus_all_training_for_extreme_classification_via_mean_separating_initialization.pdf
Size: 5.29 MB
Format: Adobe Portable Document Format