Convex Surrogates for Unbiased Loss Functions in Extreme Classification With Missing Labels

dc.contributor  Aalto-yliopisto  fi
dc.contributor  Aalto University  en
dc.contributor.author  Mohammadnia Qaraei, Mohammadreza  en_US
dc.contributor.author  Schultheis, Erik  en_US
dc.contributor.author  Gupta, Priyanshu  en_US
dc.contributor.author  Babbar, Rohit  en_US
dc.contributor.department  Department of Computer Science  en
dc.contributor.groupauthor  Professorship Babbar Rohit  en
dc.contributor.groupauthor  Computer Science Professors  en
dc.contributor.groupauthor  Computer Science - Artificial Intelligence and Machine Learning (AIML)  en
dc.contributor.organization  Indian Institute of Technology  en_US
dc.date.accessioned  2021-08-09T06:32:01Z
dc.date.available  2021-08-09T06:32:01Z
dc.date.issued  2021-04-19  en_US
dc.description.abstract  Extreme Classification (XC) refers to supervised learning where each training/test instance is labeled with a small subset of relevant labels that are chosen from a large set of possible target labels. The framework of XC has been widely employed in web applications such as automatic labeling of web-encyclopedia, prediction of related searches, and recommendation systems. While most state-of-the-art models in XC achieve high overall accuracy by performing well on the frequently occurring labels, they perform poorly on a large number of infrequent (tail) labels. This arises from two statistical challenges: (i) missing labels, as it is virtually impossible to manually assign every relevant label to an instance, and (ii) a highly imbalanced data distribution where a large fraction of labels are tail labels. In this work, we consider common loss functions that decompose over labels, and calculate unbiased estimates that compensate for missing labels according to Natarajan et al. [26]. This turns out to be disadvantageous from an optimization perspective, as important properties such as convexity and lower-boundedness are lost. To circumvent this problem, we use the fact that typical loss functions in XC are convex surrogates of the 0-1 loss, and thus propose to switch to convex surrogates of its unbiased version. These surrogates are further adapted to the label imbalance by combining with label-frequency-based rebalancing. We show that the proposed loss functions can be easily incorporated into various frameworks for extreme classification. This includes (i) linear classifiers, such as DiSMEC, on sparse input data representation, (ii) an attention-based deep architecture, AttentionXML, learnt on dense GloVe embeddings, and (iii) an XLNet-based transformer model for extreme classification, APLC-XLNet. Our results demonstrate consistent improvements over the respective vanilla baseline models, on the propensity-scored metrics for precision and nDCG.  en
dc.description.version  Peer reviewed  en
dc.format.extent  10
dc.format.extent  3711-3720
dc.format.mimetype  application/pdf  en_US
dc.identifier.citation  Mohammadnia Qaraei, M, Schultheis, E, Gupta, P & Babbar, R 2021, Convex Surrogates for Unbiased Loss Functions in Extreme Classification With Missing Labels. in The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021. ACM, pp. 3711-3720, The Web Conference, Ljubljana, Slovenia, 19/04/2021. https://doi.org/10.1145/3442381.3450139  en
dc.identifier.doi  10.1145/3442381.3450139  en_US
dc.identifier.isbn  978-1-4503-8312-7
dc.identifier.other  PURE UUID: 304ef1c7-8a6d-4952-a587-0a5beb8237c8  en_US
dc.identifier.other  PURE ITEMURL: https://research.aalto.fi/en/publications/304ef1c7-8a6d-4952-a587-0a5beb8237c8  en_US
dc.identifier.other  PURE LINK: http://www.scopus.com/inward/record.url?scp=85107962342&partnerID=8YFLogxK  en_US
dc.identifier.other  PURE FILEURL: https://research.aalto.fi/files/66279803/Convex_Surrogates_for_Unbiased_Loss_Functions.3442381.3450139.pdf  en_US
dc.identifier.uri  https://aaltodoc.aalto.fi/handle/123456789/109031
dc.identifier.urn  URN:NBN:fi:aalto-202108098273
dc.language.iso  en  en
dc.relation.ispartof  The Web Conference  en
dc.relation.ispartofseries  WWW '21: Proceedings of the Web Conference 2021  en
dc.rights  openAccess  en
dc.title  Convex Surrogates for Unbiased Loss Functions in Extreme Classification With Missing Labels  en
dc.type  A4 Artikkeli konferenssijulkaisussa  fi
dc.type.version  publishedVersion
