Optimizing the Performance of Text Classification Models by Improving the Isotropy of the Embeddings using a Joint Loss Function
dc.contributor | Aalto-yliopisto | fi |
dc.contributor | Aalto University | en |
dc.contributor.author | Attieh, Joseph | en_US |
dc.contributor.author | Zewoudie, Abraham | en_US |
dc.contributor.author | Vlassov, Vladimir | en_US |
dc.contributor.author | Flanagan, Adrian | en_US |
dc.contributor.author | Bäckström, Tom | en_US |
dc.contributor.department | Department of Computer Science | en |
dc.contributor.department | Department of Communications and Networking | en |
dc.contributor.department | Department of Signal Processing and Acoustics | en |
dc.contributor.department | Department of Information and Communications Engineering | en |
dc.contributor.editor | Fink, Gernot A. | en_US |
dc.contributor.editor | Jain, Rajiv | en_US |
dc.contributor.editor | Kise, Koichi | en_US |
dc.contributor.editor | Zanibbi, Richard | en_US |
dc.contributor.groupauthor | Speech Interaction Technology | en |
dc.contributor.organization | Department of Computer Science | en_US |
dc.contributor.organization | KTH Royal Institute of Technology | en_US |
dc.contributor.organization | Huawei Technologies | en_US |
dc.date.accessioned | 2023-08-23T06:08:40Z | |
dc.date.available | 2023-08-23T06:08:40Z | |
dc.date.embargo | info:eu-repo/date/embargoEnd/2024-08-19 | en_US |
dc.date.issued | 2023-08-19 | en_US |
dc.description.abstract | Recent studies show that the spatial distribution of the sentence representations generated from pre-trained language models is highly anisotropic. This results in a degradation in the performance of the models on the downstream task. Most methods improve the isotropy of the sentence embeddings by refining the corresponding contextual word representations, then deriving the sentence embeddings from these refined representations. In this study, we propose to improve the quality of the sentence embeddings extracted from the [CLS] token of the pre-trained language models by improving the isotropy of the embeddings. We add one feed-forward layer between the model and the downstream task layers, and we train it using a novel joint loss function. The proposed approach results in embeddings with better isotropy that generalize better on the downstream task. Experimental results on three GLUE datasets with classification as the downstream task show that our proposed method is on par with the state of the art, as it achieves performance gains of around 2–3% on the downstream tasks compared to the baseline. | en |
dc.description.version | Peer reviewed | en |
dc.format.extent | 16 | |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.citation | Attieh, J, Zewoudie, A, Vlassov, V, Flanagan, A & Bäckström, T 2023, Optimizing the Performance of Text Classification Models by Improving the Isotropy of the Embeddings using a Joint Loss Function. in G A Fink, R Jain, K Kise & R Zanibbi (eds), Document Analysis and Recognition – ICDAR 2023 - 17th International Conference, Proceedings. Lecture notes in computer science, Springer, pp. 121-136, International Conference on Document Analysis and Recognition, San Jose, California, United States, 21/08/2023. https://doi.org/10.1007/978-3-031-41734-4_8 | en |
dc.identifier.doi | 10.1007/978-3-031-41734-4_8 | en_US |
dc.identifier.isbn | 978-3-031-41734-4 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.issn | 1611-3349 | |
dc.identifier.other | PURE UUID: 9e4c72f1-3ffc-4dfa-9f43-644941a65869 | en_US |
dc.identifier.other | PURE ITEMURL: https://research.aalto.fi/en/publications/9e4c72f1-3ffc-4dfa-9f43-644941a65869 | en_US |
dc.identifier.other | PURE LINK: http://www.scopus.com/inward/record.url?scp=85173582381&partnerID=8YFLogxK | en_US |
dc.identifier.other | PURE FILEURL: https://research.aalto.fi/files/112050557/3475.pdf | en_US |
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/122658 | |
dc.identifier.urn | URN:NBN:fi:aalto-202308235004 | |
dc.language.iso | en | en |
dc.publisher | Springer | |
dc.relation.ispartof | International Conference on Document Analysis and Recognition | en |
dc.relation.ispartofseries | 17th International Conference on Document Analysis and Recognition (ICDAR 2023) | en |
dc.relation.ispartofseries | Lecture notes in computer science | en |
dc.rights | openAccess | en |
dc.title | Optimizing the Performance of Text Classification Models by Improving the Isotropy of the Embeddings using a Joint Loss Function | en |
dc.type | A4 Article in conference proceedings | en |