Clustering Nursing Sentences-Comparing Three Sentence Embedding Methods
Access rights
openAccess
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
Date
2022-05-25
Language
en
Pages
5
854-858
Series
Challenges of Trustable AI and Added-Value on Health - Proceedings of MIE 2022, Studies in Health Technology and Informatics, Volume 294
Abstract
In health sciences, high-quality text embeddings may augment qualitative data analysis of large amounts of text by enabling, e.g., searching and clustering of health information. This study aimed to evaluate three different sentence-level embedding methods in clustering sentences in nursing narratives from individual patients' hospital care episodes. Two of these embeddings are generated from language models based on the BERT framework, and the third on the Sent2Vec method. These embedding methods were used to cluster sentences from 20 patient care episodes and the results were manually evaluated. Findings suggest that the best clusters were produced by the embeddings from a BERT model fine-tuned for the proxy task of predicting subject headings for nursing text.
Description
Funding Information: Acknowledgements: This work was supported by the Academy of Finland (grants 315376, 336033, 315896), Business Finland (grant 884/31/2018), and EU H2020 (grant 101016775). We thank Jari Björne for helping with fine-tuning the BERT model. Publisher Copyright: © 2022 European Federation for Medical Informatics (EFMI) and IOS Press. | openaire: EC/H2020/101016775/EU//INTERVENE
Keywords
electronic health records, natural language processing, nursing documentation, sentence embeddings, text clustering
Citation
Moen, H, Suhonen, H, Salanterä, S, Salakoski, T & Peltonen, L M 2022, Clustering Nursing Sentences-Comparing Three Sentence Embedding Methods. in B Seroussi, P Weber, F Dhombres, C Grouin, J-D Liebe, S Pelayo, A Pinna, B Rance, L Sacchi, A Ugon, A Benis & P Gallos (eds), Challenges of Trustable AI and Added-Value on Health - Proceedings of MIE 2022. Studies in Health Technology and Informatics, vol. 294, IOS Press, pp. 854-858, Medical Informatics Europe Conference, Nice, France, 27/05/2022. https://doi.org/10.3233/SHTI220606