Extracting Medical Entities from Radiology Reports with Ontology-based Distant Supervision
School of Science
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Mcode
SCI3044
Language
en
Pages
53+7
Abstract
Doctors need to review a substantial number of medical documents, such as radiology reports, to make medical decisions. Named Entity Recognition (NER) structures raw medical text by detecting and classifying medical entities. Structured documents annotated with medical concepts improve doctors' work effectiveness and make important information easier to extract. Nevertheless, deploying NER on Finnish medical text remains challenging because of data annotation, in-domain adaptation, label incompleteness, and label noise. To address these problems, we develop an NER system called Auto-labeling and Noise-suppressed Network (ANT). An automated annotation mechanism provides supervision signals for the training samples of the NER dataset. Continual domain pretraining transfers in-domain knowledge to the NER model to improve its performance. We use a weak label completion scheme to complete the weak labels generated by the automated annotation mechanism, and we apply noise suppression approaches to further reduce label noise. Experimental results show that our model achieves relatively strong performance on a silver-standard dataset. We also conduct ablation experiments to explore the effectiveness of our framework's components.
Supervisor
Pekka, Marttinen
Thesis advisor
Miika, Koskinen
Shaoxiong, Ji