Multi-modal Chest X-Ray analysis: classification and report generation using self-supervised learning

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Kumar, Yogesh
dc.contributor.author: Brazzale, Nicola
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.supervisor: Marttinen, Pekka
dc.date.accessioned: 2022-08-28T17:04:33Z
dc.date.available: 2022-08-28T17:04:33Z
dc.date.issued: 2022-07-29
dc.description.abstract: Automated medical systems for classification, localization and diagnosis are increasingly being researched and developed. Accurate and automated disease detection benefits both medical personnel, who are spared tedious examinations, and patients, for whom an accurate prediction can be life-saving. In this work, the models involved in classification and report generation from chest X-rays are studied. Because chest X-rays are so widely used, we were able to collect several datasets, which allowed us to employ the self-supervised learning paradigm. This paradigm lets the models learn internal representations that are more representative of the domain in question. Two different models are used in this project, one for classification and the other for language modelling. The classification model is pretrained with the BarlowTwins framework, which is fed two modified copies of the same example; a custom loss function then drives the model to learn internal weights that are invariant to the applied transformations. The improvements this approach may bring are assessed by performing a classification task on a reference dataset and comparing the results with those of the same model without the proposed pretraining. The language model was pretrained at the character level on a large text corpus that includes a collection of medical reports. The fine-tuning process is the culmination of this project and involves merging the two models, with the classification model providing meaningful embeddings and the language model transforming them into natural language. We verified that pretraining with BarlowTwins improves classification performance, and that pretraining the language model makes it possible to generate text that is grammatically and semantically appropriate. However, fine-tuning did not yield satisfactory results, which makes this work a starting point for future studies. [en]
dc.format.extent: 64
dc.format.mimetype: application/pdf [en]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/116259
dc.identifier.urn: URN:NBN:fi:aalto-202208285073
dc.language.iso: en [en]
dc.programme: Master’s Programme in Computer, Communication and Information Sciences [fi]
dc.programme.major: Machine Learning, Data Science and Artificial Intelligence [fi]
dc.programme.mcode: SCI3044 [fi]
dc.subject.keyword: machine learning [en]
dc.subject.keyword: self-supervised learning [en]
dc.subject.keyword: convolutional neural network (CNN) [en]
dc.subject.keyword: generative pretrained transformer (GPT) [en]
dc.subject.keyword: chest x-rays [en]
dc.subject.keyword: medical Imaging [en]
dc.title: Multi-modal Chest X-Ray analysis: classification and report generation using self-supervised learning [en]
dc.type: G2 Pro gradu, diplomityö [fi]
dc.type.ontasot: Master's thesis [en]
dc.type.ontasot: Diplomityö [fi]
local.aalto.electroniconly: yes
local.aalto.openaccess: yes

Files

Original bundle

Name: master_Brazzale_Nicola_2022.pdf
Size: 2.06 MB
Format: Adobe Portable Document Format