Natural Language Processing for Healthcare: Text Representation, Multitask Learning, and Applications

School of Science | Doctoral thesis (article-based) | Defence date: 2023-03-22
56 + app. 135
Aalto University publication series DOCTORAL THESES, 11/2023
The emergence of deep learning algorithms in natural language processing has boosted the development of intelligent medical information systems. First, this dissertation explores effective text encoding for clinical text. We propose a dilated convolutional attention network that captures complex medical patterns in long clinical notes by increasing the receptive field exponentially with the dilation size. Furthermore, we propose embedding injection and gated information propagation in the medical note encoding module to learn better representations of lengthy clinical text. To capture the interaction between notes and codes, we explicitly model their underlying dependency and use textual descriptions of medical codes as external knowledge. We also adopt contextualized graph embeddings to learn contextual information and causal relationships between text mentions, such as drugs taken and adverse reactions.
We also conduct an empirical analysis of the effectiveness of transfer learning with language model pretraining for clinical text encoding and medical code prediction. We develop a hierarchical encoding model that equips pretrained language models with the capacity to encode long clinical notes, and we further study the effect of pretraining in different domains and with different strategies. The comprehensive quantitative analysis shows that hierarchical encoding can capture interactions between distant words to some extent.
Second, this dissertation investigates the multitask learning paradigm and its applications to healthcare. Multitask learning, motivated by the way humans draw on previous tasks to help with a new one, makes full use of the information contained in each task and shares information between related tasks through common parameters. We adopt multitask learning for medical code prediction and demonstrate the benefits of leveraging multiple coding schemes. We design a recalibrated aggregation module that generates higher-quality, less noisy clinical document features in the shared modules of multitask networks. Finally, we consider the task context to improve multitask learning for healthcare. We propose a domain-adaptive pretrained model and hypernetwork-guided multitask heads to learn shared representation modules and task-specific predictors. Specifically, the domain-adaptive pretrained model is pretrained directly in the target domain of clinical applications, and task embeddings, serving as task context, are used to generate task-specific parameters with hypernetworks. Experiments show that the proposed hypernetwork-guided multitask learning method can achieve better predictive performance and that semantic task information can improve the generalizability of the task-conditioned multitask model.
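The abstract's statement that dilated convolutions increase the receptive field exponentially with the dilation size can be made concrete with a little arithmetic. The sketch below is illustrative only: the kernel size, the number of layers, the stride of 1, and the doubling dilation schedule are assumptions chosen for the example, not settings taken from the thesis, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in tokens, of stacked 1-D convolutions with stride 1.

    Each layer with dilation d extends the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Assumed, illustrative settings (not from the thesis).
kernel_size = 3
num_layers = 6

plain = [1] * num_layers                        # ordinary convolutions
doubling = [2 ** i for i in range(num_layers)]  # dilations 1, 2, 4, 8, 16, 32

print(receptive_field(kernel_size, plain))     # 13
print(receptive_field(kernel_size, doubling))  # 127
```

With the same six layers and the same number of weights per layer, the doubling schedule covers 127 tokens instead of 13, which is why dilation is attractive for long clinical notes.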
Supervising professor
Marttinen, Pekka, Prof., Aalto University, Department of Computer Science, Finland
Thesis advisor
Marttinen, Pekka, Prof., Aalto University, Department of Computer Science, Finland
Keywords
natural language processing, healthcare applications, text representation, multitask learning
Other note
  • [Publication 1]: Shaoxiong Ji, Erik Cambria, and Pekka Marttinen. Dilated Convolutional Attention Network for Medical Code Assignment from Clinical Text. In Proceedings of the 3rd Clinical Natural Language Processing Workshop, Virtual, pages 73-78, 2020.
  • [Publication 2]: Shaoxiong Ji, Shirui Pan, and Pekka Marttinen. Medical Code Assignment with Gated Convolution and Note-Code Interaction. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Virtual, pages 1034-1043, 2021.
    DOI: 10.18653/v1/2021.findings-acl.89
  • [Publication 3]: Shaoxiong Ji, Matti Hölttä, and Pekka Marttinen. Does the Magic of BERT Apply to Medical Code Assignment? A Quantitative Study. Computers in Biology and Medicine, Volume 139, 104998, 2021.
    DOI: 10.1016/j.compbiomed.2021.104998
  • [Publication 4]: Wei Sun, Shaoxiong Ji, Erik Cambria, and Pekka Marttinen. Multitask Recalibrated Aggregation Network for Medical Code Prediction. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD), vol. 12978, Springer, Cham, 2021.
    DOI: 10.1007/978-3-030-86514-6_23
  • [Publication 5]: Wei Sun, Shaoxiong Ji, Erik Cambria, and Pekka Marttinen. Multitask Balanced and Recalibrated Network for Medical Code Prediction. ACM Transactions on Intelligent Systems and Technology, Aug 2022.
    DOI: 10.1145/3563041
  • [Publication 6]: Shaoxiong Ji and Pekka Marttinen. Patient Outcome and Zero-shot Diagnosis Prediction with Hypernetwork-guided Multitask Learning. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023.
  • [Publication 7]: Shaoxiong Ji, Wei Sun, Hang Dong, Honghan Wu, and Pekka Marttinen. A Unified Review of Deep Learning for Automated Medical Coding. arXiv preprint arXiv:2201.02797, Jan 2022.
  • [Publication 8]: Ya Gao, Shaoxiong Ji, Tongxuan Zhang, Prayag Tiwari, and Pekka Marttinen. Contextualized Graph Embeddings for Adverse Drug Event Detection. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD), Springer, Cham, 2022.