Balancing Imbalanced Toxicity Models: Using MolBERT with Focal Loss
Access rights
openAccess
publishedVersion
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
Date
2025
Language
en
Pages
16
Series
AI in Drug Discovery - 1st International Workshop, AIDD 2024, Held in Conjunction with ICANN 2024, Proceedings, pp. 82-97, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Volume 14894 LNCS
Abstract
Drug-induced liver injury (DILI) presents a multifaceted challenge, influenced by interconnected biological mechanisms. Current DILI datasets are characterized by small sizes and high imbalance, posing difficulties in learning robust representations and accurate modeling. To address these challenges, we trained a multi-modal multi-task model integrating preclinical histopathologies, biochemistry (blood markers), and clinical DILI-related adverse drug reactions (ADRs). Leveraging pretrained BERT models, we extracted representations covering a broad chemical space, facilitating robust learning in both frozen and fine-tuned settings. To address imbalanced data, we explored weighted Binary Cross-Entropy (w-BCE) and weighted Focal Loss (w-FL). Our results demonstrate that the frozen BERT model consistently enhances performance across all metrics and modalities with weighted loss functions compared to their non-weighted counterparts. However, the efficacy of fine-tuning BERT varies across modalities, yielding inconclusive results. In summary, the incorporation of BERT features with weighted loss functions demonstrates advantages, while the efficacy of fine-tuning remains uncertain.
Description
Publisher Copyright: © The Author(s) 2025. | openaire: EC/H2020/956832/EU//AIDD
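The abstract names two loss functions for imbalanced labels, weighted Binary Cross-Entropy (w-BCE) and weighted Focal Loss (w-FL), without giving their form. The sketch below illustrates the commonly used textbook formulations for a single binary label; it is not taken from the paper, and the function names, the per-class weights `w_pos` and `w_neg`, and the default `gamma=2.0` are assumptions chosen for illustration.

```python
import math


def weighted_bce(p, y, w_pos=1.0, w_neg=1.0, eps=1e-12):
    """Weighted binary cross-entropy for one probability p and label y in {0, 1}.

    w_pos and w_neg are hypothetical per-class weights; a larger w_pos
    makes errors on the (typically rare) positive class more costly.
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p))


def weighted_focal_loss(p, y, w_pos=1.0, w_neg=1.0, gamma=2.0, eps=1e-12):
    """Class-weighted focal loss for one probability p and label y in {0, 1}.

    The factor (1 - p_t) ** gamma down-weights examples the model already
    classifies confidently, so hard examples dominate the loss.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p        # probability assigned to the true class
    w_t = w_pos if y == 1 else w_neg      # weight of the true class
    return -w_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_loss(loss_fn, probs, labels, **kwargs):
    """Average a per-example loss over paired probabilities and labels."""
    return sum(loss_fn(p, y, **kwargs) for p, y in zip(probs, labels)) / len(probs)
```

Under these assumptions, setting `gamma=0` makes the focal loss coincide with w-BCE, so the two losses differ only in how strongly easy examples are discounted.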
Keywords
BERT, DILI, Focal loss, Toxicity
Citation
Masood, M A, Kaski, S, Ceulemans, H, Herman, D & Heinonen, M 2025, Balancing Imbalanced Toxicity Models: Using MolBERT with Focal Loss. in D-A Clevert, M Wand, J Schmidhuber, K Malinovská & I V Tetko (eds), AI in Drug Discovery - 1st International Workshop, AIDD 2024, Held in Conjunction with ICANN 2024, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14894 LNCS, Springer, pp. 82-97, International Workshop on AI in Drug Discovery, Lugano, Switzerland, 19/09/2024. https://doi.org/10.1007/978-3-031-72381-0_8