DiMMA: A visual-language framework for disaster analysis and reporting
School of Electrical Engineering
Master's thesis
Language
en
Pages
70
Abstract
Natural disasters are increasing in frequency, posing serious challenges to society. Social media has become a crucial source of real-time disaster information, providing images and text that can support disaster response efforts. This paper introduces DiMMA, a Disaster Multimodal Analysis framework that leverages visual-language models and large language models to automate disaster classification, image captioning, and report generation. DiMMA identifies disaster-related images, generates descriptive captions, and produces structured reports to aid emergency response. Experiments on the CrisisMMD dataset demonstrate that DiMMA improves classification accuracy, caption quality, and report reliability compared to existing methods. By automating disaster analysis, DiMMA enhances situational awareness and facilitates faster decision-making. Future work will focus on improving adaptability, expanding datasets, and optimizing model selection.
Supervisor
Kyrki, Ville
Thesis advisor
Hannus, Eric
Tran, Nguyen Le