DiMMA: A visual-language framework for disaster analysis and reporting

School of Electrical Engineering | Master's thesis

Language

en

Pages

70

Abstract

Natural disasters are increasing in frequency, posing serious challenges to society. Social media has become a crucial source of real-time disaster information, providing images and text that can support disaster response efforts. This thesis introduces DiMMA, a Disaster Multimodal Analysis framework that leverages visual-language models and large language models to automate disaster classification, image captioning, and report generation. DiMMA identifies disaster-related images, generates descriptive captions, and produces structured reports to aid emergency response. Experiments on the CrisisMMD dataset demonstrate that DiMMA improves classification accuracy, caption quality, and report reliability compared to existing methods. By automating disaster analysis, DiMMA enhances situational awareness and facilitates faster decision-making. Future work will focus on improving adaptability, expanding datasets, and optimizing model selection.

Supervisor

Kyrki, Ville

Thesis advisor

Hannus, Eric
Tran, Nguyen Le
