Identification of Damage Assessment Tweets using Textual Features

2025-06-28 16:27:45 - Adil Khan

Project Title

Project Area of Specialization: Artificial Intelligence

Project Summary

This project aims to predict the class label of a tweet (human damage, infrastructure damage, or non-damage) using textual and language-context features. Damage assessment is the process of determining the location, nature, and severity of the damage sustained by the public and private sectors after a disaster. A typical damage assessment estimates the losses and their impact on the affected individuals and communities. Twitter has become a popular channel for sharing information, particularly during disasters. It provides situational information on a wide range of social activities, which is helpful during crisis events such as earthquakes, floods, cyclones, and wildfires. Identifying tweets related to the target event during a disaster is a challenging task. During and after a disaster, people post sentiment tweets alongside situational tweets. Sentiment tweets are of little use for damage assessment. Only a few works have focused on detecting damage assessment tweets.

We propose a novel method based on weighted features, using linear regression and Support Vector Regression (SVR), to identify damage assessment tweets during a disaster for both binary and multi-class classification. The experimental results show that the proposed method consistently improves on existing methods on most of the datasets. The method uses low-level lexical features, top-frequency word features, and syntactic features that are specific to damage assessment. These features are weighted using simple linear regression and SVR. A random forest classifier then classifies the tweets. We experimented on 5 standard disaster datasets of different categories for binary and multi-class classification. The proposed method achieves an accuracy of 94.62% for detecting damage assessment tweets.
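The summary does not state how the regression output is turned into a feature weight. The following is a minimal sketch under one assumed reading: each feature column is weighted by the absolute slope of a simple linear regression of the binary label on that feature. The function names and the toy data are hypothetical, and the SVR-based weighting and the random forest classifier are left out.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        # A constant feature cannot explain the label, so its slope is 0.
        return 0.0, mean_y
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def feature_weights(rows, labels):
    """Assumed scheme: weight of a feature = |slope| of label ~ feature."""
    columns = list(zip(*rows))
    return [abs(simple_linear_regression(col, labels)[0]) for col in columns]


def apply_weights(rows, weights):
    """Scale every feature value by the weight of its column."""
    return [[value * w for value, w in zip(row, weights)] for row in rows]


# Toy data: column 0 tracks the label, column 1 is constant.
rows = [[2, 5], [1, 5], [0, 5], [0, 5]]
labels = [1, 1, 0, 0]  # 1 = damage, 0 = non-damage

weights = feature_weights(rows, labels)
weighted_rows = apply_weights(rows, weights)
```

In this toy example the constant column receives a weight of zero, so it no longer contributes to the weighted feature vectors that a classifier would be trained on.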

Project Objectives

To design an automated system that identifies the class label of a tweet: human damage, infrastructure damage, or non-damage.
To use recent textual and language-context features to predict damage from a tweet effectively.
To provide an interface where a user can test a tweet and obtain the predicted class label quickly.

Project Implementation Method

First, gather the dataset of damage-related tweets, study it, and analyze it.

Clean the data by removing extraneous and unnecessary content.

Assign the proper class label to each tweet in the dataset, i.e., human damage, infrastructure damage, or non-damage.

Apply preprocessing steps.

Then compute the features, i.e., low-level lexical, top-frequency word, and syntactic features, on the selected dataset.
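The cleaning, preprocessing, and feature steps above can be sketched as follows. This is an illustrative example only: the specific cleaning rules (dropping URLs and @mentions), the two lexical counts, and all function names are assumptions, since the proposal does not list them. Syntactic features are omitted because they would need a part-of-speech tagger.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def preprocess(tweet):
    """Lowercase, strip URLs and @mentions, and split into word tokens."""
    text = URL_RE.sub(" ", tweet.lower())
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    return TOKEN_RE.findall(text)


def top_frequency_words(token_lists, k):
    """Return the k most frequent words across all tokenized tweets."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    return [word for word, _ in counts.most_common(k)]


def feature_vector(tokens, vocabulary):
    """Two low-level lexical counts followed by top-frequency word counts."""
    counts = Counter(tokens)
    lexical = [len(tokens), sum(tok.isdigit() for tok in tokens)]
    return lexical + [counts[word] for word in vocabulary]


# Made-up tweets, used only to exercise the functions.
tweets = [
    "Bridge collapsed near the river http://example.com #earthquake",
    "@user 12 people injured after the earthquake",
    "Praying for everyone tonight",
]
tokenized = [preprocess(t) for t in tweets]
vocab = top_frequency_words(tokenized, k=3)
vectors = [feature_vector(tokens, vocab) for tokens in tokenized]
```

Each tweet becomes a fixed-length numeric vector: the token count, the number of purely numeric tokens, and one count per top-frequency word.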

Benefits of the Project

Identification of damage assessment tweets is beneficial to both humanitarian organizations and victims during a disaster. The goal of this project is to classify damage assessment tweets. This will help humanitarian organizations understand the severity of the damage and provide services to people according to the urgency of their needs during a disaster. The main stakeholders of this project are disaster management and rehabilitation organizations (e.g., NDMA, PDMA) and NGOs. The objective of this project is not to make a profit but to serve humanity.

Technical Details of Final Deliverable

First, we select the dataset and the disaster, then select the features to compute. Next, we apply the relevant preprocessing techniques and assign weights to the features. We then select the machine learning model parameters, the evaluation metric, and the validation technique. The model is trained with the selected parameters, and its performance is analyzed using metrics such as accuracy, precision, recall, and F-measure. If the metrics are satisfactory, we save the model; otherwise, we train the model again with different options. In addition, we provide an interface so that a user can test the trained model on unseen data and identify the class label of a tweet (human damage, infrastructure damage, or non-damage).
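The evaluation metrics named above can be computed as in the following sketch. It assumes macro-averaging over the three classes, which the proposal does not specify. The function name and the example labels are hypothetical.

```python
def evaluate(y_true, y_pred, labels):
    """Return accuracy and macro-averaged precision, recall, and F-measure."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f_measures = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_measure = 2 * precision * recall / (precision + recall)
        else:
            f_measure = 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f_measure)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_measure": sum(f_measures) / n,
    }


# Made-up gold labels and predictions for six tweets.
labels = ["human", "infrastructure", "non-damage"]
y_true = ["human", "human", "infrastructure", "infrastructure", "non-damage", "non-damage"]
y_pred = ["human", "infrastructure", "infrastructure", "infrastructure", "non-damage", "non-damage"]

scores = evaluate(y_true, y_pred, labels)
```

The resulting dictionary could then be compared against whatever threshold the team considers satisfactory before saving the model.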

Final Deliverable of the Project: Software System
Core Industry: Education
Other Industries:
Core Technology: Artificial Intelligence (AI)
Other Technologies:
Sustainable Development Goals: Good Health and Well-Being for People

Required Resources

Item Name	Type	No. of Units	Per Unit Cost (in Rs)	Total (in Rs)
Printing	Miscellaneous	7	400	2800
			Total (in Rs)	2800
