
Utilization of social media in floods assessment using data mining techniques

Floods are among the most devastating types of disasters in terms of loss of human life and social and financial damage. Authoritative data from flood gauges are scarce in arid regions because the dry climate causes these measuring devices to malfunction. Hence, social media data could be a useful t...


Bibliographic Details
Main Authors: Khan, Qasim; Kalbus, Edda; Zaki, Nazar; Mohamed, Mohamed Mostafa
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9037947/
https://www.ncbi.nlm.nih.gov/pubmed/35468157
http://dx.doi.org/10.1371/journal.pone.0267079
author Khan, Qasim
Kalbus, Edda
Zaki, Nazar
Mohamed, Mohamed Mostafa
collection PubMed
description Floods are among the most devastating types of disasters in terms of loss of human life and social and financial damage. Authoritative data from flood gauges are scarce in arid regions because the dry climate causes these measuring devices to malfunction. Social media, where a wealth of information is available online, could therefore be a useful data source. This study investigates the reliability and quality of flood-related data collected from social media, particularly for an arid region where the use of flow gauges is limited. Social media data (text, images, and videos) related to a flood event were analyzed using a machine learning approach. To this end, digital data (758 images and 1413 video frames) were converted into numeric values through a ResNet50 model using the VGG-16 architecture. The numeric data from images, videos, and text were then classified using different machine learning algorithms. Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to evaluate and compare the performance of these algorithms. This novel approach to studying the quality of social media data could be a reliable alternative in the absence of real-time flow gauge data. A flash flood that occurred in the United Arab Emirates (UAE) from March 7–11, 2016 was selected as the focus of this study. Random forest showed the highest accuracy, 80.18%, compared with the five other classifiers for images and videos. Precipitation/rainfall data were used to validate the social media data and showed a significant relationship between rainfall and the number of posts. The validity and accuracy of the machine learning models were assessed using the area under the curve, the precision-recall curve, root mean square error, and kappa statistics. Data from YouTube videos were found to have the highest accuracy, followed by Facebook, Flickr, Twitter, and Instagram. These results show that social media data could be used when gauge data are unavailable.
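The abstract names three of its evaluation measures (AUC, kappa statistics, and root mean square error) without defining them. The following is a minimal sketch of how such measures can be computed for a binary flood / not-flood classifier, using only the Python standard library. The function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical illustrations; they are not the authors' code, data, or settings.

```python
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the probability that a random positive example is scored
    above a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels and predictions,
    corrected for the agreement expected by chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    classes = set(y_true) | set(y_pred)
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes)
    if expected == 1:  # only one class present everywhere: agreement is trivial
        return 1.0
    return (observed - expected) / (1 - expected)


def rmse(y_true, y_pred):
    """Root mean square error between true values and predictions."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical toy data: 1 = flood-related post, 0 = unrelated post.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]  # assumed classifier probabilities
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("AUC:  ", round(roc_auc(labels, scores), 3))
print("kappa:", round(cohen_kappa(labels, preds), 3))
print("RMSE: ", round(rmse(labels, scores), 3))
```

AUC is computed from the raw scores and so does not depend on the threshold, whereas kappa is computed from thresholded predictions; this is one reason the two can rank classifiers differently.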
format Online
Article
Text
id pubmed-9037947
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9037947
2022-04-26
Utilization of social media in floods assessment using data mining techniques
Khan, Qasim
Kalbus, Edda
Zaki, Nazar
Mohamed, Mohamed Mostafa
PLoS One
Research Article
Public Library of Science 2022-04-25
/pmc/articles/PMC9037947/
/pubmed/35468157
http://dx.doi.org/10.1371/journal.pone.0267079
Text en
© 2022 Khan et al
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Utilization of social media in floods assessment using data mining techniques
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9037947/
https://www.ncbi.nlm.nih.gov/pubmed/35468157
http://dx.doi.org/10.1371/journal.pone.0267079