Brazilian disaster datasets and real-world instances for optimization and machine learning
We present comprehensive datasets of Brazilian disasters from January 2003 to February 2021 as well as real-world optimization instances built from these data. The data were gathered from a series of openly available reports obtained from different government and institutional sources. Afterward...
Main authors: | Veloso, Rafaela; Cespedes, Juliana; Caunhye, Aakil; Alem, Douglas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8931360/ https://www.ncbi.nlm.nih.gov/pubmed/35310816 http://dx.doi.org/10.1016/j.dib.2022.108012 |
_version_ | 1784671242336337920 |
---|---|
author | Veloso, Rafaela Cespedes, Juliana Caunhye, Aakil Alem, Douglas |
author_facet | Veloso, Rafaela Cespedes, Juliana Caunhye, Aakil Alem, Douglas |
author_sort | Veloso, Rafaela |
collection | PubMed |
description | We present comprehensive datasets of Brazilian disasters from January 2003 to February 2021 as well as real-world optimization instances built from these data. The data were gathered from a series of openly available reports obtained from different government and institutional sources. Afterwards, data consolidation and summarization were carried out using Excel and Python. The datasets include 9 types of disaster, such as flash floods, landslides and droughts, and the corresponding number of affected people over an 18-year (218-month) observation period for 5,402 Brazilian municipalities, totaling more than 65,000 observations. Data on relevant geographical, demographic and socioeconomic aspects of the affected municipalities are also provided. These encompass geographic coordinates, regions, population, income and development indicators, amongst other information. From a statistical point of view, the data on disasters can support a number of applications using both supervised and unsupervised machine learning techniques, for example time series analysis or other dynamic models that use socioeconomic data as explanatory variables (i.e., data on the size of the poor population, income, education and general development). The geographic dataset can be useful for aggregating analyses across the various forms of territorial organization and allows the data to be visualized in maps. All the aforementioned data can also be used to devise realistic optimization instances related to diverse humanitarian logistics and/or disaster management problems, such as facility location, location-allocation, vehicle routing, and so forth. In particular, we describe two real-world instances for the location-allocation problem studied in [1]. For that purpose, we use part of the given datasets and include other information, such as costs and distances, relevant to the optimization model. Although using real-world cases to test optimization approaches is a common and encouraged practice in Operations Research, comprehensive datasets and practical optimization instances, as presented in this article, are rarely described and/or available in the academic literature. |
format | Online Article Text |
id | pubmed-8931360 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-8931360 2022-03-19 Brazilian disaster datasets and real-world instances for optimization and machine learning Veloso, Rafaela Cespedes, Juliana Caunhye, Aakil Alem, Douglas Data Brief Data Article We present comprehensive datasets of Brazilian disasters from January 2003 to February 2021 as well as real-world optimization instances built from these data. The data were gathered from a series of openly available reports obtained from different government and institutional sources. Afterwards, data consolidation and summarization were carried out using Excel and Python. The datasets include 9 types of disaster, such as flash floods, landslides and droughts, and the corresponding number of affected people over an 18-year (218-month) observation period for 5,402 Brazilian municipalities, totaling more than 65,000 observations. Data on relevant geographical, demographic and socioeconomic aspects of the affected municipalities are also provided. These encompass geographic coordinates, regions, population, income and development indicators, amongst other information. From a statistical point of view, the data on disasters can support a number of applications using both supervised and unsupervised machine learning techniques, for example time series analysis or other dynamic models that use socioeconomic data as explanatory variables (i.e., data on the size of the poor population, income, education and general development). The geographic dataset can be useful for aggregating analyses across the various forms of territorial organization and allows the data to be visualized in maps. All the aforementioned data can also be used to devise realistic optimization instances related to diverse humanitarian logistics and/or disaster management problems, such as facility location, location-allocation, vehicle routing, and so forth. In particular, we describe two real-world instances for the location-allocation problem studied in [1]. For that purpose, we use part of the given datasets and include other information, such as costs and distances, relevant to the optimization model. Although using real-world cases to test optimization approaches is a common and encouraged practice in Operations Research, comprehensive datasets and practical optimization instances, as presented in this article, are rarely described and/or available in the academic literature. Elsevier 2022-03-05 /pmc/articles/PMC8931360/ /pubmed/35310816 http://dx.doi.org/10.1016/j.dib.2022.108012 Text en © 2022 The Authors. Published by Elsevier Inc. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Data Article Veloso, Rafaela Cespedes, Juliana Caunhye, Aakil Alem, Douglas Brazilian disaster datasets and real-world instances for optimization and machine learning |
title | Brazilian disaster datasets and real-world instances for optimization and machine learning |
title_full | Brazilian disaster datasets and real-world instances for optimization and machine learning |
title_fullStr | Brazilian disaster datasets and real-world instances for optimization and machine learning |
title_full_unstemmed | Brazilian disaster datasets and real-world instances for optimization and machine learning |
title_short | Brazilian disaster datasets and real-world instances for optimization and machine learning |
title_sort | brazilian disaster datasets and real-world instances for optimization and machine learning |
topic | Data Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8931360/ https://www.ncbi.nlm.nih.gov/pubmed/35310816 http://dx.doi.org/10.1016/j.dib.2022.108012 |
work_keys_str_mv | AT velosorafaela braziliandisasterdatasetsandrealworldinstancesforoptimizationandmachinelearning AT cespedesjuliana braziliandisasterdatasetsandrealworldinstancesforoptimizationandmachinelearning AT caunhyeaakil braziliandisasterdatasetsandrealworldinstancesforoptimizationandmachinelearning AT alemdouglas braziliandisasterdatasetsandrealworldinstancesforoptimizationandmachinelearning |