Automated cleansing and harmonization of international trade data

Large volumes of data are becoming increasingly available and can be very valuable for the analysis of different phenomena. These data can originate from multiple sources and be recorded in diverse formats, requiring preliminary scrutiny in order to be further used in scientific analyses. This first...

Bibliographic Details
Main authors: Oliveira, Sandra; Capinha, César; Rocha, Jorge
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8720831/
https://www.ncbi.nlm.nih.gov/pubmed/35004201
http://dx.doi.org/10.1016/j.mex.2021.101567
_version_ 1784625207922655232
author Oliveira, Sandra
Capinha, César
Rocha, Jorge
author_facet Oliveira, Sandra
Capinha, César
Rocha, Jorge
author_sort Oliveira, Sandra
collection PubMed
description Large volumes of data are becoming increasingly available and can be very valuable for the analysis of different phenomena. These data can originate from multiple sources and be recorded in diverse formats, and they require preliminary scrutiny before they can be used in scientific analyses. This first crucial phase of filtering and cleansing data is usually cumbersome and time-consuming, but automated routines can be developed to help researchers. A routine written in R is presented here to screen, harmonize and aggregate international trade data representing the trade flows between countries for specific products, in a timeframe that covers monthly flows for at least 15 years for most countries. The R script implementing these routines is provided and can be easily adapted to other datasets with similar issues. • A step-by-step procedure for cleansing and harmonizing international trade data, using the R programming language, is presented • Automated routines are very effective in obtaining robust and filtered data inputs to integrate in scientific models • Spatial and temporal patterns of worldwide trade relations can be explored to enhance our understanding of various associated phenomena
format Online
Article
Text
id pubmed-8720831
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8720831 2022-01-07 Automated cleansing and harmonization of international trade data Oliveira, Sandra Capinha, César Rocha, Jorge MethodsX Method Article Large volumes of data are becoming increasingly available and can be very valuable for the analysis of different phenomena. These data can originate from multiple sources and be recorded in diverse formats, requiring preliminary scrutiny in order to be further used in scientific analyses. This first crucial phase of filtering and cleansing data is usually a cumbersome and time-consuming task, but automated routines can be developed to help researchers. A routine created with the R language is here presented, to screen, harmonize and aggregate international trade data, representing the trade flows between countries for specific products, in a timeframe that covers monthly flows for at least 15 years for most countries. The R script implementing these routines is provided, being easily adapted to other datasets with similar issues. • A step-by-step procedure for cleansing and harmonizing international trade data, using R programming language, is presented • Automated routines are very effective in obtaining robust and filtered data inputs to integrate in scientific models • Spatial and temporal patterns of worldwide trade relations can be explored to enhance our understanding of various associated phenomena Elsevier 2021-11-02 /pmc/articles/PMC8720831/ /pubmed/35004201 http://dx.doi.org/10.1016/j.mex.2021.101567 Text en © 2021 The Authors. Published by Elsevier B.V. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Method Article
Oliveira, Sandra
Capinha, César
Rocha, Jorge
Automated cleansing and harmonization of international trade data
title Automated cleansing and harmonization of international trade data
title_full Automated cleansing and harmonization of international trade data
title_fullStr Automated cleansing and harmonization of international trade data
title_full_unstemmed Automated cleansing and harmonization of international trade data
title_short Automated cleansing and harmonization of international trade data
title_sort automated cleansing and harmonization of international trade data
topic Method Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8720831/
https://www.ncbi.nlm.nih.gov/pubmed/35004201
http://dx.doi.org/10.1016/j.mex.2021.101567
work_keys_str_mv AT oliveirasandra automatedcleansingandharmonizationofinternationaltradedata
AT capinhacesar automatedcleansingandharmonizationofinternationaltradedata
AT rochajorge automatedcleansingandharmonizationofinternationaltradedata