COVID-Scraper: An Open-Source Toolset for Automatically Scraping and Processing Global Multi-Scale Spatiotemporal COVID-19 Records
In 2019, COVID-19 quickly spread across the world, infecting billions of people and disrupting the normal lives of citizens in every country. Governments, organizations, and research institutions all over the world are dedicating vast resources to research effective strategies to fight this rapidly...
Format: | Online Article Text |
---|---|
Language: | English |
Published: | IEEE 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545187/ https://www.ncbi.nlm.nih.gov/pubmed/34812396 http://dx.doi.org/10.1109/ACCESS.2021.3085682 |
_version_ | 1784589964599623680 |
---|---|
collection | PubMed |
description | In 2019, COVID-19 quickly spread across the world, infecting billions of people and disrupting the normal lives of citizens in every country. Governments, organizations, and research institutions all over the world are dedicating vast resources to research effective strategies to fight this rapidly propagating virus. With virus testing, most countries publish the number of confirmed cases, dead cases, recovered cases, and locations routinely through various channels and forms. This important data source has enabled researchers worldwide to perform different COVID-19 scientific studies, such as modeling this virus’s spreading patterns, developing prevention strategies, and studying the impact of COVID-19 on other aspects of society. However, one major challenge is that there is no standardized, updated, and high-quality data product that covers COVID-19 cases data internationally. This is because different countries may publish their data in unique channels, formats, and time intervals, which hinders researchers from fetching necessary COVID-19 datasets effectively, especially for fine-scale studies. Although existing solutions such as Johns Hopkins COVID-19 Dashboard and 1point3acres COVID-19 tracker are widely used, it is difficult for users to access their original dataset and customize those data to meet specific requirements in categories, data structure, and data source selection. To address this challenge, we developed a toolset using cloud-based web scraping to extract, refine, unify, and store COVID-19 cases data at multiple scales for all available countries around the world automatically. The toolset then publishes the data for public access in an effective manner, which could offer users a real-time COVID-19 dynamic dataset with a global view. Two case studies are presented about how to utilize the datasets. This toolset can also be easily extended to fulfill other purposes with its open-source nature. |
format | Online Article Text |
id | pubmed-8545187 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | IEEE |
record_format | MEDLINE/PubMed |
spelling | pubmed-8545187 2021-11-18 COVID-Scraper: An Open-Source Toolset for Automatically Scraping and Processing Global Multi-Scale Spatiotemporal COVID-19 Records IEEE Access Science - General In 2019, COVID-19 quickly spread across the world, infecting billions of people and disrupting the normal lives of citizens in every country. Governments, organizations, and research institutions all over the world are dedicating vast resources to research effective strategies to fight this rapidly propagating virus. With virus testing, most countries publish the number of confirmed cases, dead cases, recovered cases, and locations routinely through various channels and forms. This important data source has enabled researchers worldwide to perform different COVID-19 scientific studies, such as modeling this virus’s spreading patterns, developing prevention strategies, and studying the impact of COVID-19 on other aspects of society. However, one major challenge is that there is no standardized, updated, and high-quality data product that covers COVID-19 cases data internationally. This is because different countries may publish their data in unique channels, formats, and time intervals, which hinders researchers from fetching necessary COVID-19 datasets effectively, especially for fine-scale studies. Although existing solutions such as Johns Hopkins COVID-19 Dashboard and 1point3acres COVID-19 tracker are widely used, it is difficult for users to access their original dataset and customize those data to meet specific requirements in categories, data structure, and data source selection. To address this challenge, we developed a toolset using cloud-based web scraping to extract, refine, unify, and store COVID-19 cases data at multiple scales for all available countries around the world automatically. The toolset then publishes the data for public access in an effective manner, which could offer users a real-time COVID-19 dynamic dataset with a global view. Two case studies are presented about how to utilize the datasets. This toolset can also be easily extended to fulfill other purposes with its open-source nature. IEEE 2021-06-03 /pmc/articles/PMC8545187/ /pubmed/34812396 http://dx.doi.org/10.1109/ACCESS.2021.3085682 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Science - General COVID-Scraper: An Open-Source Toolset for Automatically Scraping and Processing Global Multi-Scale Spatiotemporal COVID-19 Records |
title | COVID-Scraper: An Open-Source Toolset for Automatically Scraping and Processing Global Multi-Scale Spatiotemporal COVID-19 Records |
title_full | COVID-Scraper: An Open-Source Toolset for Automatically Scraping and Processing Global Multi-Scale Spatiotemporal COVID-19 Records |
title_fullStr | COVID-Scraper: An Open-Source Toolset for Automatically Scraping and Processing Global Multi-Scale Spatiotemporal COVID-19 Records |
title_full_unstemmed | COVID-Scraper: An Open-Source Toolset for Automatically Scraping and Processing Global Multi-Scale Spatiotemporal COVID-19 Records |
title_short | COVID-Scraper: An Open-Source Toolset for Automatically Scraping and Processing Global Multi-Scale Spatiotemporal COVID-19 Records |
title_sort | covid-scraper: an open-source toolset for automatically scraping and processing global multi-scale spatiotemporal covid-19 records |
topic | Science - General |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545187/ https://www.ncbi.nlm.nih.gov/pubmed/34812396 http://dx.doi.org/10.1109/ACCESS.2021.3085682 |
work_keys_str_mv | AT covidscraperanopensourcetoolsetforautomaticallyscrapingandprocessingglobalmultiscalespatiotemporalcovid19records |