A data science perspective of real-world COVID-19 databases
The COVID-19 pandemic has devastated the lives of millions of people worldwide and damaged the economy of many countries. While the negative impact of the pandemic on mankind is unimaginable, this pandemic has triggered new research and innovation in the use of artificial intelligence for developing...
Main authors: | Prasanna, Shivika; Rao, Praveen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8342407/ http://dx.doi.org/10.1016/B978-0-323-89777-8.00008-7 |
_version_ | 1783734062529118208 |
---|---|
author | Prasanna, Shivika Rao, Praveen |
author_facet | Prasanna, Shivika Rao, Praveen |
author_sort | Prasanna, Shivika |
collection | PubMed |
description | The COVID-19 pandemic has devastated the lives of millions of people worldwide and damaged the economy of many countries. While the negative impact of the pandemic on mankind is unimaginable, this pandemic has triggered new research and innovation in the use of artificial intelligence for developing solutions to better understand and mitigate the pandemic. Several valuable datasets have been made available by different organizations and research groups. In this chapter, we provide an overview of real-world COVID-19 data sources available for developing novel applications and solutions for the pandemic. We provide a comparison between them from a data science perspective. Next, we delve deep into the Cerner Real-World Data for COVID-19. We discuss the schema of the database, data quality issues, data wrangling using Apache Spark, and data analysis using popular machine learning techniques. Specifically, we provide examples of querying the database, training machine learning models, and visualization. We also discuss the technical challenges that we encountered and how we overcame them to complete multiple clinical studies on COVID-19. |
format | Online Article Text |
id | pubmed-8342407 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-83424072021-08-06 A data science perspective of real-world COVID-19 databases Prasanna, Shivika Rao, Praveen Leveraging Artificial Intelligence in Global Epidemics Article The COVID-19 pandemic has devastated the lives of millions of people worldwide and damaged the economy of many countries. While the negative impact of the pandemic on mankind is unimaginable, this pandemic has triggered new research and innovation in the use of artificial intelligence for developing solutions to better understand and mitigate the pandemic. Several valuable datasets have been made available by different organizations and research groups. In this chapter, we provide an overview of real-world COVID-19 data sources available for developing novel applications and solutions for the pandemic. We provide a comparison between them from a data science perspective. Next, we delve deep into the Cerner Real-World Data for COVID-19. We discuss the schema of the database, data quality issues, data wrangling using Apache Spark, and data analysis using popular machine learning techniques. Specifically, we provide examples of querying the database, training machine learning models, and visualization. We also discuss the technical challenges that we encountered and how we overcame them to complete multiple clinical studies on COVID-19. 2021 2021-08-06 /pmc/articles/PMC8342407/ http://dx.doi.org/10.1016/B978-0-323-89777-8.00008-7 Text en Copyright © 2021 Elsevier Inc. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Prasanna, Shivika Rao, Praveen A data science perspective of real-world COVID-19 databases |
title | A data science perspective of real-world COVID-19 databases |
title_full | A data science perspective of real-world COVID-19 databases |
title_fullStr | A data science perspective of real-world COVID-19 databases |
title_full_unstemmed | A data science perspective of real-world COVID-19 databases |
title_short | A data science perspective of real-world COVID-19 databases |
title_sort | data science perspective of real-world covid-19 databases |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8342407/ http://dx.doi.org/10.1016/B978-0-323-89777-8.00008-7 |
work_keys_str_mv | AT prasannashivika adatascienceperspectiveofrealworldcovid19databases AT raopraveen adatascienceperspectiveofrealworldcovid19databases AT prasannashivika datascienceperspectiveofrealworldcovid19databases AT raopraveen datascienceperspectiveofrealworldcovid19databases |