A data science perspective of real-world COVID-19 databases
Main authors: | , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8342407/ http://dx.doi.org/10.1016/B978-0-323-89777-8.00008-7 |
Summary: | The COVID-19 pandemic has devastated the lives of millions of people worldwide and damaged the economies of many countries. While the negative impact of the pandemic on mankind is unimaginable, it has also triggered new research and innovation in the use of artificial intelligence to better understand and mitigate the pandemic. Several valuable datasets have been made available by different organizations and research groups. In this chapter, we provide an overview of real-world COVID-19 data sources available for developing novel applications and solutions for the pandemic, and we compare them from a data science perspective. Next, we examine the Cerner Real-World Data for COVID-19 in depth. We discuss the schema of the database, data quality issues, data wrangling using Apache Spark, and data analysis using popular machine learning techniques. Specifically, we provide examples of querying the database, training machine learning models, and visualizing the results. We also discuss the technical challenges that we encountered and how we overcame them to complete multiple clinical studies on COVID-19. |
---|