An analysis to identify the important variables for the spread of COVID-19 using numerical techniques and data science

Considering system theory, the socio-economic variables that constitute a society should be able to capture a system response such as the number of weekly COVID-19 cases. This paper presents a numerical approach to answer two vital questions: which variables are more important and how...

Bibliographic Details
Main authors: Pasha, Deepro F., Lundeen, Alex, Yeasmin, Dilruba, Pasha, M. Fayzul K.
Format: Online Article Text
Language: English
Published: The Author(s). Published by Elsevier Ltd. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7834342/
http://dx.doi.org/10.1016/j.cscee.2020.100067
author Pasha, Deepro F.
Lundeen, Alex
Yeasmin, Dilruba
Pasha, M. Fayzul K.
collection PubMed
description Considering system theory, the socio-economic variables that constitute a society should be able to capture a system response such as the number of weekly COVID-19 cases. This paper presents a numerical approach to answer two vital questions: which variables are more important, and how many variables are needed to capture the dynamics of the spread. Using the theory of least squares regression, two types of problems were set up and solved using multilinear regression (MLR) and a nonlinear powered function, referred to as NLR in this study. Numerical techniques were applied to pre- and post-process the data and the vast number of outputs. A total of 43 socio-economic and meteorological variables from 31 counties in California, United States, resulted in about 37.4 million combinations for the analysis. Results show that variables related to total population, household income, occupation, and transportation are more important than the others. The frequency with which a variable attains a higher correlation increases as more variables are combined with it. Similarly, correlation increases as the number of variables in a combination increases. Some 5-variable combinations can capture the dynamics of the spread with high accuracy, with correlation coefficients as high as 0.985.
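The combination search described in the abstract can be illustrated with a short sketch. This is not the authors' code: the function names (`fit_mlr_correlation`, `rank_combinations`, `count_subsets`) are hypothetical and only the MLR case is shown, since the record does not give the exact form of the NLR model. Each subset of candidate variables is fitted by ordinary least squares and scored by the correlation coefficient between fitted and observed values.

```python
from itertools import combinations
from math import comb

import numpy as np


def count_subsets(n_vars, max_size):
    """Number of variable subsets of size 1..max_size drawn from n_vars."""
    return sum(comb(n_vars, k) for k in range(1, max_size + 1))


def fit_mlr_correlation(X, y):
    """Fit y ~ b0 + X @ b by least squares; return Pearson r of fit vs. y."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    return float(np.corrcoef(fitted, y)[0, 1])


def rank_combinations(X, y, max_size):
    """Score every column subset up to max_size; best correlation first."""
    results = []
    for k in range(1, max_size + 1):
        for cols in combinations(range(X.shape[1]), k):
            r = fit_mlr_correlation(X[:, list(cols)], y)
            results.append((r, cols))
    results.sort(reverse=True)
    return results
```

Called on an observation matrix `X` (one row per county, one column per variable) and a response vector `y`, `rank_combinations(X, y, max_size=5)` returns `(correlation, column_indices)` pairs. Because in-sample least squares fits never get worse when a variable is added, the best correlation at each subset size is non-decreasing, which is consistent with the trend the abstract reports.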
format Online
Article
Text
id pubmed-7834342
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher The Author(s). Published by Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-7834342 2021-01-26 An analysis to identify the important variables for the spread of COVID-19 using numerical techniques and data science Pasha, Deepro F. Lundeen, Alex Yeasmin, Dilruba Pasha, M. Fayzul K. Case Studies in Chemical and Environmental Engineering Article The Author(s). Published by Elsevier Ltd. 2021-06 2020-12-11 /pmc/articles/PMC7834342/ http://dx.doi.org/10.1016/j.cscee.2020.100067 Text en © 2020 The Author(s)
Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title An analysis to identify the important variables for the spread of COVID-19 using numerical techniques and data science
topic Article