
Prediction of the number of COVID-19 confirmed cases based on K-means-LSTM

COVID-19 is a pandemic disease that began to spread rapidly in the US, with the first case detected on January 19, 2020, in Washington State. March 9, 2020, and then quickly increased with total cases of 25,739 as of April 20, 2020. Although most people with coronavirus, 81% according to the U.S. Ce...


Bibliographic Details
Main authors: Vadyala, Shashank Reddy; Betgeri, Sai Nethra; Sherer, Eric A.; Amritphale, Amod
Format: Online Article Text
Language: English
Published: The Authors. Published by Elsevier Inc. 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378999/
https://www.ncbi.nlm.nih.gov/pubmed/35083430
http://dx.doi.org/10.1016/j.array.2021.100085
_version_ 1783740918724034560
author Vadyala, Shashank Reddy
Betgeri, Sai Nethra
Sherer, Eric A.
Amritphale, Amod
collection PubMed
description COVID-19 is a pandemic disease that began to spread rapidly in the US, with the first case detected on January 19, 2020, in Washington State. March 9, 2020, and then quickly increased with total cases of 25,739 as of April 20, 2020. Although most people with coronavirus (81%, according to the U.S. Centers for Disease Control and Prevention (CDC)) will have few or mild symptoms, others may rely on a ventilator to breathe or may not be able to breathe at all. SEIR models are broadly applicable for predicting population outcomes across a variety of diseases. However, many researchers use these models without validating the necessary hypotheses. Far too many researchers “overfit” the data by using too many predictor variables and small sample sizes to create models. Models developed this way are unlikely to withstand a validity check on a separate population or region. Without attempting such validation, the researcher remains unaware that overfitting has occurred. In this paper, we present a combination algorithm that combines region-based similar-day feature selection using XGBoost, K-Means, and long short-term memory (LSTM) neural networks to construct a prediction model (i.e., K-Means-LSTM) for short-term forecasting of COVID-19 cases in the state of Louisiana, USA. The weighted k-means algorithm based on extreme gradient boosting is used to evaluate the similarity between the forecasts and past days. The results show that the K-Means-LSTM method has higher accuracy, with an RMSE of 601.20, whereas the SEIR model has an RMSE of 3615.83.
format Online
Article
Text
id pubmed-8378999
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher The Authors. Published by Elsevier Inc.
record_format MEDLINE/PubMed
spelling pubmed-8378999 2021-08-23 Prediction of the number of COVID-19 confirmed cases based on K-means-LSTM Vadyala, Shashank Reddy Betgeri, Sai Nethra Sherer, Eric A. Amritphale, Amod Array (N Y) Article The Authors. Published by Elsevier Inc. 2021-09 2021-08-21 /pmc/articles/PMC8378999/ /pubmed/35083430 http://dx.doi.org/10.1016/j.array.2021.100085 Text en © 2021 The Authors. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre, including this research content, immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database, with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Prediction of the number of COVID-19 confirmed cases based on K-means-LSTM
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378999/
https://www.ncbi.nlm.nih.gov/pubmed/35083430
http://dx.doi.org/10.1016/j.array.2021.100085