Prediction of the number of COVID-19 confirmed cases based on K-means-LSTM
COVID-19 is a pandemic disease that began to spread rapidly in the US, with the first case detected on January 19, 2020, in Washington State. March 9, 2020, and then quickly increased with total cases of 25,739 as of April 20, 2020. Although most people with coronavirus (81%), according to the U.S. Ce...
Main Authors: | Vadyala, Shashank Reddy; Betgeri, Sai Nethra; Sherer, Eric A.; Amritphale, Amod |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Authors. Published by Elsevier Inc. 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378999/ https://www.ncbi.nlm.nih.gov/pubmed/35083430 http://dx.doi.org/10.1016/j.array.2021.100085 |
_version_ | 1783740918724034560 |
---|---|
author | Vadyala, Shashank Reddy; Betgeri, Sai Nethra; Sherer, Eric A.; Amritphale, Amod |
collection | PubMed |
description | COVID-19 is a pandemic disease that began to spread rapidly in the US, with the first case detected on January 19, 2020, in Washington State. March 9, 2020, and then quickly increased with total cases of 25,739 as of April 20, 2020. Although most people with coronavirus (81%), according to the U.S. Centers for Disease Control and Prevention (CDC), will have little to mild symptoms, others may rely on a ventilator to breathe, or may be unable to breathe at all. SEIR models are broadly applicable to predicting population outcomes for a variety of diseases. However, many researchers use these models without validating the necessary hypotheses. Far too many researchers “overfit” the data by building models with too many predictor variables and small sample sizes. Models developed this way are unlikely to pass a validity check on a separate population group or region. Without attempting such validation, the researcher remains unaware that overfitting has occurred. In this paper, we present an algorithm that combines region-based similar-day feature selection using XGBoost, K-Means, and long short-term memory (LSTM) neural networks to construct a prediction model (i.e., K-Means-LSTM) for short-term forecasting of COVID-19 cases in the state of Louisiana, USA. The weighted k-means algorithm based on extreme gradient boosting is used to evaluate the similarity between the forecast days and past days. The results show that the K-Means-LSTM method is more accurate, with an RMSE of 601.20, compared with an RMSE of 3615.83 for the SEIR model. |
format | Online Article Text |
id | pubmed-8378999 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | The Authors. Published by Elsevier Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8378999 2021-08-23 Prediction of the number of COVID-19 confirmed cases based on K-means-LSTM Vadyala, Shashank Reddy; Betgeri, Sai Nethra; Sherer, Eric A.; Amritphale, Amod Array (N Y) Article The Authors. Published by Elsevier Inc. 2021-09 2021-08-21 /pmc/articles/PMC8378999/ /pubmed/35083430 http://dx.doi.org/10.1016/j.array.2021.100085 Text en © 2021 The Authors Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
title | Prediction of the number of COVID-19 confirmed cases based on K-means-LSTM |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378999/ https://www.ncbi.nlm.nih.gov/pubmed/35083430 http://dx.doi.org/10.1016/j.array.2021.100085 |
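
The description above compares models by RMSE and selects similar past days using feature weights. The following is a minimal sketch of those two ideas, not the authors' implementation: it substitutes a plain nearest-neighbour lookup for the paper's weighted k-means clustering, and the function names, feature values, and weights are all hypothetical. In the paper the weights come from extreme gradient boosting; here they are simply made up.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def weighted_distance(x, y, weights):
    """Euclidean distance with each feature scaled by an importance weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def most_similar_days(target, history, weights, k=2):
    """Indices of the k past days closest to the target day.

    Assumption: a nearest-neighbour lookup stands in for weighted k-means.
    """
    order = sorted(
        range(len(history)),
        key=lambda i: weighted_distance(target, history[i], weights),
    )
    return order[:k]


# Hypothetical toy data: each past day is described by two made-up features.
history = [[10.0, 0.2], [50.0, 0.8], [12.0, 0.3], [48.0, 0.7]]
weights = [0.9, 0.1]  # assumed feature importances, not values from the paper
target = [11.0, 0.9]

similar = most_similar_days(target, history, weights, k=2)
error = rmse([0.0, 0.0], [3.0, 4.0])
```

In a fuller pipeline, the days selected this way would form the training window for the sequence model, and `rmse` would score its forecasts against held-out case counts.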