Predicting the impact of the third wave of COVID-19 in India using hybrid statistical machine learning models: A time series forecasting and sentiment analysis approach
BACKGROUND: Since January 2020, India has faced two waves of COVID-19; preparation for the upcoming waves is the primary challenge for public health sectors and governments. Therefore, it is important to forecast future cumulative confirmed cases to plan and implement control measures effectively. M...
Main authors: | Mohan, Sumit; Solanki, Anil Kumar; Taluja, Harish Kumar; Anuradha; Singh, Anuj |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Ltd. 2022 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8881817/ https://www.ncbi.nlm.nih.gov/pubmed/35240374 http://dx.doi.org/10.1016/j.compbiomed.2022.105354 |
_version_ | 1784659562460086272 |
---|---|
author | Mohan, Sumit; Solanki, Anil Kumar; Taluja, Harish Kumar; Anuradha; Singh, Anuj |
author_facet | Mohan, Sumit; Solanki, Anil Kumar; Taluja, Harish Kumar; Anuradha; Singh, Anuj |
author_sort | Mohan, Sumit |
collection | PubMed |
description | BACKGROUND: Since January 2020, India has faced two waves of COVID-19; preparing for upcoming waves is the primary challenge for the public health sector and governments. It is therefore important to forecast future cumulative confirmed cases so that control measures can be planned and implemented effectively. METHODS: This study proposed a hybrid autoregressive integrated moving average (ARIMA) and Prophet model to predict daily confirmed and cumulative confirmed cases. The built-in auto.arima function was first used to select the optimal hyperparameter values of the ARIMA model. The modified ARIMA model was then used to find the parameter combination that gave the best fit between the test and forecast data. Articles, blog posts, and news stories from virologists, scientists, and health experts about the third wave of COVID-19 were gathered using the Python web scraping package Beautiful Soup. Their opinions (sentiments) toward the potential third wave were analyzed using natural language processing (NLP) libraries. RESULTS: A spike in daily confirmed and cumulative confirmed cases was predicted in India over the next 180 days based on past time series data. The results were validated using various analytical tools and evaluation metrics, producing a root mean square error (RMSE) of 0.14 and a mean absolute percentage error (MAPE) of 0.06. The NLP results revealed negative sentiments in most articles and blogs, with few exceptions. CONCLUSION: The findings of this study suggest that there will be more active cases in the upcoming days. The proposed models can forecast future daily confirmed and cumulative confirmed cases. This study will help the country and states plan appropriate public health measures for the upcoming waves of COVID-19. |
format | Online Article Text |
id | pubmed-8881817 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8881817 2022-02-28 Mohan, Sumit; Solanki, Anil Kumar; Taluja, Harish Kumar; Anuradha; Singh, Anuj. Comput Biol Med. Article. Elsevier Ltd. 2022-05 2022-02-26 /pmc/articles/PMC8881817/ /pubmed/35240374 http://dx.doi.org/10.1016/j.compbiomed.2022.105354 Text en © 2022 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre, including this research content, immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database, with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Mohan, Sumit Solanki, Anil Kumar Taluja, Harish Kumar Anuradha Singh, Anuj Predicting the impact of the third wave of COVID-19 in India using hybrid statistical machine learning models: A time series forecasting and sentiment analysis approach |
title | Predicting the impact of the third wave of COVID-19 in India using hybrid statistical machine learning models: A time series forecasting and sentiment analysis approach |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8881817/ https://www.ncbi.nlm.nih.gov/pubmed/35240374 http://dx.doi.org/10.1016/j.compbiomed.2022.105354 |
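
The description field above reports two forecast-accuracy metrics, RMSE and MAPE. As a rough illustration of how such metrics are usually computed, here is a minimal Python sketch using the standard definitions. It is not the authors' code: the function names `rmse` and `mape`, the choice to express MAPE as a fraction rather than a percentage, and the sample numbers are all assumptions made for this example, and the record does not say how the series was scaled before the reported values were obtained.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return sum(ratios) / len(actual)


# Made-up illustrative values (assumption), not data from the study.
held_out = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print("RMSE:", rmse(held_out, predicted))
print("MAPE:", mape(held_out, predicted))
```

Both functions compare a held-out test series with the forecast for the same dates, which matches the "best fit between the test and forecast data" wording in the abstract. RMSE is in the units of the series, while MAPE is scale-free, so the two are best read together.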