Predicting the impact of the third wave of COVID-19 in India using hybrid statistical machine learning models: A time series forecasting and sentiment analysis approach

BACKGROUND: Since January 2020, India has faced two waves of COVID-19; preparation for the upcoming waves is the primary challenge for public health sectors and governments. Therefore, it is important to forecast future cumulative confirmed cases to plan and implement control measures effectively. M...

Bibliographic Details
Main Authors: Mohan, Sumit; Solanki, Anil Kumar; Taluja, Harish Kumar; Anuradha; Singh, Anuj
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8881817/
https://www.ncbi.nlm.nih.gov/pubmed/35240374
http://dx.doi.org/10.1016/j.compbiomed.2022.105354
author Mohan, Sumit
Solanki, Anil Kumar
Taluja, Harish Kumar
Anuradha
Singh, Anuj
author_sort Mohan, Sumit
collection PubMed
description BACKGROUND: Since January 2020, India has faced two waves of COVID-19; preparation for the upcoming waves is the primary challenge for public health sectors and governments. Therefore, it is important to forecast future cumulative confirmed cases to plan and implement control measures effectively. METHODS: This study proposed a hybrid autoregressive integrated moving average (ARIMA) and Prophet model to predict daily confirmed and cumulative confirmed cases. The built-in auto.arima function was first used to select the optimal hyperparameter values of the ARIMA model. Then, the modified ARIMA model was used to find the best fit between the test and forecast data to find the best model parameter combinations. Articles, blog posts, and news stories from virologists, scientists, and health experts related to the third wave of COVID-19 were gathered using the Python web scraping package Beautiful Soup. Their opinions (sentiments) toward the potential third wave were analyzed using natural language processing (NLP) libraries. RESULTS: A spike in daily confirmed and cumulative confirmed cases was predicted in India in the next 180 days based on past time series data. The results were validated using various analytical tools and evaluation metrics, producing a root mean square error (RMSE) of 0.14 and a mean absolute percentage error (MAPE) of 0.06. The NLP processing results revealed negative sentiments in most articles and blogs, with few exceptions. CONCLUSION: The findings of this study suggest that there will be more active cases in the upcoming days. The proposed models can forecast future daily confirmed and cumulative confirmed cases. This study will help the country and states plan appropriate public health measures for the upcoming waves of COVID-19.
format Online
Article
Text
id pubmed-8881817
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-8881817 2022-02-28 Predicting the impact of the third wave of COVID-19 in India using hybrid statistical machine learning models: A time series forecasting and sentiment analysis approach Mohan, Sumit; Solanki, Anil Kumar; Taluja, Harish Kumar; Anuradha; Singh, Anuj Comput Biol Med Article Elsevier Ltd. 2022-05 2022-02-26 /pmc/articles/PMC8881817/ /pubmed/35240374 http://dx.doi.org/10.1016/j.compbiomed.2022.105354 Text en © 2022 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Predicting the impact of the third wave of COVID-19 in India using hybrid statistical machine learning models: A time series forecasting and sentiment analysis approach
topic Article
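
The abstract in this record reports a root mean square error (RMSE) of 0.14 and a mean absolute percentage error (MAPE) of 0.06, but the record does not say how those values were scaled or which tool computed them. As a minimal sketch, assuming the standard textbook definitions and a MAPE expressed as a fraction rather than a percentage, the two metrics can be computed as follows. The function names and the sample numbers are hypothetical and are not taken from the study.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error as a fraction (0.06 means 6%)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # The ratio |a - f| / |a| is undefined for a zero observation.
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return sum(ratios) / len(actual)


# Hypothetical held-out observations and forecasts (illustration only).
actual_cases = [100.0, 200.0, 400.0]
forecast_cases = [110.0, 190.0, 400.0]

print("RMSE:", rmse(actual_cases, forecast_cases))
print("MAPE:", mape(actual_cases, forecast_cases))
```

RMSE is in the same units as the series, so its size depends on whether the data were normalized or log-transformed. MAPE is scale-free, which makes it easier to compare across daily and cumulative series.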