
Comparative analysis of machine learning approaches to analyze and predict the COVID-19 outbreak

BACKGROUND: Forecasting the timing of a forthcoming pandemic reduces the impact of disease by enabling precautionary steps such as public health messaging and raising awareness among doctors. With the continuous and rapid increase in the cumulative incidence of COVID-19, statistical and outbreak pre...


Bibliographic Details
Main authors:	Naeem, Muhammad; Yu, Jian; Aamir, Muhammad; Khan, Sajjad Ahmad; Adeleye, Olayinka; Khan, Zardad
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2021
Subjects:	Bioinformatics
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8725668/ https://www.ncbi.nlm.nih.gov/pubmed/35036527 http://dx.doi.org/10.7717/peerj-cs.746

author	Naeem, Muhammad; Yu, Jian; Aamir, Muhammad; Khan, Sajjad Ahmad; Adeleye, Olayinka; Khan, Zardad
author_sort	Naeem, Muhammad
collection	PubMed
description	BACKGROUND: Forecasting the timing of a forthcoming pandemic reduces the impact of disease by enabling precautionary steps such as public health messaging and raising awareness among doctors. With the continuous and rapid increase in the cumulative incidence of COVID-19, the research community is using statistical and outbreak prediction models, including various machine learning (ML) models, to track and predict the trend of the epidemic and to develop appropriate strategies to combat and manage its spread. METHODS: In this paper, we present a comparative analysis of ML approaches, including Support Vector Machine, Random Forest, K-Nearest Neighbor and Artificial Neural Network, for predicting the COVID-19 outbreak in the epidemiological domain. We first apply the autoregressive distributed lag (ARDL) method to identify and model the short- and long-run relationships in the COVID-19 time-series datasets. That is, we determine the lags between a response variable and its explanatory time-series variables, which serve as the independent variables. The variables found significant at their respective lags are then used in the regression model selected by the ARDL to predict and forecast the trend of the epidemic. RESULTS: Four statistical measures, Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Symmetric Mean Absolute Percentage Error (SMAPE), are used to assess model accuracy. The MAPE values of the best-selected models for confirmed, recovered and death cases are 0.003, 0.006 and 0.115, respectively, which fall under the category of highly accurate forecasts. In addition, we computed 15-day-ahead forecasts of daily deaths, recovered patients and confirmed patients; the forecast case counts fluctuated over time in all three series. The results also show the advantages of ML algorithms for supporting decision-making on evolving short-term policies.
format	Online Article Text
id	pubmed-8725668
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-8725668 2022-01-14 Comparative analysis of machine learning approaches to analyze and predict the COVID-19 outbreak Naeem, Muhammad; Yu, Jian; Aamir, Muhammad; Khan, Sajjad Ahmad; Adeleye, Olayinka; Khan, Zardad PeerJ Comput Sci Bioinformatics PeerJ Inc. 2021-12-16 /pmc/articles/PMC8725668/ /pubmed/35036527 http://dx.doi.org/10.7717/peerj-cs.746 Text en © 2021 Naeem et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title	Comparative analysis of machine learning approaches to analyze and predict the COVID-19 outbreak
topic	Bioinformatics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8725668/ https://www.ncbi.nlm.nih.gov/pubmed/35036527 http://dx.doi.org/10.7717/peerj-cs.746
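
The description field above names four accuracy measures (RMSE, MAE, MAPE and SMAPE) without stating their formulas. The following is a minimal sketch of the standard textbook definitions, not the authors' code. The function names and the sample numbers are hypothetical, MAPE and SMAPE are returned as fractions rather than percentages (the record does not say which scale the paper uses), and SMAPE has several variants in the literature, of which only one is shown.

```python
import math


def _check(actual, predicted):
    # Both series must be non-empty and of equal length.
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")


def rmse(actual, predicted):
    """Root Mean Square Error."""
    _check(actual, predicted)
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean Absolute Error."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error as a fraction.

    Assumes no actual value is zero.
    """
    _check(actual, predicted)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE as a fraction, using the 2|a - p| / (|a| + |p|) variant.

    Assumes |a| + |p| is never zero.
    """
    _check(actual, predicted)
    return sum(
        2 * abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical daily counts, for illustration only (not data from the paper).
actual = [100, 200, 400]
predicted = [110, 190, 400]

print("RMSE ", rmse(actual, predicted))
print("MAE  ", mae(actual, predicted))
print("MAPE ", mape(actual, predicted))
print("SMAPE", smape(actual, predicted))
```

On the fractional scale used in this sketch, a MAPE of 0.05 means the forecasts miss the observed values by 5% on average.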
