Estimation of COVID-19 Epidemiology Curve of the United States Using Genetic Programming Algorithm

Estimation of the epidemiology curve for the COVID-19 pandemic can be a very computationally challenging task. Thus far, there have been some implementations of artificial intelligence (AI) methods applied to develop an epidemiology curve for a specific country. However, most applied AI methods generat...

Bibliographic Details
Main authors: Anđelić, Nikola, Šegota, Sandi Baressi, Lorencin, Ivan, Jurilj, Zdravko, Šušteršič, Tijana, Blagojević, Anđela, Protić, Alen, Ćabov, Tomislav, Filipović, Nenad, Car, Zlatan
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7908446/
https://www.ncbi.nlm.nih.gov/pubmed/33499219
http://dx.doi.org/10.3390/ijerph18030959
author Anđelić, Nikola
Šegota, Sandi Baressi
Lorencin, Ivan
Jurilj, Zdravko
Šušteršič, Tijana
Blagojević, Anđela
Protić, Alen
Ćabov, Tomislav
Filipović, Nenad
Car, Zlatan
author_sort Anđelić, Nikola
collection PubMed
description Estimation of the epidemiology curve for the COVID-19 pandemic can be a very computationally challenging task. Thus far, some artificial intelligence (AI) methods have been applied to develop an epidemiology curve for a specific country. However, most applied AI methods generated models that are almost impossible to translate into a mathematical equation. In this paper, an AI method, the genetic programming (GP) algorithm, is used to develop a symbolic expression (mathematical equation) that can estimate the epidemiology curve for the entire U.S. with high accuracy. The GP algorithm is applied to a publicly available dataset that contains the number of confirmed, deceased and recovered patients for each U.S. state to obtain symbolic expressions for estimating the number of patients in each of these groups. The dataset consists of the latitude and longitude of the central location of each state and the number of patients in each of the goal groups for each day in the period 22 January 2020–3 December 2020. The symbolic expressions obtained for each state are summed to obtain symbolic expressions for estimating each of the patient groups (confirmed, deceased and recovered). These symbolic expressions are combined to obtain the symbolic expression for estimating the epidemiology curve for the entire U.S. The symbolic expressions for estimating the number of confirmed, deceased and recovered patients for each state achieved [Formula: see text] scores in the ranges 0.9406–0.9992, 0.9404–0.9998 and 0.9797–0.99955, respectively. These equations are summed to formulate symbolic expressions for estimating the number of confirmed, deceased and recovered patients for the entire U.S., which achieved [Formula: see text] scores of 0.9992, 0.9997 and 0.9996, respectively. Using these symbolic expressions, the equation for estimating the epidemiology curve for the entire U.S. is formulated, which achieved a [Formula: see text] score of 0.9933. The investigation showed that the GP algorithm can produce symbolic expressions for estimating the number of confirmed, recovered and deceased patients, as well as the epidemiology curve, not only for individual states but for the entire U.S. with very high accuracy.
format Online
Article
Text
id pubmed-7908446
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7908446 2021-02-27 Estimation of COVID-19 Epidemiology Curve of the United States Using Genetic Programming Algorithm Anđelić, Nikola Šegota, Sandi Baressi Lorencin, Ivan Jurilj, Zdravko Šušteršič, Tijana Blagojević, Anđela Protić, Alen Ćabov, Tomislav Filipović, Nenad Car, Zlatan Int J Environ Res Public Health Article Estimation of the epidemiology curve for the COVID-19 pandemic can be a very computationally challenging task. Thus far, some artificial intelligence (AI) methods have been applied to develop an epidemiology curve for a specific country. However, most applied AI methods generated models that are almost impossible to translate into a mathematical equation. In this paper, an AI method, the genetic programming (GP) algorithm, is used to develop a symbolic expression (mathematical equation) that can estimate the epidemiology curve for the entire U.S. with high accuracy. The GP algorithm is applied to a publicly available dataset that contains the number of confirmed, deceased and recovered patients for each U.S. state to obtain symbolic expressions for estimating the number of patients in each of these groups. The dataset consists of the latitude and longitude of the central location of each state and the number of patients in each of the goal groups for each day in the period 22 January 2020–3 December 2020. The symbolic expressions obtained for each state are summed to obtain symbolic expressions for estimating each of the patient groups (confirmed, deceased and recovered). These symbolic expressions are combined to obtain the symbolic expression for estimating the epidemiology curve for the entire U.S. The symbolic expressions for estimating the number of confirmed, deceased and recovered patients for each state achieved [Formula: see text] scores in the ranges 0.9406–0.9992, 0.9404–0.9998 and 0.9797–0.99955, respectively. These equations are summed to formulate symbolic expressions for estimating the number of confirmed, deceased and recovered patients for the entire U.S., which achieved [Formula: see text] scores of 0.9992, 0.9997 and 0.9996, respectively. Using these symbolic expressions, the equation for estimating the epidemiology curve for the entire U.S. is formulated, which achieved a [Formula: see text] score of 0.9933. The investigation showed that the GP algorithm can produce symbolic expressions for estimating the number of confirmed, recovered and deceased patients, as well as the epidemiology curve, not only for individual states but for the entire U.S. with very high accuracy. MDPI 2021-01-22 2021-02 /pmc/articles/PMC7908446/ /pubmed/33499219 http://dx.doi.org/10.3390/ijerph18030959 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Estimation of COVID-19 Epidemiology Curve of the United States Using Genetic Programming Algorithm
title_sort estimation of covid-19 epidemiology curve of the united states using genetic programming algorithm
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7908446/
https://www.ncbi.nlm.nih.gov/pubmed/33499219
http://dx.doi.org/10.3390/ijerph18030959