
A new COVID-19 detection method from human genome sequences using CpG island features and KNN classifier

Several viral epidemics, such as those caused by the severe acute respiratory syndrome coronavirus and the Middle East respiratory syndrome coronavirus, have been detected in the last two decades. The coronavirus disease 2019 (COVID-19) is a pandemic caused by a novel betacoronavirus called severe acute respiratory s...


Bibliographic Details
Main authors: Arslan, Hilal; Arslan, Hasan
Format: Online Article Text
Language: English
Published: Karabuk University. Publishing services by Elsevier B.V. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8064761/
http://dx.doi.org/10.1016/j.jestch.2020.12.026
_version_ 1783682203470790656
author Arslan, Hilal
Arslan, Hasan
author_facet Arslan, Hilal
Arslan, Hasan
author_sort Arslan, Hilal
collection PubMed
description Several viral epidemics, such as those caused by the severe acute respiratory syndrome coronavirus and the Middle East respiratory syndrome coronavirus, have been detected in the last two decades. The coronavirus disease 2019 (COVID-19) is a pandemic caused by a novel betacoronavirus called severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Following the rapid spread of COVID-19, many researchers have quickly investigated the diagnosis and treatment of the disease. Distinguishing COVID-19 from the other types of coronaviruses is difficult because of their genetic similarity. In this study, we propose a new, efficient COVID-19 detection method based on the K-nearest neighbors (KNN) classifier, using the complete genome sequences of human coronaviruses in the dataset recorded in the 2019 Novel Coronavirus Resource. We also describe two features based on CpG islands that efficiently detect COVID-19 cases. Thus, a genome sequence of approximately 30,000 nucleotides can be represented by only two real numbers. The KNN method is a simple and effective non-parametric technique for solving classification problems. However, the performance of KNN depends on the distance measure used. We evaluate 19 distance metrics, grouped into five categories, to improve the performance of the KNN algorithm. Several performance measures are computed to evaluate the proposed method. The proposed method achieves 98.4% precision, 99.2% recall, 98.8% F-measure, and 98.4% accuracy in a few seconds when any [Formula: see text] type metric is used as the distance measure in the KNN.
format Online
Article
Text
id pubmed-8064761
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Karabuk University. Publishing services by Elsevier B.V.
record_format MEDLINE/PubMed
spelling pubmed-8064761
2021-04-26
A new COVID-19 detection method from human genome sequences using CpG island features and KNN classifier
Arslan, Hilal
Arslan, Hasan
Engineering Science and Technology, an International Journal
Full Length Article
Karabuk University. Publishing services by Elsevier B.V.
2021-08
2021-01-09
/pmc/articles/PMC8064761/
http://dx.doi.org/10.1016/j.jestch.2020.12.026
Text
en
© 2020 Karabuk University
Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
spellingShingle Full Length Article
Arslan, Hilal
Arslan, Hasan
A new COVID-19 detection method from human genome sequences using CpG island features and KNN classifier
title A new COVID-19 detection method from human genome sequences using CpG island features and KNN classifier
topic Full Length Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8064761/
http://dx.doi.org/10.1016/j.jestch.2020.12.026
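
The description in this record says that each genome sequence is reduced to two real numbers derived from CpG islands and then classified with KNN under a chosen distance measure. The following Python sketch illustrates that general idea only. The record does not say which two features the authors use, so GC content and the observed/expected CpG ratio are assumptions borrowed from common CpG island criteria. Manhattan distance is used purely as one example metric, since the record does not preserve the metric family the authors report. The function names and the toy training values are hypothetical and are not taken from the article.

```python
from collections import Counter


def cpg_features(seq):
    """Map a nucleotide sequence to two real numbers.

    ASSUMPTION: the two features are GC content and the observed/expected
    CpG ratio; the catalog record does not name the authors' features.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return (gc_content, obs_exp)


def manhattan(a, b):
    """Sum of absolute coordinate differences (example metric only)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3, distance=manhattan):
    """Majority vote among the k training points closest to `query`.

    `train` is a list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Synthetic toy values for illustration, not data from the article.
    toy_train = [
        ((0.38, 0.40), "class_a"),
        ((0.39, 0.42), "class_a"),
        ((0.41, 0.60), "class_b"),
        ((0.42, 0.62), "class_b"),
    ]
    print(knn_predict(toy_train, (0.38, 0.41), k=3))
```

Passing the distance function as a parameter makes it possible to swap in other metrics and compare their effect on the classification, which is the kind of comparison the description mentions.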