A new COVID-19 detection method from human genome sequences using CpG island features and KNN classifier
Various viral epidemics have been detected such as the severe acute respiratory syndrome coronavirus and the Middle East respiratory syndrome coronavirus in the last two decades. The coronavirus disease 2019 (COVID-19) is a pandemic caused by a novel betacoronavirus called severe acute respiratory s...
Main Authors: | Arslan, Hilal, Arslan, Hasan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Karabuk University. Publishing services by Elsevier B.V. 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8064761/ http://dx.doi.org/10.1016/j.jestch.2020.12.026 |
_version_ | 1783682203470790656 |
---|---|
author | Arslan, Hilal Arslan, Hasan |
author_facet | Arslan, Hilal Arslan, Hasan |
author_sort | Arslan, Hilal |
collection | PubMed |
description | Several viral epidemics have emerged in the last two decades, including those caused by the severe acute respiratory syndrome coronavirus and the Middle East respiratory syndrome coronavirus. The coronavirus disease 2019 (COVID-19) is a pandemic caused by a novel betacoronavirus called severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). After the rapid spread of COVID-19, many researchers quickly began investigating diagnosis and treatment for this disease. Distinguishing COVID-19 from the other types of coronaviruses is a difficult problem due to their genetic similarity. In this study, we propose a new efficient COVID-19 detection method based on the K-nearest neighbors (KNN) classifier using the complete genome sequences of human coronaviruses in the dataset recorded in 2019 Novel Coronavirus Resource. We also describe two features based on CpG islands that efficiently detect COVID-19 cases. Thus, genome sequences of approximately 30,000 nucleotides can each be represented by only two real numbers. The KNN method is a simple and effective non-parametric technique for solving classification problems. However, the performance of KNN depends on the distance measure used. We evaluate 19 distance metrics, grouped into five categories, to improve the performance of the KNN algorithm. Several performance measures are computed to evaluate the proposed method. The proposed method achieves 98.4% precision, 99.2% recall, 98.8% F-measure, and 98.4% accuracy in a few seconds when any [Formula: see text] type metric is used as a distance measure in the KNN. |
format | Online Article Text |
id | pubmed-8064761 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Karabuk University. Publishing services by Elsevier B.V. |
record_format | MEDLINE/PubMed |
spelling | pubmed-80647612021-04-26 A new COVID-19 detection method from human genome sequences using CpG island features and KNN classifier Arslan, Hilal Arslan, Hasan Engineering Science and Technology, an International Journal Full Length Article Various viral epidemics have been detected such as the severe acute respiratory syndrome coronavirus and the Middle East respiratory syndrome coronavirus in the last two decades. The coronavirus disease 2019 (COVID-19) is a pandemic caused by a novel betacoronavirus called severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). After the rapid spread of COVID-19, many researchers have investigated diagnosis and treatment for this terrifying disease quickly. Identifying COVID-19 from the other types of coronaviruses is a difficult problem due to their genetic similarity. In this study, we propose a new efficient COVID-19 detection method based on the K-nearest neighbors (KNN) classifier using the complete genome sequences of human coronaviruses in the dataset recorded in 2019 Novel Coronavirus Resource. We also describe two features based on CpG island that efficiently detect COVID-19 cases. Thus, genome sequences including approximately 30,000 nucleotides can be represented by only two real numbers. The KNN method is a simple and effective non-parametric technique for solving classification problems. However, performance of the KNN depends on the distance measure used. We perform 19 distance metrics investigated in five categories to improve the performance of the KNN algorithm. Some efficient performance parameters are computed to evaluate the proposed method. The proposed method achieves 98.4% precision, 99.2% recall, 98.8% F-measure, and 98.4% accuracy in a few seconds when any [Formula: see text] type metric is used as a distance measure in the KNN. Karabuk University. Publishing services by Elsevier B.V. 
2021-08 2021-01-09 /pmc/articles/PMC8064761/ http://dx.doi.org/10.1016/j.jestch.2020.12.026 Text en © 2020 Karabuk University Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Full Length Article Arslan, Hilal Arslan, Hasan A new COVID-19 detection method from human genome sequences using CpG island features and KNN classifier |
title | A new COVID-19 detection method from human genome sequences using CpG island features and KNN classifier |
title_sort | new covid-19 detection method from human genome sequences using cpg island features and knn classifier |
topic | Full Length Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8064761/ http://dx.doi.org/10.1016/j.jestch.2020.12.026 |
work_keys_str_mv | AT arslanhilal anewcovid19detectionmethodfromhumangenomesequencesusingcpgislandfeaturesandknnclassifier AT arslanhasan anewcovid19detectionmethodfromhumangenomesequencesusingcpgislandfeaturesandknnclassifier AT arslanhilal newcovid19detectionmethodfromhumangenomesequencesusingcpgislandfeaturesandknnclassifier AT arslanhasan newcovid19detectionmethodfromhumangenomesequencesusingcpgislandfeaturesandknnclassifier |
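
The pipeline summarized in the description field (reduce each genome to two real-valued features, classify with KNN under a chosen distance, then report precision, recall, F-measure, and accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation. The record does not state which two CpG island features are used, so the sketch assumes GC content and the observed/expected CpG ratio, the quantities commonly used to define CpG islands. Manhattan distance is chosen only as one example of a distance measure, since the record elides which metric family gave the reported results. All function names are hypothetical.

```python
from collections import Counter


def cpg_features(seq):
    """Map a nucleotide string to two real numbers.

    ASSUMPTION: the two features are GC content and the observed/expected
    CpG ratio; the record does not say which features the paper defines.
    """
    s = seq.upper()
    n = len(s)
    c, g = s.count("C"), s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def manhattan(a, b):
    """Sum of absolute coordinate differences (one example distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3, dist=manhattan):
    """Majority vote among the k training points nearest to `query`.

    `train` is a list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def binary_scores(y_true, y_pred, positive):
    """Precision, recall, F-measure, and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }
```

In use, each labeled genome would be passed through `cpg_features` to build the `train` list, and `dist` could be swapped for any other distance function to compare metrics, which is the kind of comparison the description attributes to the study.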