Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning

ABSTRACT: The high spread rate of SARS-CoV-2 virus has put the researchers all over the world in a demanding situation. The need of the hour is to develop novel learning algorithms that can effectively learn a general pattern by training with fewer genome sequences of coronavirus. Learning from very...

Bibliographic Details
Main authors: Harikrishnan, N. B.; Pranay, S. Y.; Nagaraj, Nithin
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9170350/
https://www.ncbi.nlm.nih.gov/pubmed/35668230
http://dx.doi.org/10.1007/s11517-022-02591-3
_version_ 1784721404435890176
author Harikrishnan, N. B.
Pranay, S. Y.
Nagaraj, Nithin
author_facet Harikrishnan, N. B.
Pranay, S. Y.
Nagaraj, Nithin
author_sort Harikrishnan, N. B.
collection PubMed
description ABSTRACT: The high spread rate of the SARS-CoV-2 virus has put researchers all over the world in a demanding situation. The need of the hour is to develop novel learning algorithms that can effectively learn a general pattern by training with fewer genome sequences of coronavirus. Learning from very few training samples is necessary and important at the beginning of a disease outbreak, when sequencing data is limited, because successful detection and isolation of patients can curb the spread of the virus. However, this poses a huge challenge for machine learning and deep learning algorithms, as they require large amounts of training data to learn the pattern and distinguish it from other closely related viruses. In this paper, we propose a new paradigm, Neurochaos Learning (NL), for the classification of coronavirus genome sequences that addresses this specific problem. NL is inspired by the empirical evidence of chaos and non-linearity at the level of neurons in biological neural networks. The average sensitivity, specificity and accuracy for NL are 0.998, 0.999 and 0.998, respectively, for the multiclass classification problem (SARS-CoV-2, Coronaviridae, Metapneumovirus, Rhinovirus and Influenza) using leave-one-out cross-validation. With just one training sample per class for 1000 independent random trials of training, we report an average macro F1-score [Formula: see text] for the classification of SARS-CoV-2 from SARS-CoV-1 genome sequences. We compare the performance of NL with K-nearest neighbours (KNN), logistic regression, random forest, SVM, and naïve Bayes classifiers. We foresee promising future applications in genome classification using NL with novel combinations of chaotic feature engineering and other machine learning algorithms. GRAPHICAL ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11517-022-02591-3.
format Online
Article
Text
id pubmed-9170350
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-91703502022-06-07 Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning Harikrishnan, N. B. Pranay, S. Y. Nagaraj, Nithin Med Biol Eng Comput Original Article ABSTRACT: The high spread rate of SARS-CoV-2 virus has put the researchers all over the world in a demanding situation. The need of the hour is to develop novel learning algorithms that can effectively learn a general pattern by training with fewer genome sequences of coronavirus. Learning from very few training samples is necessary and important during the beginning of a disease outbreak when sequencing data is limited. This is because a successful detection and isolation of patients can curb the spread of the virus. However, this poses a huge challenge for machine learning and deep learning algorithms as they require huge amounts of training data to learn the pattern and distinguish from other closely related viruses. In this paper, we propose a new paradigm – Neurochaos Learning (NL) for classification of coronavirus genome sequence that addresses this specific problem. NL is inspired from the empirical evidence of chaos and non-linearity at the level of neurons in biological neural networks. The average sensitivity, specificity and accuracy for NL are 0.998, 0.999 and 0.998 respectively for the multiclass classification problem (SARS-CoV-2, Coronaviridae, Metapneumovirus, Rhinovirus and Influenza) using leave one out crossvalidation. With just one training sample per class for 1000 independent random trials of training, we report an average macro F1-score [Formula: see text] for the classification of SARS-CoV-2 from SARS-CoV-1 genome sequences. We compare the performance of NL with K-nearest neighbours (KNN), logistic regression, random forest, SVM, and naïve Bayes classifiers. We foresee promising future applications in genome classification using NL with novel combinations of chaotic feature engineering and other machine learning algorithms. 
GRAPHICAL ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11517-022-02591-3. Springer Berlin Heidelberg 2022-06-07 2022 /pmc/articles/PMC9170350/ /pubmed/35668230 http://dx.doi.org/10.1007/s11517-022-02591-3 Text en © International Federation for Medical and Biological Engineering 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Original Article
Harikrishnan, N. B.
Pranay, S. Y.
Nagaraj, Nithin
Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning
title Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning
title_full Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning
title_fullStr Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning
title_full_unstemmed Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning
title_short Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning
title_sort classification of sars-cov-2 viral genome sequences using neurochaos learning
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9170350/
https://www.ncbi.nlm.nih.gov/pubmed/35668230
http://dx.doi.org/10.1007/s11517-022-02591-3