Deep Learning for SARS COV-2 Genome Sequences
The SARS-CoV-2 virus, which originated in Wuhan, China, has since spread throughout the world and is affecting millions of people. When a novel virus outbreak occurs, it is crucial to determine quickly whether the epidemic is caused by the novel virus or by a well-known one. We propose a deep learning a...
Format: | Online Article Text |
---|---|
Language: | English |
Published: | IEEE 2021 |
Subjects: | Computational and artificial intelligence |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545213/ https://www.ncbi.nlm.nih.gov/pubmed/34812391 http://dx.doi.org/10.1109/ACCESS.2021.3073728 |
_version_ | 1784589969675780096 |
---|---|
collection | PubMed |
description | The SARS-CoV-2 virus, which originated in Wuhan, China, has since spread throughout the world and is affecting millions of people. When a novel virus outbreak occurs, it is crucial to determine quickly whether the epidemic is caused by the novel virus or by a well-known one. We propose a deep learning algorithm that combines a convolutional neural network (CNN) with a bi-directional long short-term memory (Bi-LSTM) neural network to classify the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) amongst coronaviruses. We also classify whether or not a genome sequence contains candidate regulatory motifs. Regulatory motifs are binding sites for transcription factors, which are responsible for the expression of genes. The experimental results show that, at peak performance, the proposed CNN-Bi-LSTM model achieves a classification accuracy of 99.95%, an area under the receiver operating characteristic curve (AUC ROC) of 100.00%, a specificity of 99.97%, a sensitivity of 99.97%, a Cohen's Kappa of 0.9978, and a Matthews Correlation Coefficient (MCC) of 0.9978 for the classification of SARS-CoV-2 amongst coronaviruses. The CNN-Bi-LSTM also detects whether a sequence has candidate regulatory motifs or binding sites with a classification accuracy of 99.76%, an AUC ROC of 100.00%, a specificity of 99.76%, a sensitivity of 99.76%, an MCC of 0.9980, and a Cohen's Kappa of 0.9970 at peak performance. These results are encouraging enough to recognise deep learning algorithms as alternative avenues for detecting SARS-CoV-2 and for detecting regulatory motifs in SARS-CoV-2 genes. |
format | Online Article Text |
id | pubmed-8545213 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | IEEE |
record_format | MEDLINE/PubMed |
spelling | pubmed-8545213 2021-11-18 Deep Learning for SARS COV-2 Genome Sequences IEEE Access Computational and artificial intelligence IEEE 2021-04-16 /pmc/articles/PMC8545213/ /pubmed/34812391 http://dx.doi.org/10.1109/ACCESS.2021.3073728 Text en This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Computational and artificial intelligence Deep Learning for SARS COV-2 Genome Sequences |
title | Deep Learning for SARS COV-2 Genome Sequences |
title_full | Deep Learning for SARS COV-2 Genome Sequences |
title_fullStr | Deep Learning for SARS COV-2 Genome Sequences |
title_full_unstemmed | Deep Learning for SARS COV-2 Genome Sequences |
title_short | Deep Learning for SARS COV-2 Genome Sequences |
title_sort | deep learning for sars cov-2 genome sequences |
topic | Computational and artificial intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545213/ https://www.ncbi.nlm.nih.gov/pubmed/34812391 http://dx.doi.org/10.1109/ACCESS.2021.3073728 |
work_keys_str_mv | AT deeplearningforsarscov2genomesequences AT deeplearningforsarscov2genomesequences |