Genome-wide identification and prediction of SARS-CoV-2 mutations show an abundance of variants: Integrated study of bioinformatics and deep neural learning
Genomic data analysis is a fundamental system for monitoring pathogen evolution and the outbreak of infectious diseases. Based on bioinformatics and deep learning, this study was designed to identify the genomic variability of SARS-CoV-2 worldwide and predict the impending mutation rate. Analysis of...
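Main authors: | Hossain, Md Shahadat; Pathan, A.Q.M. Sala Uddin; Islam, Md Nur; Tonmoy, Mahafujul Islam Quadery; Rakib, Mahmudul Islam; Munim, Md Adnan; Saha, Otun; Fariha, Atqiya; Reza, Hasan Al; Roy, Maitreyee; Bahadur, Newaz Mohammed; Rahaman, Md Mizanur |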
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Authors. Published by Elsevier Ltd. 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8598266/ https://www.ncbi.nlm.nih.gov/pubmed/34812411 http://dx.doi.org/10.1016/j.imu.2021.100798 |
_version_ | 1784600782953250816 |
---|---|
author | Hossain, Md Shahadat; Pathan, A.Q.M. Sala Uddin; Islam, Md Nur; Tonmoy, Mahafujul Islam Quadery; Rakib, Mahmudul Islam; Munim, Md Adnan; Saha, Otun; Fariha, Atqiya; Reza, Hasan Al; Roy, Maitreyee; Bahadur, Newaz Mohammed; Rahaman, Md Mizanur |
author_facet | Hossain, Md Shahadat; Pathan, A.Q.M. Sala Uddin; Islam, Md Nur; Tonmoy, Mahafujul Islam Quadery; Rakib, Mahmudul Islam; Munim, Md Adnan; Saha, Otun; Fariha, Atqiya; Reza, Hasan Al; Roy, Maitreyee; Bahadur, Newaz Mohammed; Rahaman, Md Mizanur |
author_sort | Hossain, Md Shahadat |
collection | PubMed |
description | Genomic data analysis is fundamental to monitoring pathogen evolution and outbreaks of infectious diseases. Based on bioinformatics and deep learning, this study was designed to identify the genomic variability of SARS-CoV-2 worldwide and predict the impending mutation rate. Analysis of 259,044 SARS-CoV-2 isolates identified 3,334,545 mutations, with an average of 14.01 mutations per isolate. Globally, single nucleotide polymorphism (SNP) is the most prevalent mutational event. C > T (52.67%) was the major alteration across the world, followed by G > T (14.59%) and A > G (11.13%). Strains from India showed the highest number of mutations (48), followed by Scotland, USA, Netherlands, Norway, and France, with up to 36 mutations. D416G, F106F, P314L, UTR:C241T, L93L, A222V, A199A, V30L, and A220V were the most frequent mutations. The missense mutations D1118H, S194L, R262H, M809L, P314L, A8D, S220G, A890D, G1433C, T1456I, R233C, F263S, L111K, A54T, A74V, L183A, A316T, V212F, L46C, V48G, Q57H, W131R, G172V, Q185H, and Y206S were found to largely decrease the structural stability of the corresponding proteins. Conversely, D3L, L5F, and S97I were found to largely increase the structural stability of the corresponding proteins. The multi-nucleotide mutations GGG > AAC, CC > TT, TG > CA, and AT > TA appeared in our analysis among the top 20 mutations. Future mutation rate analysis predicts increments of 17%, 7%, and 3% for C > T, A > G, and A > T, respectively. Conversely, decrements of 7%, 7%, and 6% are estimated for T > C, G > A, and G > T mutations, respectively. T > G\A, C > G\A, and A > T\C are not anticipated in the future. Since SARS-CoV-2 is mutating continuously, our findings will facilitate the tracking of mutations and help to map the progression of COVID-19 intensity worldwide. |
format | Online Article Text |
id | pubmed-8598266 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | The Authors. Published by Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-85982662021-11-18 Genome-wide identification and prediction of SARS-CoV-2 mutations show an abundance of variants: Integrated study of bioinformatics and deep neural learning Hossain, Md Shahadat Pathan, A.Q.M. Sala Uddin Islam, Md Nur Tonmoy, Mahafujul Islam Quadery Rakib, Mahmudul Islam Munim, Md Adnan Saha, Otun Fariha, Atqiya Reza, Hasan Al Roy, Maitreyee Bahadur, Newaz Mohammed Rahaman, Md Mizanur Inform Med Unlocked Article Genomic data analysis is a fundamental system for monitoring pathogen evolution and the outbreak of infectious diseases. Based on bioinformatics and deep learning, this study was designed to identify the genomic variability of SARS-CoV-2 worldwide and predict the impending mutation rate. Analysis of 259044 SARS-CoV-2 isolates identified 3334545 mutations with an average of 14.01 mutations per isolate. Globally, single nucleotide polymorphism (SNP) is the most prevalent mutational event. The prevalence of C > T (52.67%) was noticed as a major alteration across the world followed by the G > T (14.59%) and A > G (11.13%). Strains from India showed the highest number of mutations (48) followed by Scotland, USA, Netherlands, Norway, and France having up to 36 mutations. D416G, F106F, P314L, UTR:C241T, L93L, A222V, A199A, V30L, and A220V mutations were found as the most frequent mutations. D1118H, S194L, R262H, M809L, P314L, A8D, S220G, A890D, G1433C, T1456I, R233C, F263S, L111K, A54T, A74V, L183A, A316T, V212F, L46C, V48G, Q57H, W131R, G172V, Q185H, and Y206S missense mutations were found to largely decrease the structural stability of the corresponding proteins. Conversely, D3L, L5F, and S97I were found to largely increase the structural stability of the corresponding proteins. Multi-nucleotide mutations GGG > AAC, CC > TT, TG > CA, and AT > TA have come up in our analysis which are in the top 20 mutational cohort. 
Future mutation rate analysis predicts a 17%, 7%, and 3% increment of C > T, A > G, and A > T, respectively in the future. Conversely, 7%, 7%, and 6% decrement is estimated for T > C, G > A, and G > T mutations, respectively. T > G\A, C > G\A, and A > T\C are not anticipated in the future. Since SARS-CoV-2 is mutating continuously, our findings will facilitate the tracking of mutations and help to map the progression of the COVID-19 intensity worldwide. The Authors. Published by Elsevier Ltd. 2021 2021-11-18 /pmc/articles/PMC8598266/ /pubmed/34812411 http://dx.doi.org/10.1016/j.imu.2021.100798 Text en © 2022 The Authors Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Hossain, Md Shahadat Pathan, A.Q.M. Sala Uddin Islam, Md Nur Tonmoy, Mahafujul Islam Quadery Rakib, Mahmudul Islam Munim, Md Adnan Saha, Otun Fariha, Atqiya Reza, Hasan Al Roy, Maitreyee Bahadur, Newaz Mohammed Rahaman, Md Mizanur Genome-wide identification and prediction of SARS-CoV-2 mutations show an abundance of variants: Integrated study of bioinformatics and deep neural learning |
title | Genome-wide identification and prediction of SARS-CoV-2 mutations show an abundance of variants: Integrated study of bioinformatics and deep neural learning |
title_full | Genome-wide identification and prediction of SARS-CoV-2 mutations show an abundance of variants: Integrated study of bioinformatics and deep neural learning |
title_fullStr | Genome-wide identification and prediction of SARS-CoV-2 mutations show an abundance of variants: Integrated study of bioinformatics and deep neural learning |
title_full_unstemmed | Genome-wide identification and prediction of SARS-CoV-2 mutations show an abundance of variants: Integrated study of bioinformatics and deep neural learning |
title_short | Genome-wide identification and prediction of SARS-CoV-2 mutations show an abundance of variants: Integrated study of bioinformatics and deep neural learning |
title_sort | genome-wide identification and prediction of sars-cov-2 mutations show an abundance of variants: integrated study of bioinformatics and deep neural learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8598266/ https://www.ncbi.nlm.nih.gov/pubmed/34812411 http://dx.doi.org/10.1016/j.imu.2021.100798 |
work_keys_str_mv | AT hossainmdshahadat genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT pathanaqmsalauddin genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT islammdnur genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT tonmoymahafujulislamquadery genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT rakibmahmudulislam genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT munimmdadnan genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT sahaotun genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT farihaatqiya genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT rezahasanal genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT roymaitreyee genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT bahadurnewazmohammed genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning AT rahamanmdmizanur genomewideidentificationandpredictionofsarscov2mutationsshowanabundanceofvariantsintegratedstudyofbioinformaticsanddeepneurallearning |
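
The description field reports the share of each single-nucleotide substitution class (for example C > T at 52.67%). As a rough illustration of how such shares can be tallied, the sketch below counts substitution classes in a list of (reference, alternate) allele pairs. It is not the authors' pipeline: the function name `substitution_shares` and the toy input are hypothetical, and the input is assumed to already be a list of called variants.

```python
from collections import Counter


def substitution_shares(variants):
    """Return {"REF>ALT": percent} for single-nucleotide substitutions.

    `variants` is an iterable of (ref, alt) allele strings. Multi-nucleotide
    events and no-op pairs (ref == alt) are skipped, so the percentages are
    relative to single-nucleotide substitutions only.
    """
    counts = Counter(
        f"{ref}>{alt}"
        for ref, alt in variants
        if len(ref) == 1 and len(alt) == 1 and ref != alt
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders classes from most to least frequent
    return {kind: 100.0 * n / total for kind, n in counts.most_common()}


# Toy input invented for illustration; not data from the study.
toy_variants = [("C", "T"), ("C", "T"), ("G", "T"), ("A", "G"), ("GGG", "AAC")]
shares = substitution_shares(toy_variants)
for kind, pct in shares.items():
    print(f"{kind}: {pct:.1f}%")
```

Multi-nucleotide events such as GGG > AAC are excluded here by design; counting them separately would need a second tally keyed on the full allele strings.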