Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants
Explosively emerging SARS-CoV-2 variants challenge current nomenclature schemes based on genetic diversity and biological significance. Genomic composition-based machine learning methods have recently performed well in identifying phenotype–genotype relationships. We introduced a framework involving...
Main Authors: | Li, Jing; Wu, Ya-Nan; Zhang, Sen; Kang, Xiao-Ping; Jiang, Tao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Problem Solving Protocol |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9116219/ https://www.ncbi.nlm.nih.gov/pubmed/35233612 http://dx.doi.org/10.1093/bib/bbac036 |
_version_ | 1784710072312528896 |
---|---|
author | Li, Jing Wu, Ya-Nan Zhang, Sen Kang, Xiao-Ping Jiang, Tao |
author_facet | Li, Jing Wu, Ya-Nan Zhang, Sen Kang, Xiao-Ping Jiang, Tao |
author_sort | Li, Jing |
collection | PubMed |
description | Explosively emerging SARS-CoV-2 variants challenge current nomenclature schemes based on genetic diversity and biological significance. Genomic composition-based machine learning methods have recently performed well in identifying phenotype–genotype relationships. We introduced a framework involving dinucleotide (DNT) composition representation (DCR) to parse the general human adaptation of RNA viruses and applied a three-dimensional convolutional neural network (3D CNN) analysis to learn the human adaptation of other existing coronaviruses (CoVs) and predict the adaptation of SARS-CoV-2 variants of concern (VOCs). A markedly separable, linear DCR distribution was observed in two major genes—receptor-binding glycoprotein and RNA-dependent RNA polymerase (RdRp)—of six families of single-stranded (ssRNA) viruses. Additionally, there was a general host-specific distribution of both the spike proteins and RdRps of CoVs. The 3D CNN based on spike DCR predicted a dominant type II adaptation of most Beta, Delta and Omicron VOCs, with high transmissibility and low pathogenicity. Type I adaptation with opposite transmissibility and pathogenicity was predicted for SARS-CoV-2 Alpha VOCs (77%) and Kappa variants of interest (58%). The identified adaptive determinants included D1118H and A570D mutations and local DNTs. Thus, the 3D CNN model based on DCR features predicts SARS-CoV-2, a major type II human adaptation and is qualified to predict variant adaptation in real time, facilitating the risk-assessment of emerging SARS-CoV-2 variants and COVID-19 control. |
format | Online Article Text |
id | pubmed-9116219 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-91162192022-05-19 Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants Li, Jing Wu, Ya-Nan Zhang, Sen Kang, Xiao-Ping Jiang, Tao Brief Bioinform Problem Solving Protocol Explosively emerging SARS-CoV-2 variants challenge current nomenclature schemes based on genetic diversity and biological significance. Genomic composition-based machine learning methods have recently performed well in identifying phenotype–genotype relationships. We introduced a framework involving dinucleotide (DNT) composition representation (DCR) to parse the general human adaptation of RNA viruses and applied a three-dimensional convolutional neural network (3D CNN) analysis to learn the human adaptation of other existing coronaviruses (CoVs) and predict the adaptation of SARS-CoV-2 variants of concern (VOCs). A markedly separable, linear DCR distribution was observed in two major genes—receptor-binding glycoprotein and RNA-dependent RNA polymerase (RdRp)—of six families of single-stranded (ssRNA) viruses. Additionally, there was a general host-specific distribution of both the spike proteins and RdRps of CoVs. The 3D CNN based on spike DCR predicted a dominant type II adaptation of most Beta, Delta and Omicron VOCs, with high transmissibility and low pathogenicity. Type I adaptation with opposite transmissibility and pathogenicity was predicted for SARS-CoV-2 Alpha VOCs (77%) and Kappa variants of interest (58%). The identified adaptive determinants included D1118H and A570D mutations and local DNTs. Thus, the 3D CNN model based on DCR features predicts SARS-CoV-2, a major type II human adaptation and is qualified to predict variant adaptation in real time, facilitating the risk-assessment of emerging SARS-CoV-2 variants and COVID-19 control. Oxford University Press 2022-03-02 /pmc/articles/PMC9116219/ /pubmed/35233612 http://dx.doi.org/10.1093/bib/bbac036 Text en © The Author(s) 2022. 
Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Problem Solving Protocol Li, Jing Wu, Ya-Nan Zhang, Sen Kang, Xiao-Ping Jiang, Tao Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants |
title | Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants |
title_full | Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants |
title_fullStr | Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants |
title_full_unstemmed | Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants |
title_short | Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants |
title_sort | deep learning based on biologically interpretable genome representation predicts two types of human adaptation of sars-cov-2 variants |
topic | Problem Solving Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9116219/ https://www.ncbi.nlm.nih.gov/pubmed/35233612 http://dx.doi.org/10.1093/bib/bbac036 |
work_keys_str_mv | AT lijing deeplearningbasedonbiologicallyinterpretablegenomerepresentationpredictstwotypesofhumanadaptationofsarscov2variants AT wuyanan deeplearningbasedonbiologicallyinterpretablegenomerepresentationpredictstwotypesofhumanadaptationofsarscov2variants AT zhangsen deeplearningbasedonbiologicallyinterpretablegenomerepresentationpredictstwotypesofhumanadaptationofsarscov2variants AT kangxiaoping deeplearningbasedonbiologicallyinterpretablegenomerepresentationpredictstwotypesofhumanadaptationofsarscov2variants AT jiangtao deeplearningbasedonbiologicallyinterpretablegenomerepresentationpredictstwotypesofhumanadaptationofsarscov2variants |
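
The description above refers to a dinucleotide (DNT) composition representation of viral genes. As a rough illustration of the general idea only, the sketch below computes plain dinucleotide frequencies for a nucleotide sequence. It is not the authors' DCR method, whose details are not given in this record; the function name `dinucleotide_frequencies`, the overlapping-window counting, and the example sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"
# All 16 possible dinucleotides, in a fixed order (AA, AC, ..., TT).
DINUCLEOTIDES = ["".join(pair) for pair in product(ALPHABET, repeat=2)]


def dinucleotide_frequencies(seq):
    """Return the relative frequency of each of the 16 dinucleotides.

    Hypothetical helper: counts overlapping windows of length 2 and
    skips any window containing a character outside A, C, G, T.
    RNA input is accepted by mapping U to T first.
    """
    seq = seq.upper().replace("U", "T")
    counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in ALPHABET and seq[i + 1] in ALPHABET
    )
    total = sum(counts.values())
    return {d: (counts[d] / total if total else 0.0) for d in DINUCLEOTIDES}


# Made-up toy sequence, not taken from any real genome.
freqs = dinucleotide_frequencies("AUGGCUAAC")
for dnt in DINUCLEOTIDES:
    if freqs[dnt] > 0:
        print(dnt, round(freqs[dnt], 3))
```

A 16-value vector like this could serve as one input feature set for a classifier; how the published framework arranges such values for the 3D CNN is not described here.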