Evaluating the effect of SARS-CoV-2 spike mutations with a linear doubly robust learner

Driven by various mutations on the viral Spike protein, diverse variants of SARS-CoV-2 have emerged and prevailed repeatedly, significantly prolonging the pandemic. This phenomenon necessitates the identification of key Spike mutations for fitness enhancement. To address the need, this manuscript fo...

Bibliographic Details
Main authors: Wang, Xin, Hu, Mingda, Liu, Bo, Xu, Huifang, Jin, Yuan, Wang, Boqian, Zhao, Yunxiang, Wu, Jun, Yue, Junjie, Ren, Hongguang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Cellular and Infection Microbiology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10154619/
https://www.ncbi.nlm.nih.gov/pubmed/37153142
http://dx.doi.org/10.3389/fcimb.2023.1161445
author Wang, Xin
Hu, Mingda
Liu, Bo
Xu, Huifang
Jin, Yuan
Wang, Boqian
Zhao, Yunxiang
Wu, Jun
Yue, Junjie
Ren, Hongguang
collection PubMed
description Driven by various mutations in the viral Spike protein, diverse variants of SARS-CoV-2 have emerged and prevailed repeatedly, significantly prolonging the pandemic. This phenomenon makes it necessary to identify the key Spike mutations that enhance fitness. To address this need, this manuscript formulates a well-defined framework of causal inference methods for evaluating and identifying Spike mutations that are key to the viral fitness of SARS-CoV-2. Applied to large-scale SARS-CoV-2 genome data, it estimates the statistical contribution of mutations to viral fitness across lineages and thereby identifies important mutations. Further, computational methods validate that the identified key mutations have functional effects on Spike stability, receptor-binding affinity, and potential for immune escape. Based on the effect score of each mutation, individual key fitness-enhancing mutations such as D614G and T478K are identified and studied. Moving from individual mutations to protein domains, the paper identifies key regions of the Spike protein, including the receptor-binding domain and the N-terminal domain. The study further investigates viral fitness via mutational effect scores, which allows the fitness score of different SARS-CoV-2 strains to be computed and their transmission capacity to be predicted based solely on their viral sequence. This prediction of viral fitness has been validated using BA.2.12.1, which was not used for regression training but fits the prediction well. To the best of our knowledge, this is the first study to apply causal inference models to mutational analysis on large-scale genomes of SARS-CoV-2. Our findings produce innovative and systematic insights into SARS-CoV-2 and promote functional studies of its key mutations, serving as reliable guidance about mutations of interest.
format Online
Article
Text
id pubmed-10154619
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10154619 2023-05-04 Evaluating the effect of SARS-CoV-2 spike mutations with a linear doubly robust learner Wang, Xin Hu, Mingda Liu, Bo Xu, Huifang Jin, Yuan Wang, Boqian Zhao, Yunxiang Wu, Jun Yue, Junjie Ren, Hongguang Front Cell Infect Microbiol Cellular and Infection Microbiology Frontiers Media S.A. 2023-04-19 /pmc/articles/PMC10154619/ /pubmed/37153142 http://dx.doi.org/10.3389/fcimb.2023.1161445 Text en Copyright © 2023 Wang, Hu, Liu, Xu, Jin, Wang, Zhao, Wu, Yue and Ren https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Evaluating the effect of SARS-CoV-2 spike mutations with a linear doubly robust learner
topic Cellular and Infection Microbiology