Identifying featured indels associated with SARS-CoV-2 fitness

As an RNA virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is known for frequent substitution mutations, and substitutions in important genome regions are often associated with viral fitness. However, whether indel mutations are related to viral fitness is generally ignored. Here we develo...

Bibliographic Details
Main Authors: Li, Xiang, Yan, Hongliang, Wong, Gary, Ouyang, Wanli, Cui, Jie
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10580940/
https://www.ncbi.nlm.nih.gov/pubmed/37698427
http://dx.doi.org/10.1128/spectrum.02269-23
author Li, Xiang
Yan, Hongliang
Wong, Gary
Ouyang, Wanli
Cui, Jie
collection PubMed
description As an RNA virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is known for frequent substitution mutations, and substitutions in important genome regions are often associated with viral fitness. However, whether indel mutations are related to viral fitness has generally been ignored. Here we developed a computational methodology to investigate fitness-linked indels occurring in over 9 million SARS-CoV-2 genomes. By analyzing 31,642,404 deletion records and 1,981,308 insertion records, our pipeline identified 26,765 deletion types and 21,054 insertion types and discovered 65 indel types with a significant association with Pango lineages. We proposed the concept of featured indels, which, like substitution mutations, can represent the population of specific Pango lineages and variants, and termed these 65 indels featured indels. The selective pressure on all indel types was assessed using a Bayesian model to explore the importance of indels. Our results showed higher selective pressure on indels, as seen for substitution mutations, which is important for assessing viral fitness and consistent with previous in vitro studies. Evaluation of the growth rate of each viral lineage indicated that indels play key roles in SARS-CoV-2 evolution and deserve as much attention as substitution mutations. IMPORTANCE: The fitness effects of indels in pathogen genome evolution have rarely been studied. We developed a computational methodology to investigate severe acute respiratory syndrome coronavirus 2 genomes and systematically analyzed over 33 million indel records, ultimately proposing the concept of featured indels that can represent specific Pango lineages and identifying 65 featured indels. A machine learning model based on Bayesian inference, together with an evaluation of viral lineage growth rates, suggests that these featured indels exhibit selection pressure comparable to that of replacement mutations. In conclusion, indels are not negligible when evaluating viral fitness.
format Online
Article
Text
id pubmed-10580940
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-10580940 2023-10-18 Identifying featured indels associated with SARS-CoV-2 fitness Li, Xiang Yan, Hongliang Wong, Gary Ouyang, Wanli Cui, Jie Microbiol Spectr Research Article American Society for Microbiology 2023-09-12 /pmc/articles/PMC10580940/ /pubmed/37698427 http://dx.doi.org/10.1128/spectrum.02269-23 Text en Copyright © 2023 Li et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Identifying featured indels associated with SARS-CoV-2 fitness
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10580940/
https://www.ncbi.nlm.nih.gov/pubmed/37698427
http://dx.doi.org/10.1128/spectrum.02269-23