Coding potential and sequence conservation of SARS-CoV-2 and related animal viruses
In December 2019, a novel human-infecting coronavirus (SARS-CoV-2) was recognized in China. In a few months, SARS-CoV-2 has caused thousands of disease cases and deaths in several countries. Phylogenetic analyses indicated that SARS-CoV-2 clusters with SARS-CoV in the Sarbecovirus subgenus and virus...
Main authors: | Cagliani, Rachele; Forni, Diego; Clerici, Mario; Sironi, Manuela |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier B.V. 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7199688/ https://www.ncbi.nlm.nih.gov/pubmed/32387562 http://dx.doi.org/10.1016/j.meegid.2020.104353 |
_version_ | 1783529195338465280 |
---|---|
author | Cagliani, Rachele Forni, Diego Clerici, Mario Sironi, Manuela |
collection | PubMed |
description | In December 2019, a novel human-infecting coronavirus (SARS-CoV-2) was recognized in China. In a few months, SARS-CoV-2 has caused thousands of disease cases and deaths in several countries. Phylogenetic analyses indicated that SARS-CoV-2 clusters with SARS-CoV in the Sarbecovirus subgenus and viruses related to SARS-CoV-2 were identified from bats and pangolins. Coronaviruses have long and complex genomes with high plasticity in terms of gene content. To date, the coding potential of SARS-CoV-2 remains partially unknown. We thus used available sequences of bat and pangolin viruses to determine the selective events that shaped the genome structure of SARS-CoV-2 and to assess its coding potential. By searching for signals of significantly reduced variability at synonymous sites (dS), we identified six genomic regions, one of these corresponding to the programmed −1 ribosomal frameshift. The most prominent signal of dS reduction was observed within the E gene. A genome-wide analysis of conserved RNA structures indicated that this region harbors a putative functional RNA element that is shared with the SARS-CoV lineage. Additional signals of reduced dS indicated the presence of internal ORFs. Whereas the presence of ORF9a (internal to N) was previously proposed by homology with a well characterized protein of SARS-CoV, ORF3h (for hypothetical, within ORF3a) was not previously described. The predicted product of ORF3h has 90% identity with the corresponding predicted product of SARS-CoV and displays features suggestive of a viroporin. Finally, analysis of the putative ORF10 revealed high dN/dS (3.82) in SARS-CoV-2 and related coronaviruses. In the SARS-CoV lineage, the ORF is predicted to encode a truncated protein and is neutrally evolving. These data suggest that ORF10 encodes a functional protein in SARS-CoV-2 and that positive selection is driving its evolution. Experimental analyses will be necessary to validate and characterize the coding and non-coding functional elements we identified. |
format | Online Article Text |
id | pubmed-7199688 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Elsevier B.V. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7199688 2020-05-06 Coding potential and sequence conservation of SARS-CoV-2 and related animal viruses Cagliani, Rachele Forni, Diego Clerici, Mario Sironi, Manuela Infect Genet Evol Article Elsevier B.V. 2020-09 2020-05-05 /pmc/articles/PMC7199688/ /pubmed/32387562 http://dx.doi.org/10.1016/j.meegid.2020.104353 Text en © 2020 Elsevier B.V. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
title | Coding potential and sequence conservation of SARS-CoV-2 and related animal viruses |
topic | Article |