FRESCo: finding regions of excess synonymous constraint in diverse viruses

BACKGROUND: The increasing availability of sequence data for many viruses provides power to detect regions under unusual evolutionary constraint at a high resolution. One approach leverages the synonymous substitution rate as a signature to pinpoint genic regions encoding overlapping or embedded fun...

Bibliographic Details
Main authors: Sealfon, Rachel S; Lin, Michael F; Jungreis, Irwin; Wolf, Maxim Y; Kellis, Manolis; Sabeti, Pardis C
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects: Method
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4376164/
https://www.ncbi.nlm.nih.gov/pubmed/25853568
http://dx.doi.org/10.1186/s13059-015-0603-7
author Sealfon, Rachel S
Lin, Michael F
Jungreis, Irwin
Wolf, Maxim Y
Kellis, Manolis
Sabeti, Pardis C
collection PubMed
description BACKGROUND: The increasing availability of sequence data for many viruses provides power to detect regions under unusual evolutionary constraint at a high resolution. One approach leverages the synonymous substitution rate as a signature to pinpoint genic regions encoding overlapping or embedded functional elements. Protein-coding regions in viral genomes often contain overlapping RNA structural elements, reading frames, regulatory elements, microRNAs, and packaging signals. Synonymous substitutions in these regions would be selectively disfavored and thus these regions are characterized by excess synonymous constraint. Codon choice can also modulate transcriptional efficiency, translational accuracy, and protein folding. RESULTS: We developed a phylogenetic codon model-based framework, FRESCo, designed to find regions of excess synonymous constraint in short, deep alignments, such as individual viral genes across many sequenced isolates. We demonstrated the high specificity of our approach on simulated data and applied our framework to the protein-coding regions of approximately 30 distinct species of viruses with diverse genome architectures. CONCLUSIONS: FRESCo recovers known multifunctional regions in well-characterized viruses such as hepatitis B virus, poliovirus, and West Nile virus, often at a single-codon resolution, and predicts many novel functional elements overlapping viral genes, including in Lassa and Ebola viruses. In a number of viruses, the synonymously constrained regions that we identified also display conserved, stable predicted RNA structures, including putative novel elements in multiple viral species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-015-0603-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4376164
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4376164 2015-03-28 Genome Biol Method BioMed Central 2015-02-17 2015 /pmc/articles/PMC4376164/ /pubmed/25853568 http://dx.doi.org/10.1186/s13059-015-0603-7 Text en © Sealfon et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title FRESCo: finding regions of excess synonymous constraint in diverse viruses
topic Method
work_keys_str_mv AT sealfonrachels frescofindingregionsofexcesssynonymousconstraintindiverseviruses
AT linmichaelf frescofindingregionsofexcesssynonymousconstraintindiverseviruses
AT jungreisirwin frescofindingregionsofexcesssynonymousconstraintindiverseviruses
AT wolfmaximy frescofindingregionsofexcesssynonymousconstraintindiverseviruses
AT kellismanolis frescofindingregionsofexcesssynonymousconstraintindiverseviruses
AT sabetipardisc frescofindingregionsofexcesssynonymousconstraintindiverseviruses