Investigating selection on viruses: a statistical alignment approach

Bibliographic Details
Main authors: de Groot, Saskia; Mailund, Thomas; Lunter, Gerton; Hein, Jotun
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2478691/
https://www.ncbi.nlm.nih.gov/pubmed/18616801
http://dx.doi.org/10.1186/1471-2105-9-304
author de Groot, Saskia
Mailund, Thomas
Lunter, Gerton
Hein, Jotun
collection PubMed
description BACKGROUND: Two problems complicate the study of selection in viral genomes. Firstly, the presence of genes in overlapping reading frames implies that selection in one reading frame can bias our estimates of neutral mutation rates in another reading frame. Secondly, the high mutation rates we are likely to encounter complicate the inference of a reliable alignment of genomes. To address these issues, we develop a model that explicitly accounts for selection in overlapping reading frames. We then integrate this model into a statistical alignment framework, enabling us to estimate selection while explicitly dealing with the uncertainty of individual alignments. We show that in this way we obtain unbiased selection parameters for different genomic regions of interest, and can improve accuracy compared with using a fixed alignment. RESULTS: We run a series of simulation studies to gauge how well our method estimates selection, especially in comparison to the use of a fixed alignment. We show that the standard practice of using a ClustalW alignment can lead to considerable biases and that estimation accuracy increases substantially when explicitly integrating over the uncertainty in inferred alignments. We even manage to compete favourably for general evolutionary distances with an alignment produced by GenAl. We subsequently run our method on HIV2 and Hepatitis B sequences. CONCLUSION: We propose that marginalizing over all alignments, as opposed to using a fixed one, should be considered in any parametric inference from divergent sequence data for which the alignments are not known with certainty. Moreover, we discover in HIV2 that double coding regions appear to be under less stringent selection than single coding ones. Additionally, there appears to be evidence for differential selection, where one overlapping reading frame is under positive and the other under negative selection.
format Text
id pubmed-2478691
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2478691 2008-07-22 Investigating selection on viruses: a statistical alignment approach. de Groot, Saskia; Mailund, Thomas; Lunter, Gerton; Hein, Jotun. BMC Bioinformatics. Research Article. BioMed Central 2008-07-10 /pmc/articles/PMC2478691/ /pubmed/18616801 http://dx.doi.org/10.1186/1471-2105-9-304 Text en Copyright © 2008 de Groot et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Investigating selection on viruses: a statistical alignment approach
topic Research Article