
LineageSpecificSeqgen: generating sequence data with lineage-specific variation in the proportion of variable sites

BACKGROUND: Commonly used phylogenetic models assume a homogeneous evolutionary process throughout the tree. It is known that these homogeneous models are often too simplistic, and that with time some properties of the evolutionary process can change (due to selection or drift). In particular, as co...


Bibliographic Details
Main authors: Shavit Grievink, Liat; Penny, David; Hendy, Mike D; Holland, Barbara R
Format: Text
Language: English
Published: BioMed Central 2008
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2613921/
https://www.ncbi.nlm.nih.gov/pubmed/19021917
http://dx.doi.org/10.1186/1471-2148-8-317
author Shavit Grievink, Liat
Penny, David
Hendy, Mike D
Holland, Barbara R
collection PubMed
description BACKGROUND: Commonly used phylogenetic models assume a homogeneous evolutionary process throughout the tree. It is known that these homogeneous models are often too simplistic, and that with time some properties of the evolutionary process can change (due to selection or drift). In particular, as constraints on sequences evolve, the proportion of variable sites can vary between lineages. This affects the ability of phylogenetic methods to correctly estimate phylogenetic trees, especially for long timescales. To date there is no phylogenetic model that allows for change in the proportion of variable sites, and the degree to which this affects phylogenetic reconstruction is unknown. RESULTS: We present LineageSpecificSeqgen, an extension to the seq-gen program that allows generation of sequences with both changes in the proportion of variable sites and changes in the rate at which sites switch between being variable and invariable. In contrast to seq-gen and its derivatives to date, we interpret branch lengths as the mean number of substitutions per variable site, as opposed to the mean number of substitutions per site (which is averaged over all sites, including invariable sites). This allows specification of the substitution rates of variable sites, independently of the proportion of invariable sites. CONCLUSION: LineageSpecificSeqgen allows simulation of DNA and amino acid sequence alignments under a lineage-specific evolutionary process. The program can be used to test current models of evolution on sequences that have undergone lineage-specific evolution. It facilitates the development of both new methods to identify such processes in real data, and means to account for such processes. The program is available at: http://awcmee.massey.ac.nz/downloads.htm.
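The branch-length convention described in the abstract can be illustrated with a small worked example. This is only a sketch, not code from LineageSpecificSeqgen: the function name and parameters are hypothetical, and it assumes that invariable sites contribute zero substitutions, so the per-site average equals the per-variable-site value scaled by the proportion of variable sites.

```python
import math


def per_site_branch_length(per_variable_site: float, prop_invariable: float) -> float:
    """Convert a branch length in substitutions per variable site into
    substitutions per site, averaged over all sites.

    Hypothetical helper for illustration; assumes invariable sites
    accumulate no substitutions.
    """
    if per_variable_site < 0.0:
        raise ValueError("branch length must be non-negative")
    if not 0.0 <= prop_invariable < 1.0:
        raise ValueError("proportion of invariable sites must be in [0, 1)")
    # Only the variable fraction (1 - prop_invariable) of sites contributes.
    return per_variable_site * (1.0 - prop_invariable)


# Same per-variable-site rate on two lineages with different
# proportions of invariable sites.
lineage_a = per_site_branch_length(0.5, 0.2)
lineage_b = per_site_branch_length(0.5, 0.6)
assert math.isclose(lineage_a, 0.4)
assert math.isclose(lineage_b, 0.2)
```

Under this assumption, two lineages with identical substitution rates at variable sites still differ in their per-site branch lengths when their proportions of invariable sites differ, which is why specifying lengths per variable site keeps the two quantities independent.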
format Text
id pubmed-2613921
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2613921 2009-01-06 BMC Evol Biol Software BioMed Central 2008-11-21 /pmc/articles/PMC2613921/ /pubmed/19021917 http://dx.doi.org/10.1186/1471-2148-8-317 Text en
Copyright ©2008 Grievink et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title LineageSpecificSeqgen: generating sequence data with lineage-specific variation in the proportion of variable sites
topic Software