Constructing a meaningful evolutionary average at the phylogenetic center of mass

Bibliographic Details
Main authors: Stone, Eric A; Sidow, Arend
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1919398/
https://www.ncbi.nlm.nih.gov/pubmed/17594490
http://dx.doi.org/10.1186/1471-2105-8-222
_version_ 1782134170596671488
author Stone, Eric A
Sidow, Arend
collection PubMed
description BACKGROUND: As a consequence of the evolutionary process, data collected from related species tend to be similar. This similarity by descent can obscure subtler signals in the data such as the evidence of constraint on variation due to shared selective pressures. In comparative sequence analysis, for example, sequence similarity is often used to illuminate important regions of the genome, but if the comparison is between closely related species, then similarity is the rule rather than the interesting exception. Furthermore, and perhaps worse yet, the contribution of a divergent third species may be masked by the strong similarity between the other two. Here we propose a remedy that weighs the contribution of each species according to its phylogenetic placement.
RESULTS: We first solve the problem of summarizing data related by phylogeny, and we explain why an average should operate on the entire evolutionary trajectory that relates the data. This perspective leads to a new approach in which we define the average in terms of the phylogeny, using the data and a stochastic model to obtain a probability on evolutionary trajectories. With the assumption that the data evolve according to a Brownian motion process on the tree, we show that our evolutionary average can be computed as a convex combination of the species data. Thus, our approach, called the BranchManager, defines both an average and a novel taxon weighting scheme. We compare the BranchManager to two other methods, demonstrating why it exhibits desirable properties. In doing so, we devise a framework for comparison and introduce the concept of a representative point at which the average is situated.
CONCLUSION: The BranchManager uses as its representative point the phylogenetic center of mass, a choice which has both intuitive and practical appeal. Because our average is intrinsic to both the dataset and to the phylogeny, we expect it and its corresponding weighting scheme to be useful in all sorts of studies where interspecies data need to be combined. Obvious applications include evolutionary studies of morphology, physiology or behaviour, but quantitative measures such as sequence hydrophobicity and gene expression level are amenable to our approach as well. Other areas of potential impact include motif discovery and vaccine design. A Java implementation of the BranchManager is available for download, as is a script written in the statistical language R.
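The abstract's point that a tree-aware average is a weighted combination of the species data can be illustrated with a small sketch. This is not the BranchManager itself, whose weights are not specified in this record. It uses a different, standard technique instead: the generalized least squares estimate of the root value under Brownian motion, with weights proportional to the row sums of the inverse covariance matrix. The three-species tree ((A:0.1, B:0.1):0.9, C:1.0), the trait values, and all function names are hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def brownian_root_weights(cov):
    """Weights proportional to C^{-1} 1, normalized to sum to one.

    Assumption: cov[i][j] is the branch length shared by species i and j
    on the path from the root, as implied by Brownian motion on the tree.
    """
    u = solve(cov, [1.0] * len(cov))
    total = sum(u)
    return [x / total for x in u]


def weighted_average(weights, values):
    """Combine species values using the given weights."""
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical tree ((A:0.1, B:0.1):0.9, C:1.0): A and B share 0.9 of their history.
cov = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
values = [1.0, 1.0, 4.0]  # made-up trait values for A, B, C

weights = brownian_root_weights(cov)
naive_mean = sum(values) / len(values)
tree_mean = weighted_average(weights, values)
print("weights:", weights)
print("naive mean:", naive_mean, "tree-aware mean:", tree_mean)
```

In this toy case the closely related pair A and B share most of their weight, so the divergent species C counts for more than the one third it would receive in an unweighted mean.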
format Text
id pubmed-1919398
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1919398 2007-07-14 Constructing a meaningful evolutionary average at the phylogenetic center of mass Stone, Eric A Sidow, Arend BMC Bioinformatics Research Article BioMed Central 2007-06-26 /pmc/articles/PMC1919398/ /pubmed/17594490 http://dx.doi.org/10.1186/1471-2105-8-222 Text en Copyright © 2007 Stone and Sidow; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Constructing a meaningful evolutionary average at the phylogenetic center of mass
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1919398/
https://www.ncbi.nlm.nih.gov/pubmed/17594490
http://dx.doi.org/10.1186/1471-2105-8-222