A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities
BACKGROUND: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence al...
Main authors: | Bastien, Olivier; Ortet, Philippe; Roy, Sylvaine; Maréchal, Eric |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2005 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC555736/ https://www.ncbi.nlm.nih.gov/pubmed/15757521 http://dx.doi.org/10.1186/1471-2105-6-49 |
_version_ | 1782122548062846976 |
---|---|
author | Bastien, Olivier Ortet, Philippe Roy, Sylvaine Maréchal, Eric |
author_facet | Bastien, Olivier Ortet, Philippe Roy, Sylvaine Maréchal, Eric |
author_sort | Bastien, Olivier |
collection | PubMed |
description | BACKGROUND: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. RESULTS: We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. CONCLUSION: The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionary consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology, provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations. |
format | Text |
id | pubmed-555736 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-555736 2005-04-01 A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities Bastien, Olivier Ortet, Philippe Roy, Sylvaine Maréchal, Eric BMC Bioinformatics Research Article BACKGROUND: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. RESULTS: We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. CONCLUSION: The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionary consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology, provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations. BioMed Central 2005-03-10 /pmc/articles/PMC555736/ /pubmed/15757521 http://dx.doi.org/10.1186/1471-2105-6-49 Text en Copyright © 2005 Bastien et al; licensee BioMed Central Ltd. |
spellingShingle | Research Article Bastien, Olivier Ortet, Philippe Roy, Sylvaine Maréchal, Eric A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities |
title | A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities |
title_full | A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities |
title_fullStr | A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities |
title_full_unstemmed | A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities |
title_short | A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities |
title_sort | configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise z-score probabilities |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC555736/ https://www.ncbi.nlm.nih.gov/pubmed/15757521 http://dx.doi.org/10.1186/1471-2105-6-49 |
work_keys_str_mv | AT bastienolivier aconfigurationspaceofhomologousproteinsconservingmutualinformationandallowingaphylogenyinferencebasedonpairwisezscoreprobabilities AT ortetphilippe aconfigurationspaceofhomologousproteinsconservingmutualinformationandallowingaphylogenyinferencebasedonpairwisezscoreprobabilities AT roysylvaine aconfigurationspaceofhomologousproteinsconservingmutualinformationandallowingaphylogenyinferencebasedonpairwisezscoreprobabilities AT marechaleric aconfigurationspaceofhomologousproteinsconservingmutualinformationandallowingaphylogenyinferencebasedonpairwisezscoreprobabilities AT bastienolivier configurationspaceofhomologousproteinsconservingmutualinformationandallowingaphylogenyinferencebasedonpairwisezscoreprobabilities AT ortetphilippe configurationspaceofhomologousproteinsconservingmutualinformationandallowingaphylogenyinferencebasedonpairwisezscoreprobabilities AT roysylvaine configurationspaceofhomologousproteinsconservingmutualinformationandallowingaphylogenyinferencebasedonpairwisezscoreprobabilities AT marechaleric configurationspaceofhomologousproteinsconservingmutualinformationandallowingaphylogenyinferencebasedonpairwisezscoreprobabilities |