
A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities

BACKGROUND: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence al...


Bibliographic Details
Main authors: Bastien, Olivier; Ortet, Philippe; Roy, Sylvaine; Maréchal, Eric
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC555736/
https://www.ncbi.nlm.nih.gov/pubmed/15757521
http://dx.doi.org/10.1186/1471-2105-6-49
author Bastien, Olivier
Ortet, Philippe
Roy, Sylvaine
Maréchal, Eric
collection PubMed
description BACKGROUND: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. RESULTS: We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment-based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. CONCLUSION: The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionarily consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.
format Text
id pubmed-555736
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-555736 2005-04-01 BMC Bioinformatics Research Article BioMed Central 2005-03-10 /pmc/articles/PMC555736/ /pubmed/15757521 http://dx.doi.org/10.1186/1471-2105-6-49 Text en Copyright © 2005 Bastien et al; licensee BioMed Central Ltd.
title A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC555736/
https://www.ncbi.nlm.nih.gov/pubmed/15757521
http://dx.doi.org/10.1186/1471-2105-6-49
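
The description field above characterises Z-scores as Monte Carlo methods applied to pair-wise comparisons. As a rough illustration of that idea only, the sketch below scores a real pair of sequences, then scores the first sequence against shuffled copies of the second, and expresses the real score in standard deviations above the shuffled mean. Everything in it is an assumption made for illustration and is not taken from the article: the function names `alignment_score` and `z_score`, the toy match/mismatch/gap scheme (used here in place of a real substitution matrix), the number of shuffles, and the choice to shuffle only one sequence.

```python
import random
import statistics


def alignment_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Global alignment score (Needleman-Wunsch) under a toy scoring scheme."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def z_score(a: str, b: str, n_shuffles: int = 200, seed: int = 0) -> float:
    """Z-score of the real score against scores of a vs. shuffled copies of b."""
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    real = alignment_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # shuffling preserves length and composition
        shuffled_scores.append(alignment_score(a, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    stdev = statistics.stdev(shuffled_scores)
    if stdev == 0:
        raise ValueError("shuffled scores have zero variance; Z-score undefined")
    return (real - mean) / stdev
```

Because shuffling keeps the composition and length of the second sequence, a large Z-score indicates that the observed score depends on residue order rather than on composition alone. The value is a Monte Carlo estimate, so it varies with `seed` and `n_shuffles`.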