
Exploration of phylogenetic data using a global sequence analysis method

BACKGROUND: Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets. RESULTS: Here we explore t...


Bibliographic Details
Main Authors: Chapus, Charles; Dufraigne, Christine; Edwards, Scott; Giron, Alain; Fertil, Bernard; Deschavanne, Patrick
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1310607/
https://www.ncbi.nlm.nih.gov/pubmed/16280081
http://dx.doi.org/10.1186/1471-2148-5-63
_version_ 1782126311180861440
author Chapus, Charles
Dufraigne, Christine
Edwards, Scott
Giron, Alain
Fertil, Bernard
Deschavanne, Patrick
author_facet Chapus, Charles
Dufraigne, Christine
Edwards, Scott
Giron, Alain
Fertil, Bernard
Deschavanne, Patrick
author_sort Chapus, Charles
collection PubMed
description BACKGROUND: Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets. RESULTS: Here we explore the phylogenetic signal present in molecular data by genomic signatures, defined as the set of frequencies of short oligonucleotides present in DNA sequences. Although violating many of the standard assumptions of traditional phylogenetic analyses – in particular explicit statements of homology inherent in character matrices – the use of the signature does permit the analysis of very long sequences, even those that are unalignable, and is therefore most useful in cases where alignment is questionable. We compare the results obtained by traditional phylogenetic methods to those inferred by the signature method for two genes: RAG1, which is easily alignable, and 18S RNA, where alignments are often ambiguous for some regions. We also apply this method to a multigene data set of 33 genes for 9 bacteria and one archaeal species as well as to the whole genome of a set of 16 γ-proteobacteria. In addition to delivering phylogenetic results comparable to traditional methods, the comparison of signatures for the sequences involved in the bacterial example identified putative candidates for horizontal gene transfers. CONCLUSION: The signature method is therefore a fast tool for exploring phylogenetic data, providing not only a pretreatment for discovering new sequence relationships, but also for identifying cases of sequence evolution that could confound traditional phylogenetic analysis.
format Text
id pubmed-1310607
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1310607 2005-12-10 Exploration of phylogenetic data using a global sequence analysis method Chapus, Charles Dufraigne, Christine Edwards, Scott Giron, Alain Fertil, Bernard Deschavanne, Patrick BMC Evol Biol Methodology Article BACKGROUND: Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets. RESULTS: Here we explore the phylogenetic signal present in molecular data by genomic signatures, defined as the set of frequencies of short oligonucleotides present in DNA sequences. Although violating many of the standard assumptions of traditional phylogenetic analyses – in particular explicit statements of homology inherent in character matrices – the use of the signature does permit the analysis of very long sequences, even those that are unalignable, and is therefore most useful in cases where alignment is questionable. We compare the results obtained by traditional phylogenetic methods to those inferred by the signature method for two genes: RAG1, which is easily alignable, and 18S RNA, where alignments are often ambiguous for some regions. We also apply this method to a multigene data set of 33 genes for 9 bacteria and one archaeal species as well as to the whole genome of a set of 16 γ-proteobacteria. In addition to delivering phylogenetic results comparable to traditional methods, the comparison of signatures for the sequences involved in the bacterial example identified putative candidates for horizontal gene transfers. CONCLUSION: The signature method is therefore a fast tool for exploring phylogenetic data, providing not only a pretreatment for discovering new sequence relationships, but also for identifying cases of sequence evolution that could confound traditional phylogenetic analysis.
BioMed Central 2005-11-09 /pmc/articles/PMC1310607/ /pubmed/16280081 http://dx.doi.org/10.1186/1471-2148-5-63 Text en Copyright © 2005 Chapus et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Chapus, Charles
Dufraigne, Christine
Edwards, Scott
Giron, Alain
Fertil, Bernard
Deschavanne, Patrick
Exploration of phylogenetic data using a global sequence analysis method
title Exploration of phylogenetic data using a global sequence analysis method
title_full Exploration of phylogenetic data using a global sequence analysis method
title_fullStr Exploration of phylogenetic data using a global sequence analysis method
title_full_unstemmed Exploration of phylogenetic data using a global sequence analysis method
title_short Exploration of phylogenetic data using a global sequence analysis method
title_sort exploration of phylogenetic data using a global sequence analysis method
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1310607/
https://www.ncbi.nlm.nih.gov/pubmed/16280081
http://dx.doi.org/10.1186/1471-2148-5-63
work_keys_str_mv AT chapuscharles explorationofphylogeneticdatausingaglobalsequenceanalysismethod
AT dufraignechristine explorationofphylogeneticdatausingaglobalsequenceanalysismethod
AT edwardsscott explorationofphylogeneticdatausingaglobalsequenceanalysismethod
AT gironalain explorationofphylogeneticdatausingaglobalsequenceanalysismethod
AT fertilbernard explorationofphylogeneticdatausingaglobalsequenceanalysismethod
AT deschavannepatrick explorationofphylogeneticdatausingaglobalsequenceanalysismethod
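
The description field above defines a genomic signature as the set of frequencies of short oligonucleotides in a DNA sequence. The following is a minimal sketch of that idea, not the authors' implementation: the record does not state the word length or the distance measure used in the article, so the default `k=4`, the Euclidean distance, and the function names `signature` and `signature_distance` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def signature(seq, k=4):
    """Frequencies of all 4**k DNA words of length k, in lexicographic order.

    Assumption: overlapping windows are counted, and windows containing
    any character other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid word of length k")
    return [counts[w] / total for w in words]


def signature_distance(sig_a, sig_b):
    """Euclidean distance between two signatures (an assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    s1 = signature("ACGTACGTACGTTTGA", k=2)
    s2 = signature("ACGTACGAACGTTTGC", k=2)
    print(len(s1), signature_distance(s1, s2))
```

Because a signature needs no alignment, a matrix of pairwise distances like these could be passed to any distance-based tree-building routine; which routine the article used is not stated in this record.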