
Phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective

BACKGROUND: Comparative sequence analysis of the 16S rRNA gene is frequently used to characterize the microbial diversity of environmental samples. However, sequence similarities do not always imply functional or evolutionary relatedness due to many factors, including unequal rates of change and con...


Bibliographic Details
Main Authors: Smits, Samuel A, Ouverney, Cleber C
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3026365/
https://www.ncbi.nlm.nih.gov/pubmed/20946601
http://dx.doi.org/10.1186/1471-2105-11-S6-S18
_version_ 1782197036026691584
author Smits, Samuel A
Ouverney, Cleber C
author_facet Smits, Samuel A
Ouverney, Cleber C
author_sort Smits, Samuel A
collection PubMed
description BACKGROUND: Comparative sequence analysis of the 16S rRNA gene is frequently used to characterize the microbial diversity of environmental samples. However, sequence similarities do not always imply functional or evolutionary relatedness due to many factors, including unequal rates of change and convergence. Thus, relying on top BLASTN hits for phylogenetic studies may misrepresent the diversity of these constituents. Furthermore, attempts to circumvent this issue by including a large number of BLASTN hits per sequence in one tree to explore their relatedness present other problems. For instance, the multiple sequence alignment will be poor and computationally costly if not relying on manual alignment, and it may be difficult to derive meaningful relationships from the resulting tree. Analyzing sequence relationship networks within collective BLASTN results, however, reveals sequences that are closely related despite low rank. RESULTS: We have developed a web application, Phylometrics, that relies on networks of collective BLASTN results (rather than single BLASTN hits) to facilitate the process of building phylogenetic trees in an automated, high-throughput fashion while offering novel tools to find sequences that are of significant phylogenetic interest with minimal human involvement. The application, which can be installed locally in a laboratory or hosted remotely, utilizes a simple wizard-style format to guide the user through the pipeline without necessitating a background in programming. Furthermore, Phylometrics implements an independent job queuing system that enables users to continue to use the system while jobs are run with little or no degradation in performance. CONCLUSIONS: Phylometrics provides a novel data mining method to screen supplied DNA sequences and to identify sequences that are of significant phylogenetic interest using powerful analytical tools. Sequences that are identified as being similar to a number of supplied sequences may provide key insights into their functional or evolutionary relatedness. Users require the same basic computer skills as for navigating most internet applications.
format Text
id pubmed-3026365
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30263652011-01-26 Phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective Smits, Samuel A Ouverney, Cleber C BMC Bioinformatics Proceedings BACKGROUND: Comparative sequence analysis of the 16S rRNA gene is frequently used to characterize the microbial diversity of environmental samples. However, sequence similarities do not always imply functional or evolutionary relatedness due to many factors, including unequal rates of change and convergence. Thus, relying on top BLASTN hits for phylogenetic studies may misrepresent the diversity of these constituents. Furthermore, attempts to circumvent this issue by including a large number of BLASTN hits per sequence in one tree to explore their relatedness present other problems. For instance, the multiple sequence alignment will be poor and computationally costly if not relying on manual alignment, and it may be difficult to derive meaningful relationships from the resulting tree. Analyzing sequence relationship networks within collective BLASTN results, however, reveals sequences that are closely related despite low rank. RESULTS: We have developed a web application, Phylometrics, that relies on networks of collective BLASTN results (rather than single BLASTN hits) to facilitate the process of building phylogenetic trees in an automated, high-throughput fashion while offering novel tools to find sequences that are of significant phylogenetic interest with minimal human involvement. The application, which can be installed locally in a laboratory or hosted remotely, utilizes a simple wizard-style format to guide the user through the pipeline without necessitating a background in programming. Furthermore, Phylometrics implements an independent job queuing system that enables users to continue to use the system while jobs are run with little or no degradation in performance. 
CONCLUSIONS: Phylometrics provides a novel data mining method to screen supplied DNA sequences and to identify sequences that are of significant phylogenetic interest using powerful analytical tools. Sequences that are identified as being similar to a number of supplied sequences may provide key insights into their functional or evolutionary relatedness. Users require the same basic computer skills as for navigating most internet applications. BioMed Central 2010-10-07 /pmc/articles/PMC3026365/ /pubmed/20946601 http://dx.doi.org/10.1186/1471-2105-11-S6-S18 Text en Copyright ©2010 Smits and Ouverney; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Smits, Samuel A
Ouverney, Cleber C
Phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective
title Phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective
title_full Phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective
title_fullStr Phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective
title_full_unstemmed Phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective
title_short Phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective
title_sort phylometrics: a pipeline for inferring phylogenetic trees from a sequence relationship network perspective
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3026365/
https://www.ncbi.nlm.nih.gov/pubmed/20946601
http://dx.doi.org/10.1186/1471-2105-11-S6-S18
work_keys_str_mv AT smitssamuela phylometricsapipelineforinferringphylogenetictreesfromasequencerelationshipnetworkperspective
AT ouverneycleberc phylometricsapipelineforinferringphylogenetictreesfromasequencerelationshipnetworkperspective
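
Illustrative note: the abstract's central idea, that sequences shared across the collective BLASTN results of many queries can be of interest even when they rank low for each individual query, can be sketched roughly as follows. This is a minimal sketch under stated assumptions and not the Phylometrics implementation: the function name `shared_hits`, the `min_queries` threshold, the input layout (a mapping from query ID to a rank-ordered list of subject IDs), and the toy IDs are all hypothetical and do not come from the record.

```python
from collections import defaultdict


def shared_hits(blast_hits, min_queries=2):
    """Find subject sequences hit by several queries.

    blast_hits: dict mapping a query ID to its list of subject IDs,
    ordered best hit first (hypothetical input layout).
    Returns a list of (subject, number_of_queries, mean_rank) tuples,
    sorted so that the most widely shared subjects come first.
    """
    # subject -> {query: best rank of that subject for that query}
    ranks = defaultdict(dict)
    for query, subjects in blast_hits.items():
        for rank, subject in enumerate(subjects, start=1):
            # Keep only the first (best) rank if a subject repeats.
            ranks[subject].setdefault(query, rank)

    results = []
    for subject, per_query in ranks.items():
        if len(per_query) >= min_queries:
            mean_rank = sum(per_query.values()) / len(per_query)
            results.append((subject, len(per_query), mean_rank))

    # Most shared first; ties broken by better mean rank, then by ID.
    results.sort(key=lambda r: (-r[1], r[2], r[0]))
    return results


if __name__ == "__main__":
    # Toy data: "s9" is never a top hit, yet every query reaches it.
    hits = {
        "q1": ["s1", "s2", "s9"],
        "q2": ["s3", "s9", "s4"],
        "q3": ["s5", "s6", "s9"],
    }
    for subject, n_queries, mean_rank in shared_hits(hits):
        print(subject, n_queries, round(mean_rank, 2))
```

In the toy data, a top-hit-only analysis would select `s1`, `s3`, and `s5` and never surface `s9`, whereas counting edges in the query-to-subject network highlights it. A real pipeline would also need to weigh alignment scores and E-values, which this sketch ignores.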