
ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses

BACKGROUND: Fungi play critical roles in many ecosystems, cause serious diseases in plants and animals, and pose significant threats to human health and structural integrity problems in built environments. While most fungal diversity remains unknown, the development of PCR primers for the internal t...


Bibliographic Details
Main authors: Fouquier, Jennifer; Rideout, Jai Ram; Bolyen, Evan; Chase, John; Shiffer, Arron; McDonald, Daniel; Knight, Rob; Caporaso, J Gregory; Kelley, Scott T.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765138/
https://www.ncbi.nlm.nih.gov/pubmed/26905735
http://dx.doi.org/10.1186/s40168-016-0153-6
author Fouquier, Jennifer
Rideout, Jai Ram
Bolyen, Evan
Chase, John
Shiffer, Arron
McDonald, Daniel
Knight, Rob
Caporaso, J Gregory
Kelley, Scott T.
collection PubMed
description BACKGROUND: Fungi play critical roles in many ecosystems, cause serious diseases in plants and animals, and pose significant threats to human health and to the structural integrity of built environments. While most fungal diversity remains unknown, the development of PCR primers for the internal transcribed spacer (ITS), combined with next-generation sequencing, has substantially improved our ability to profile fungal diversity. Although the high sequence variability in the ITS region facilitates more accurate species identification, it also makes multiple sequence alignment and phylogenetic analysis unreliable across evolutionarily distant fungi because the sequences are hard to align accurately. To address this issue, we created ghost-tree, a bioinformatics tool that integrates sequence data from two genetic markers into a single phylogenetic tree that can be used for diversity analyses. Our approach starts with a “foundation” phylogeny based on one genetic marker whose sequences can be aligned across organisms spanning divergent taxonomic groups (e.g., fungal families). Then, “extension” phylogenies are built for more closely related organisms (e.g., fungal species or strains) using a second, more rapidly evolving genetic marker. These smaller phylogenies are then grafted onto the foundation tree by mapping taxonomic names, such that each corresponding foundation-tree tip branches into its new “extension tree” child.
RESULTS: We applied ghost-tree to graft fungal extension phylogenies derived from ITS sequences onto a foundation phylogeny derived from fungal 18S sequences. Our analysis of simulated and real fungal ITS data sets found that phylogenetic distances between fungal communities computed using ghost-tree phylogenies explained significantly more variance than non-phylogenetic distances. The phylogenetic metrics also improved our ability to distinguish small differences (effect sizes) between microbial communities, though results were similar to non-phylogenetic methods for larger effect sizes.
CONCLUSIONS: The Silva/UNITE-based ghost tree presented here can be easily integrated into existing fungal analysis pipelines to enhance the resolution of fungal community differences and improve understanding of these communities in built environments. The ghost-tree software package can also be used to develop phylogenetic trees for other marker gene sets that afford different taxonomic resolution, or for bridging genome trees with amplicon trees.
AVAILABILITY: ghost-tree is pip-installable. All source code, documentation, and test code are available under the BSD license at https://github.com/JTFouquier/ghost-tree.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-016-0153-6) contains supplementary material, which is available to authorized users.
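The grafting step described in the abstract can be illustrated with a minimal sketch. It is not ghost-tree's own code or API: the Node class, the graft and newick helpers, the taxon names, and the branch lengths are all hypothetical, chosen only to show how an extension tree might be attached beneath the foundation tip that shares its taxonomic name. The sketch assumes that foundation tips without a matching extension tree are simply left in place, which may differ from what the published tool does.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Minimal rooted tree node (hypothetical; not ghost-tree's own class)."""
    name: Optional[str] = None
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)

    def tips(self) -> List["Node"]:
        """Return the leaf nodes below this node, left to right."""
        if not self.children:
            return [self]
        return [tip for child in self.children for tip in child.tips()]


def graft(foundation: Node, extensions: Dict[str, Node]) -> Node:
    """Attach each extension tree under the foundation tip with the same name.

    The foundation tip keeps its name and branch length and gains the
    extension root's children. Tips with no extension are left untouched
    (an assumption of this sketch). The foundation is modified in place.
    """
    for tip in foundation.tips():  # tip list is built before any mutation
        extension = extensions.get(tip.name)
        if extension is not None:
            tip.children = list(extension.children)
    return foundation


def newick(node: Node) -> str:
    """Serialize a node and its descendants as Newick text (no trailing ';')."""
    text = ""
    if node.children:
        text = "(" + ",".join(newick(child) for child in node.children) + ")"
    text += node.name or ""
    if node.length is not None:
        text += f":{node.length}"
    return text


# Foundation tree from a slowly evolving marker (e.g. 18S), tips named by genus.
foundation = Node(children=[
    Node(length=0.1, children=[
        Node("Aspergillus", 0.3),
        Node("Penicillium", 0.2),
    ]),
    Node("Candida", 0.5),
])

# Extension trees from a fast-evolving marker (e.g. ITS), keyed by genus name.
extensions = {
    "Aspergillus": Node(children=[Node("otu1", 0.01), Node("otu2", 0.02)]),
    "Candida": Node(children=[Node("otu3", 0.03), Node("otu4", 0.01)]),
}

grafted = graft(foundation, extensions)
print(newick(grafted) + ";")
```

In this toy example the tips of the combined tree are the ITS-level OTUs, while the deeper branching order comes from the foundation marker, which is the property that lets phylogenetic distance metrics be computed across distantly related fungi.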
format Online
Article
Text
id pubmed-4765138
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4765138 2016-02-25 ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses Microbiome Software BioMed Central 2016-02-24 /pmc/articles/PMC4765138/ /pubmed/26905735 http://dx.doi.org/10.1186/s40168-016-0153-6 Text en
© Fouquier et al. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses
topic Software