Treetrimmer: a method for phylogenetic dataset size reduction

BACKGROUND: With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive a...

Bibliographic Details
Main Authors: Maruyama, Shinichiro; Eveleigh, Robert JM; Archibald, John M
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3637088/
https://www.ncbi.nlm.nih.gov/pubmed/23587045
http://dx.doi.org/10.1186/1756-0500-6-145
_version_ 1782267402882383872
author Maruyama, Shinichiro
Eveleigh, Robert JM
Archibald, John M
author_facet Maruyama, Shinichiro
Eveleigh, Robert JM
Archibald, John M
author_sort Maruyama, Shinichiro
collection PubMed
description BACKGROUND: With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive and manual ‘pruning’ of sequence alignments is time consuming and raises concerns over reproducibility. There is a need for bioinformatic tools with which to objectively carry out such pruning procedures. FINDINGS: Here we present ‘TreeTrimmer’, a bioinformatics procedure that removes unnecessary redundancy in large phylogenetic datasets, alleviating the size effect on more rigorous downstream analyses. The method identifies and removes user-defined ‘redundant’ sequences, e.g., orthologous sequences from closely related organisms and ‘recently’ evolved lineage-specific paralogs. Representative OTUs are retained for more rigorous re-analysis. CONCLUSIONS: TreeTrimmer reduces the OTU density of phylogenetic trees without sacrificing taxonomic diversity while retaining the original tree topology, thereby speeding up downstream computer-intensive analyses, e.g., Bayesian and maximum likelihood tree reconstructions, in a reproducible fashion.
format Online
Article
Text
id pubmed-3637088
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3637088 2013-04-27 Treetrimmer: a method for phylogenetic dataset size reduction Maruyama, Shinichiro Eveleigh, Robert JM Archibald, John M BMC Res Notes Technical Note BACKGROUND: With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive and manual ‘pruning’ of sequence alignments is time consuming and raises concerns over reproducibility. There is a need for bioinformatic tools with which to objectively carry out such pruning procedures. FINDINGS: Here we present ‘TreeTrimmer’, a bioinformatics procedure that removes unnecessary redundancy in large phylogenetic datasets, alleviating the size effect on more rigorous downstream analyses. The method identifies and removes user-defined ‘redundant’ sequences, e.g., orthologous sequences from closely related organisms and ‘recently’ evolved lineage-specific paralogs. Representative OTUs are retained for more rigorous re-analysis. CONCLUSIONS: TreeTrimmer reduces the OTU density of phylogenetic trees without sacrificing taxonomic diversity while retaining the original tree topology, thereby speeding up downstream computer-intensive analyses, e.g., Bayesian and maximum likelihood tree reconstructions, in a reproducible fashion. BioMed Central 2013-04-12 /pmc/articles/PMC3637088/ /pubmed/23587045 http://dx.doi.org/10.1186/1756-0500-6-145 Text en Copyright © 2013 Maruyama et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Technical Note
Maruyama, Shinichiro
Eveleigh, Robert JM
Archibald, John M
Treetrimmer: a method for phylogenetic dataset size reduction
title Treetrimmer: a method for phylogenetic dataset size reduction
title_sort treetrimmer: a method for phylogenetic dataset size reduction
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3637088/
https://www.ncbi.nlm.nih.gov/pubmed/23587045
http://dx.doi.org/10.1186/1756-0500-6-145
work_keys_str_mv AT maruyamashinichiro treetrimmeramethodforphylogeneticdatasetsizereduction
AT eveleighrobertjm treetrimmeramethodforphylogeneticdatasetsizereduction
AT archibaldjohnm treetrimmeramethodforphylogeneticdatasetsizereduction
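
The description field above says the method removes user-defined 'redundant' sequences and retains representative OTUs. The idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nested-tuple tree encoding, the `label_of` mapping, the function names, and the rule of keeping the alphabetically first leaf as the representative are all assumptions made for illustration only.

```python
from typing import Dict, List, Union

# Hypothetical tree encoding: a leaf is an OTU name (str),
# an internal node is a tuple of child subtrees.
Tree = Union[str, tuple]


def leaves(tree: Tree) -> List[str]:
    """Return the OTU names under a subtree, left to right."""
    if isinstance(tree, str):
        return [tree]
    names: List[str] = []
    for child in tree:
        names.extend(leaves(child))
    return names


def trim(tree: Tree, label_of: Dict[str, str]) -> Tree:
    """Collapse every clade whose leaves all share one user-defined label.

    Assumption: a clade is 'redundant' when all of its OTUs map to the
    same label; one representative (alphabetically first) is retained.
    """
    if isinstance(tree, str):
        return tree
    names = leaves(tree)
    if len({label_of[name] for name in names}) == 1:
        return sorted(names)[0]  # keep a single representative OTU
    # Mixed clade: keep the branching order and recurse into the children.
    return tuple(trim(child, label_of) for child in tree)


# Toy example with genus-level labels (invented data).
tree = ((("Homo_1", "Homo_2"), "Pan_1"),
        ("Arabidopsis_1", ("Oryza_1", "Oryza_2")))
label_of = {
    "Homo_1": "Homo", "Homo_2": "Homo", "Pan_1": "Pan",
    "Arabidopsis_1": "Arabidopsis", "Oryza_1": "Oryza", "Oryza_2": "Oryza",
}
trimmed = trim(tree, label_of)
print(trimmed)
```

In this toy tree the two Homo and the two Oryza leaves each collapse to one representative, while the branching order among the remaining four labels is unchanged. Choosing a coarser label (for example, order instead of genus) would collapse larger clades.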