Treetrimmer: a method for phylogenetic dataset size reduction
BACKGROUND: With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive a...
Main authors: | Maruyama, Shinichiro; Eveleigh, Robert JM; Archibald, John M |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3637088/ https://www.ncbi.nlm.nih.gov/pubmed/23587045 http://dx.doi.org/10.1186/1756-0500-6-145 |
_version_ | 1782267402882383872 |
---|---|
author | Maruyama, Shinichiro Eveleigh, Robert JM Archibald, John M |
author_facet | Maruyama, Shinichiro Eveleigh, Robert JM Archibald, John M |
author_sort | Maruyama, Shinichiro |
collection | PubMed |
description | BACKGROUND: With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive and manual ‘pruning’ of sequence alignments is time consuming and raises concerns over reproducibility. There is a need for bioinformatic tools with which to objectively carry out such pruning procedures. FINDINGS: Here we present ‘TreeTrimmer’, a bioinformatics procedure that removes unnecessary redundancy in large phylogenetic datasets, alleviating the size effect on more rigorous downstream analyses. The method identifies and removes user-defined ‘redundant’ sequences, e.g., orthologous sequences from closely related organisms and ‘recently’ evolved lineage-specific paralogs. Representative OTUs are retained for more rigorous re-analysis. CONCLUSIONS: TreeTrimmer reduces the OTU density of phylogenetic trees without sacrificing taxonomic diversity while retaining the original tree topology, thereby speeding up downstream computer-intensive analyses, e.g., Bayesian and maximum likelihood tree reconstructions, in a reproducible fashion. |
format | Online Article Text |
id | pubmed-3637088 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-36370882013-04-27 Treetrimmer: a method for phylogenetic dataset size reduction Maruyama, Shinichiro Eveleigh, Robert JM Archibald, John M BMC Res Notes Technical Note BACKGROUND: With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive and manual ‘pruning’ of sequence alignments is time consuming and raises concerns over reproducibility. There is a need for bioinformatic tools with which to objectively carry out such pruning procedures. FINDINGS: Here we present ‘TreeTrimmer’, a bioinformatics procedure that removes unnecessary redundancy in large phylogenetic datasets, alleviating the size effect on more rigorous downstream analyses. The method identifies and removes user-defined ‘redundant’ sequences, e.g., orthologous sequences from closely related organisms and ‘recently’ evolved lineage-specific paralogs. Representative OTUs are retained for more rigorous re-analysis. CONCLUSIONS: TreeTrimmer reduces the OTU density of phylogenetic trees without sacrificing taxonomic diversity while retaining the original tree topology, thereby speeding up downstream computer-intensive analyses, e.g., Bayesian and maximum likelihood tree reconstructions, in a reproducible fashion. BioMed Central 2013-04-12 /pmc/articles/PMC3637088/ /pubmed/23587045 http://dx.doi.org/10.1186/1756-0500-6-145 Text en Copyright © 2013 Maruyama et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Maruyama, Shinichiro Eveleigh, Robert JM Archibald, John M Treetrimmer: a method for phylogenetic dataset size reduction |
title | Treetrimmer: a method for phylogenetic dataset size reduction |
title_full | Treetrimmer: a method for phylogenetic dataset size reduction |
title_fullStr | Treetrimmer: a method for phylogenetic dataset size reduction |
title_full_unstemmed | Treetrimmer: a method for phylogenetic dataset size reduction |
title_short | Treetrimmer: a method for phylogenetic dataset size reduction |
title_sort | treetrimmer: a method for phylogenetic dataset size reduction |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3637088/ https://www.ncbi.nlm.nih.gov/pubmed/23587045 http://dx.doi.org/10.1186/1756-0500-6-145 |
work_keys_str_mv | AT maruyamashinichiro treetrimmeramethodforphylogeneticdatasetsizereduction AT eveleighrobertjm treetrimmeramethodforphylogeneticdatasetsizereduction AT archibaldjohnm treetrimmeramethodforphylogeneticdatasetsizereduction |