
TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees

OBJECTIVE: The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/or molecu...

Bibliographic Details
Main authors: Sauvage, Thomas; Plouviez, Sophie; Schmidt, William E.; Fredericq, Suzanne
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Research Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5838971/
https://www.ncbi.nlm.nih.gov/pubmed/29506565
http://dx.doi.org/10.1186/s13104-018-3268-y
_version_ 1783304343001235456
author Sauvage, Thomas
Plouviez, Sophie
Schmidt, William E.
Fredericq, Suzanne
author_facet Sauvage, Thomas
Plouviez, Sophie
Schmidt, William E.
Fredericq, Suzanne
author_sort Sauvage, Thomas
collection PubMed
description OBJECTIVE: The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/or molecular diversity characterization (e.g. crypticism) often requires the building of exploratory phylogenetic trees with reference taxa. The subsequent step of segregating DNA sequences of interest based on observed topological relationships can represent a challenging task, especially for large datasets. RESULTS: We have written TREE2FASTA, a Perl script that enables and expedites the sorting of FASTA-formatted sequence data from exploratory phylogenetic trees. TREE2FASTA takes advantage of the interactive, rapid point-and-click color selection and/or annotations of tree leaves in the popular Java tree-viewer FigTree to segregate groups of FASTA sequences of interest to separate files. TREE2FASTA allows for both simple and nested segregation designs to facilitate the simultaneous preparation of multiple data sets that may overlap in sequence content. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13104-018-3268-y) contains supplementary material, which is available to authorized users.
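The sorting step described in the abstract can be illustrated with a minimal sketch. This is not the TREE2FASTA implementation (which is a Perl script); it is a Python illustration of the simple, color-based design only. It assumes that the FigTree-saved NEXUS file stores leaf colors as annotations of the form `[&!color=#rrggbb]` on the taxon labels, and that FASTA headers begin with the same name as the tree leaf. The function names `leaves_by_color`, `parse_fasta`, and `split_fasta_by_color` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a colored taxon label looks like  seqA[&!color=#ff0000]
# (optionally with the name in single quotes). Other annotation layouts
# are not handled by this sketch.
LABEL_RE = re.compile(
    r"^\s*'?([^'\[\s]+)'?\[&[^\]]*!color=(#[0-9A-Fa-f]{6})[^\]]*\]"
)


def leaves_by_color(nexus_text):
    """Map each color code (lowercased) to the leaf names carrying it."""
    groups = defaultdict(list)
    for line in nexus_text.splitlines():
        match = LABEL_RE.match(line)
        if match:
            groups[match.group(2).lower()].append(match.group(1))
    return dict(groups)


def parse_fasta(fasta_text):
    """Return {name: sequence}, using the first word of each header as name."""
    records = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def split_fasta_by_color(nexus_text, fasta_text):
    """Build one FASTA-formatted string per leaf color.

    Leaves without a color, or without a matching FASTA record, are skipped.
    """
    seqs = parse_fasta(fasta_text)
    output = {}
    for color, names in leaves_by_color(nexus_text).items():
        output[color] = "".join(
            f">{n}\n{seqs[n]}\n" for n in names if n in seqs
        )
    return output
```

Each value of the returned dictionary could then be written to its own file, one per color group. Nested designs, in which a sequence belongs to several output sets, would need annotations in addition to colors and are outside the scope of this sketch.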
format Online
Article
Text
id pubmed-5838971
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5838971 2018-03-09 TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees Sauvage, Thomas Plouviez, Sophie Schmidt, William E. Fredericq, Suzanne BMC Res Notes Research Note OBJECTIVE: The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/or molecular diversity characterization (e.g. crypticism) often requires the building of exploratory phylogenetic trees with reference taxa. The subsequent step of segregating DNA sequences of interest based on observed topological relationships can represent a challenging task, especially for large datasets. RESULTS: We have written TREE2FASTA, a Perl script that enables and expedites the sorting of FASTA-formatted sequence data from exploratory phylogenetic trees. TREE2FASTA takes advantage of the interactive, rapid point-and-click color selection and/or annotations of tree leaves in the popular Java tree-viewer FigTree to segregate groups of FASTA sequences of interest to separate files. TREE2FASTA allows for both simple and nested segregation designs to facilitate the simultaneous preparation of multiple data sets that may overlap in sequence content. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13104-018-3268-y) contains supplementary material, which is available to authorized users.
BioMed Central 2018-03-05 /pmc/articles/PMC5838971/ /pubmed/29506565 http://dx.doi.org/10.1186/s13104-018-3268-y Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Note
Sauvage, Thomas
Plouviez, Sophie
Schmidt, William E.
Fredericq, Suzanne
TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees
title TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees
topic Research Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5838971/
https://www.ncbi.nlm.nih.gov/pubmed/29506565
http://dx.doi.org/10.1186/s13104-018-3268-y