Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms
BACKGROUND: Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially fr...
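Main authors: | Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H |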
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4255452/ https://www.ncbi.nlm.nih.gov/pubmed/25407802 http://dx.doi.org/10.1186/s12859-014-0350-x |
Field | Value |
---|---|
author | Speiser, Daniel I Pankey, M Sabrina Zaharoff, Alexander K Battelle, Barbara A Bracken-Grissom, Heather D Breinholt, Jesse W Bybee, Seth M Cronin, Thomas W Garm, Anders Lindgren, Annie R Patel, Nipam H Porter, Megan L Protas, Meredith E Rivera, Ajna S Serb, Jeanne M Zigler, Kirk S Crandall, Keith A Oakley, Todd H |
author_sort | Speiser, Daniel I |
collection | PubMed |
description | BACKGROUND: Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. RESULTS: We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. 
Our new trees for LIT genes are available on the Bitbucket public repository (http://bitbucket.org/osiris_phylogenetics/pia/) and we demonstrate PIA on a publicly-accessible web server (http://galaxy-dev.cnsi.ucsb.edu/pia/). CONCLUSIONS: Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0350-x) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4255452 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4255452 2014-12-05 BMC Bioinformatics Software BioMed Central 2014-11-19 /pmc/articles/PMC4255452/ /pubmed/25407802 http://dx.doi.org/10.1186/s12859-014-0350-x Text en © Speiser et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms |
topic | Software |