Comparative description of ten transcriptomes of newly sequenced invertebrates and efficiency estimation of genomic sampling in non-model taxa

INTRODUCTION: Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence...

Bibliographic Details
Main Authors:	Riesgo, Ana; Andrade, Sónia C S; Sharma, Prashant P; Novo, Marta; Pérez-Porro, Alicia R; Vahtera, Varpu; González, Vanessa L; Kawauchi, Gisele Y; Giribet, Gonzalo
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3538665/ https://www.ncbi.nlm.nih.gov/pubmed/23190771 http://dx.doi.org/10.1186/1742-9994-9-33

author	Riesgo, Ana; Andrade, Sónia C S; Sharma, Prashant P; Novo, Marta; Pérez-Porro, Alicia R; Vahtera, Varpu; González, Vanessa L; Kawauchi, Gisele Y; Giribet, Gonzalo
author_sort	Riesgo, Ana
collection	PubMed
description	INTRODUCTION: Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. RESULTS: cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. 
CONCLUSIONS: We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable catalogue of annotated genes (or gene fragments) and protein families for ten newly sequenced non-model organisms, some of commercial importance (i.e., Octopus vulgaris). These comprehensive sets of genes can be readily used for phylogenetic analysis, gene expression profiling, developmental analysis, and can also be a powerful resource for gene discovery. The characterization of the transcriptomes of such a diverse array of animal species permitted the comparison of sequencing depth, functional annotation, and efficiency of genomic sampling using the same pipelines, which proved to be similar for all considered species. In addition, the datasets revealed their potential as a resource for paralogue detection, a recurrent concern in various aspects of biological inquiry, including phylogenetics, molecular evolution, development, and cellular biochemistry.
format	Online Article Text
id	pubmed-3538665
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3538665 2013-01-10 Front Zool Research BioMed Central 2012-11-29 /pmc/articles/PMC3538665/ /pubmed/23190771 http://dx.doi.org/10.1186/1742-9994-9-33 Text en Copyright ©2012 Riesgo et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Comparative description of ten transcriptomes of newly sequenced invertebrates and efficiency estimation of genomic sampling in non-model taxa
topic	Research
