Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon

BACKGROUND: Several recent studies have demonstrated the use of Roche 454 sequencing technology for de novo transcriptome analysis. Low error rates and high coverage also allow for effective SNP discovery and genetic diversity estimates. However, genetically diverse datasets, such as those sourced f...

Bibliographic Details
Main authors: O'Neil, Shawn T; Dzurisin, Jason DK; Carmichael, Rory D; Lobo, Neil F; Emrich, Scott J; Hellmann, Jessica J
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2887415/
https://www.ncbi.nlm.nih.gov/pubmed/20478048
http://dx.doi.org/10.1186/1471-2164-11-310
_version_ 1782182548323958784
author O'Neil, Shawn T
Dzurisin, Jason DK
Carmichael, Rory D
Lobo, Neil F
Emrich, Scott J
Hellmann, Jessica J
author_sort O'Neil, Shawn T
collection PubMed
description BACKGROUND: Several recent studies have demonstrated the use of Roche 454 sequencing technology for de novo transcriptome analysis. Low error rates and high coverage also allow for effective SNP discovery and genetic diversity estimates. However, genetically diverse datasets, such as those sourced from natural populations, pose challenges for assembly programs and subsequent analysis. Further, estimating the effectiveness of transcript discovery using Roche 454 transcriptome data is still a difficult task. RESULTS: Using the Roche 454 FLX Titanium platform, we sequenced and assembled larval transcriptomes for two butterfly species: the Propertius duskywing, Erynnis propertius (Lepidoptera: Hesperiidae) and the Anise swallowtail, Papilio zelicaon (Lepidoptera: Papilionidae). The Expressed Sequence Tags (ESTs) generated represent a diverse sample drawn from multiple populations, developmental stages, and stress treatments. Despite this diversity, > 95% of the ESTs assembled into long (> 714 bp on average) and highly covered (> 9.6× on average) contigs. To estimate the effectiveness of transcript discovery, we compared the number of bases in the hit region of unigenes (contigs and singletons) to the length of the best match silkworm (Bombyx mori) protein; this "ortholog hit ratio" gives a close estimate of the amount of the transcript discovered relative to a model lepidopteran genome. For each species, we tested two assembly programs and two parameter sets; although CAP3 is commonly used for such data, the assemblies produced by Celera Assembler with modified parameters were chosen over those produced by CAP3 based on contig and singleton counts as well as ortholog hit ratio analysis. In the final assemblies, 1,413 E. propertius and 1,940 P. zelicaon unigenes had a ratio > 0.8; 2,866 E. propertius and 4,015 P. zelicaon unigenes had a ratio > 0.5. 
CONCLUSIONS: Ultimately, these assemblies and SNP data will be used to generate microarrays for ecoinformatics examining climate change tolerance of different natural populations. These studies will benefit from high quality assemblies with few singletons (less than 26% of bases for each assembled transcriptome are present in unassembled singleton ESTs) and effective transcript discovery (over 6,500 of our putative orthologs cover at least 50% of the corresponding model silkworm gene).
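The "ortholog hit ratio" described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, and the unit handling is an assumption. The abstract compares bases in the hit region to the length of the best-match protein without stating units, so the sketch assumes the hit region is counted in nucleotides and the protein length in amino acids, and multiplies the latter by 3 to compare like with like.

```python
from typing import Iterable


def ortholog_hit_ratio(hit_region_bases: int, protein_length_aa: int) -> float:
    """Fraction of the best-match protein covered by the unigene's hit region.

    Assumption (not stated in the abstract): hit_region_bases is in
    nucleotides and protein_length_aa is in amino acids, so the protein
    length is converted to coding nucleotides by multiplying by 3.
    """
    if hit_region_bases < 0:
        raise ValueError("hit_region_bases must be non-negative")
    if protein_length_aa <= 0:
        raise ValueError("protein_length_aa must be positive")
    return hit_region_bases / (3 * protein_length_aa)


def count_above(ratios: Iterable[float], threshold: float) -> int:
    """Count unigenes whose ratio is strictly greater than the threshold."""
    return sum(1 for ratio in ratios if ratio > threshold)


if __name__ == "__main__":
    # Made-up (hit_region_bases, protein_length_aa) pairs for illustration only.
    hits = [(900, 300), (540, 300), (150, 300)]
    ratios = [ortholog_hit_ratio(bases, aa) for bases, aa in hits]
    print(count_above(ratios, 0.8), count_above(ratios, 0.5))
```

Under these assumptions, a ratio near 1.0 suggests the unigene spans nearly the whole silkworm ortholog, and the 0.8 and 0.5 cutoffs reported in the abstract correspond to calls such as `count_above(ratios, 0.8)`.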
format Text
id pubmed-2887415
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2887415 2010-06-18 Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon O'Neil, Shawn T Dzurisin, Jason DK Carmichael, Rory D Lobo, Neil F Emrich, Scott J Hellmann, Jessica J BMC Genomics Research Article BioMed Central 2010-05-17 /pmc/articles/PMC2887415/ /pubmed/20478048 http://dx.doi.org/10.1186/1471-2164-11-310 Text en Copyright ©2010 O'Neil et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2887415/
https://www.ncbi.nlm.nih.gov/pubmed/20478048
http://dx.doi.org/10.1186/1471-2164-11-310