Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon
BACKGROUND: Several recent studies have demonstrated the use of Roche 454 sequencing technology for de novo transcriptome analysis. Low error rates and high coverage also allow for effective SNP discovery and genetic diversity estimates. However, genetically diverse datasets, such as those sourced f...
Main authors: | O'Neil, Shawn T; Dzurisin, Jason DK; Carmichael, Rory D; Lobo, Neil F; Emrich, Scott J; Hellmann, Jessica J |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2010 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2887415/ https://www.ncbi.nlm.nih.gov/pubmed/20478048 http://dx.doi.org/10.1186/1471-2164-11-310 |
_version_ | 1782182548323958784 |
---|---|
author | O'Neil, Shawn T Dzurisin, Jason DK Carmichael, Rory D Lobo, Neil F Emrich, Scott J Hellmann, Jessica J |
author_sort | O'Neil, Shawn T |
collection | PubMed |
description | BACKGROUND: Several recent studies have demonstrated the use of Roche 454 sequencing technology for de novo transcriptome analysis. Low error rates and high coverage also allow for effective SNP discovery and genetic diversity estimates. However, genetically diverse datasets, such as those sourced from natural populations, pose challenges for assembly programs and subsequent analysis. Further, estimating the effectiveness of transcript discovery using Roche 454 transcriptome data is still a difficult task. RESULTS: Using the Roche 454 FLX Titanium platform, we sequenced and assembled larval transcriptomes for two butterfly species: the Propertius duskywing, Erynnis propertius (Lepidoptera: Hesperiidae) and the Anise swallowtail, Papilio zelicaon (Lepidoptera: Papilionidae). The Expressed Sequence Tags (ESTs) generated represent a diverse sample drawn from multiple populations, developmental stages, and stress treatments. Despite this diversity, > 95% of the ESTs assembled into long (> 714 bp on average) and highly covered (> 9.6× on average) contigs. To estimate the effectiveness of transcript discovery, we compared the number of bases in the hit region of unigenes (contigs and singletons) to the length of the best match silkworm (Bombyx mori) protein--this "ortholog hit ratio" gives a close estimate on the amount of the transcript discovered relative to a model lepidopteran genome. For each species, we tested two assembly programs and two parameter sets; although CAP3 is commonly used for such data, the assemblies produced by Celera Assembler with modified parameters were chosen over those produced by CAP3 based on contig and singleton counts as well as ortholog hit ratio analysis. In the final assemblies, 1,413 E. propertius and 1,940 P. zelicaon unigenes had a ratio > 0.8; 2,866 E. propertius and 4,015 P. zelicaon unigenes had a ratio > 0.5. CONCLUSIONS: Ultimately, these assemblies and SNP data will be used to generate microarrays for ecoinformatics examining climate change tolerance of different natural populations. These studies will benefit from high quality assemblies with few singletons (less than 26% of bases for each assembled transcriptome are present in unassembled singleton ESTs) and effective transcript discovery (over 6,500 of our putative orthologs cover at least 50% of the corresponding model silkworm gene). |
format | Text |
id | pubmed-2887415 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2887415 2010-06-18 Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon O'Neil, Shawn T Dzurisin, Jason DK Carmichael, Rory D Lobo, Neil F Emrich, Scott J Hellmann, Jessica J BMC Genomics Research Article BioMed Central 2010-05-17 /pmc/articles/PMC2887415/ /pubmed/20478048 http://dx.doi.org/10.1186/1471-2164-11-310 Text en Copyright ©2010 O'Neil et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2887415/ https://www.ncbi.nlm.nih.gov/pubmed/20478048 http://dx.doi.org/10.1186/1471-2164-11-310 |
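
The description field defines an "ortholog hit ratio": the number of bases in the hit region of a unigene compared to the length of its best-match silkworm protein. A minimal Python sketch of that calculation follows. It is an illustration, not the authors' pipeline: the `BestHit` record, its field names, and the example values are hypothetical, and the factor of 3 (converting protein length in amino acids to coding bases) is an assumption about units that the abstract does not state.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BestHit:
    """Hypothetical record for one unigene's best-match protein hit."""
    unigene_id: str
    hit_bases: int          # bases of the unigene inside the aligned hit region
    protein_length_aa: int  # length of the best-match protein, in amino acids


def ortholog_hit_ratio(hit: BestHit) -> float:
    """Hit-region bases divided by the protein length expressed in bases.

    Assumption: protein length is given in amino acids, so it is
    multiplied by 3 to put both quantities in nucleotide units.
    """
    if hit.protein_length_aa <= 0:
        raise ValueError("protein length must be positive")
    return hit.hit_bases / (3 * hit.protein_length_aa)


def count_above(hits: Iterable[BestHit], threshold: float) -> int:
    """Count unigenes whose ratio strictly exceeds the threshold."""
    return sum(1 for h in hits if ortholog_hit_ratio(h) > threshold)


# Made-up example values, for illustration only.
hits = [
    BestHit("unigene_1", 900, 300),  # ratio 1.0
    BestHit("unigene_2", 360, 300),  # ratio 0.4
    BestHit("unigene_3", 630, 250),  # ratio 0.84
    BestHit("unigene_4", 500, 300),  # ratio about 0.56
]

print(count_above(hits, 0.5))
print(count_above(hits, 0.8))
```

Counting unigenes above 0.5 and 0.8 mirrors the two thresholds reported in the abstract; a ratio near 1.0 suggests that most of the corresponding protein-coding region is represented in the assembled unigene.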