Short read Illumina data for the de novo assembly of a non-model snail species transcriptome (Radix balthica, Basommatophora, Pulmonata), and a comparison of assembler performance
BACKGROUND: Until recently, read lengths on the Solexa/Illumina system were too short to reliably assemble transcriptomes without a reference sequence, especially for non-model organisms. However, with read lengths up to 100 nucleotides available in the current version, an assembly without reference...
Main authors: | Feldmeyer, Barbara; Wheat, Christopher W; Krezdorn, Nicolas; Rotter, Björn; Pfenninger, Markus |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3128070/ https://www.ncbi.nlm.nih.gov/pubmed/21679424 http://dx.doi.org/10.1186/1471-2164-12-317 |
_version_ | 1782207416787533824 |
---|---|
author | Feldmeyer, Barbara Wheat, Christopher W Krezdorn, Nicolas Rotter, Björn Pfenninger, Markus |
author_sort | Feldmeyer, Barbara |
collection | PubMed |
description | BACKGROUND: Until recently, read lengths on the Solexa/Illumina system were too short to reliably assemble transcriptomes without a reference sequence, especially for non-model organisms. However, with read lengths up to 100 nucleotides available in the current version, an assembly without reference genome should be possible. For this study we created an EST data set for the common pond snail Radix balthica by Illumina sequencing of a normalized transcriptome. Performance of three different short read assemblers was compared with respect to: the number of contigs, their length, depth of coverage, their quality in various BLAST searches and the alignment to mitochondrial genes. RESULTS: A single sequencing run of a normalized RNA pool resulted in 16,923,850 paired end reads with median read length of 61 bases. The assemblies generated by VELVET, OASES, and SeqMan NGEN differed in the total number of contigs, contig length, the number and quality of gene hits obtained by BLAST searches against various databases, and contig performance in the mt genome comparison. While VELVET produced the highest overall number of contigs, a large fraction of these were of small size (< 200bp), and gave redundant hits in BLAST searches and the mt genome alignment. The best overall contig performance resulted from the NGEN assembly. It produced the second largest number of contigs, which on average were comparable to the OASES contigs but gave the highest number of gene hits in two out of four BLAST searches against different reference databases. A subsequent meta-assembly of the four contig sets resulted in larger contigs, less redundancy and a higher number of BLAST hits. CONCLUSION: Our results document the first de novo transcriptome assembly of a non-model species using Illumina sequencing data. We show that de novo transcriptome assembly using this approach yields results useful for downstream applications, in particular if a meta-assembly of contig sets is used to increase contig quality. These results highlight the ongoing need for improvements in assembly methodology. |
format | Online Article Text |
id | pubmed-3128070 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3128070 2011-07-01 BMC Genomics Research Article BioMed Central 2011-06-16 /pmc/articles/PMC3128070/ /pubmed/21679424 http://dx.doi.org/10.1186/1471-2164-12-317 Text en Copyright ©2011 Feldmeyer et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Short read Illumina data for the de novo assembly of a non-model snail species transcriptome (Radix balthica, Basommatophora, Pulmonata), and a comparison of assembler performance |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3128070/ https://www.ncbi.nlm.nih.gov/pubmed/21679424 http://dx.doi.org/10.1186/1471-2164-12-317 |