
Comparison of next generation sequencing technologies for transcriptome characterization

BACKGROUND: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughp...


Bibliographic Details
Main authors: Wall, P Kerr, Leebens-Mack, Jim, Chanderbali, André S, Barakat, Abdelali, Wolcott, Erik, Liang, Haiying, Landherr, Lena, Tomsho, Lynn P, Hu, Yi, Carlson, John E, Ma, Hong, Schuster, Stephan C, Soltis, Douglas E, Soltis, Pamela S, Altman, Naomi, dePamphilis, Claude W
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2907694/
https://www.ncbi.nlm.nih.gov/pubmed/19646272
http://dx.doi.org/10.1186/1471-2164-10-347
_version_ 1782184130467856384
author Wall, P Kerr
Leebens-Mack, Jim
Chanderbali, André S
Barakat, Abdelali
Wolcott, Erik
Liang, Haiying
Landherr, Lena
Tomsho, Lynn P
Hu, Yi
Carlson, John E
Ma, Hong
Schuster, Stephan C
Soltis, Douglas E
Soltis, Pamela S
Altman, Naomi
dePamphilis, Claude W
author_facet Wall, P Kerr
Leebens-Mack, Jim
Chanderbali, André S
Barakat, Abdelali
Wolcott, Erik
Liang, Haiying
Landherr, Lena
Tomsho, Lynn P
Hu, Yi
Carlson, John E
Ma, Hong
Schuster, Stephan C
Soltis, Douglas E
Soltis, Pamela S
Altman, Naomi
dePamphilis, Claude W
author_sort Wall, P Kerr
collection PubMed
description BACKGROUND: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. RESULTS: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc http://fgp.huck.psu.edu/NG_Sims/ngsim.pl, an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. CONCLUSION: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. 
In terms of sequence coverage alone, the NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms.
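The read-mapping breakdown quoted in the description can be checked with a little arithmetic. The following is a minimal sketch, not the authors' analysis code: the counts are copied from the abstract, while the names `TOTAL_READS`, `CATEGORY_COUNTS`, `percent_of_total`, and `category_percentages` are hypothetical and introduced only for illustration.

```python
# Read counts as quoted in the abstract (Arabidopsis 454 reads).
# All identifiers below are illustrative assumptions, not from the paper.
TOTAL_READS = 134_791
CATEGORY_COUNTS = {
    "mapped exactly to known exons": 119_518,
    "mapped to introns": 1_117,
    "spanned annotated intron/exon boundaries": 11_524,
    "extended beyond annotated UTRs": 3_066,
}


def percent_of_total(count, total=TOTAL_READS):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def category_percentages(counts=CATEGORY_COUNTS, total=TOTAL_READS):
    """Map each category name to its share of all reads, in percent."""
    return {name: percent_of_total(n, total) for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in category_percentages().items():
        print(f"{name}: {pct}%")
    # The category counts are summed separately so they can be compared
    # against the stated total number of reads.
    print("sum of category counts:", sum(CATEGORY_COUNTS.values()))
    print("stated total reads:", TOTAL_READS)
```

The four category counts add up to 135,225, slightly more than the stated total of 134,791 reads. One possible reading is that the categories are not mutually exclusive, but the abstract does not say so.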
format Text
id pubmed-2907694
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-29076942010-07-22 Comparison of next generation sequencing technologies for transcriptome characterization Wall, P Kerr Leebens-Mack, Jim Chanderbali, André S Barakat, Abdelali Wolcott, Erik Liang, Haiying Landherr, Lena Tomsho, Lynn P Hu, Yi Carlson, John E Ma, Hong Schuster, Stephan C Soltis, Douglas E Soltis, Pamela S Altman, Naomi dePamphilis, Claude W BMC Genomics Methodology Article BACKGROUND: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. RESULTS: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. 
We have also developed ESTcalc http://fgp.huck.psu.edu/NG_Sims/ngsim.pl, an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. CONCLUSION: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, the NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms. BioMed Central 2009-08-01 /pmc/articles/PMC2907694/ /pubmed/19646272 http://dx.doi.org/10.1186/1471-2164-10-347 Text en Copyright ©2009 Wall et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Wall, P Kerr
Leebens-Mack, Jim
Chanderbali, André S
Barakat, Abdelali
Wolcott, Erik
Liang, Haiying
Landherr, Lena
Tomsho, Lynn P
Hu, Yi
Carlson, John E
Ma, Hong
Schuster, Stephan C
Soltis, Douglas E
Soltis, Pamela S
Altman, Naomi
dePamphilis, Claude W
Comparison of next generation sequencing technologies for transcriptome characterization
title Comparison of next generation sequencing technologies for transcriptome characterization
title_full Comparison of next generation sequencing technologies for transcriptome characterization
title_fullStr Comparison of next generation sequencing technologies for transcriptome characterization
title_full_unstemmed Comparison of next generation sequencing technologies for transcriptome characterization
title_short Comparison of next generation sequencing technologies for transcriptome characterization
title_sort comparison of next generation sequencing technologies for transcriptome characterization
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2907694/
https://www.ncbi.nlm.nih.gov/pubmed/19646272
http://dx.doi.org/10.1186/1471-2164-10-347
work_keys_str_mv AT wallpkerr comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT leebensmackjim comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT chanderbaliandres comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT barakatabdelali comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT wolcotterik comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT lianghaiying comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT landherrlena comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT tomsholynnp comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT huyi comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT carlsonjohne comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT mahong comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT schusterstephanc comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT soltisdouglase comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT soltispamelas comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT altmannaomi comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization
AT depamphilisclaudew comparisonofnextgenerationsequencingtechnologiesfortranscriptomecharacterization