An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome

BACKGROUND: Second generation sequencing has permitted detailed sequence characterisation at the whole genome level of a growing number of non-model organisms, but the data produced have short read-lengths and biased genome coverage leading to fragmented genome assemblies. The PacBio RS long-read se...

Bibliographic Details
Main Authors: Ferrarini, Marco, Moretto, Marco, Ward, Judson A, Šurbanovski, Nada, Stevanović, Vladimir, Giongo, Lara, Viola, Roberto, Cavalieri, Duccio, Velasco, Riccardo, Cestaro, Alessandro, Sargent, Daniel J
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3853357/
https://www.ncbi.nlm.nih.gov/pubmed/24083400
http://dx.doi.org/10.1186/1471-2164-14-670
_version_ 1782478816118046720
author Ferrarini, Marco
Moretto, Marco
Ward, Judson A
Šurbanovski, Nada
Stevanović, Vladimir
Giongo, Lara
Viola, Roberto
Cavalieri, Duccio
Velasco, Riccardo
Cestaro, Alessandro
Sargent, Daniel J
author_sort Ferrarini, Marco
collection PubMed
description BACKGROUND: Second generation sequencing has permitted detailed sequence characterisation at the whole genome level of a growing number of non-model organisms, but the data produced have short read-lengths and biased genome coverage, leading to fragmented genome assemblies. The PacBio RS long-read sequencing platform offers the promise of increased read length and unbiased genome coverage and thus the potential to produce genome sequence data of a finished quality containing fewer gaps and longer contigs. However, these advantages come at a much greater cost per nucleotide and with a perceived increase in error-rate. In this investigation, we evaluated the performance of the PacBio RS sequencing platform through the sequencing and de novo assembly of the Potentilla micrantha chloroplast genome. RESULTS: Following error-correction, a total of 28,638 PacBio RS reads were recovered with a mean read length of 1,902 bp, totalling 54,492,250 nucleotides and representing an average depth of coverage of 320× the chloroplast genome. The dataset covered the entire 154,959 bp of the chloroplast genome in a single contig (100% coverage) compared to seven contigs (90.59% coverage) recovered from Illumina data, and revealed no bias in coverage of GC rich regions. Post-assembly, the data were largely concordant with the Illumina data generated and allowed 187 ambiguities in the Illumina data to be resolved. The additional read length also permitted small differences in the two inverted repeat regions to be assigned unambiguously. CONCLUSIONS: To our knowledge, this is the first report of a chloroplast genome assembled de novo using PacBio sequence data. The PacBio RS data generated here were assembled into a single large contig spanning the P. micrantha chloroplast genome, with a higher degree of accuracy than an Illumina dataset generated at a much greater depth of coverage, due to longer read lengths and lower GC bias in the data. The results we present suggest PacBio data will be of immense utility for the development of genome sequence assemblies containing fewer unresolved gaps and ambiguities and a significantly smaller number of contigs than could be produced using short-read sequence data alone.
format Online
Article
Text
id pubmed-3853357
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3853357 2013-12-07 An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome Ferrarini, Marco Moretto, Marco Ward, Judson A Šurbanovski, Nada Stevanović, Vladimir Giongo, Lara Viola, Roberto Cavalieri, Duccio Velasco, Riccardo Cestaro, Alessandro Sargent, Daniel J BMC Genomics Research Article BioMed Central 2013-10-01 /pmc/articles/PMC3853357/ /pubmed/24083400 http://dx.doi.org/10.1186/1471-2164-14-670 Text en Copyright © 2013 Ferrarini et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3853357/
https://www.ncbi.nlm.nih.gov/pubmed/24083400
http://dx.doi.org/10.1186/1471-2164-14-670