SMRT sequencing only de novo assembly of the sugar beet (Beta vulgaris) chloroplast genome

Bibliographic Details
Main Authors: Stadermann, Kai Bernd, Weisshaar, Bernd, Holtgräwe, Daniela
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4573686/
https://www.ncbi.nlm.nih.gov/pubmed/26377912
http://dx.doi.org/10.1186/s12859-015-0726-6
author Stadermann, Kai Bernd
Weisshaar, Bernd
Holtgräwe, Daniela
collection PubMed
description BACKGROUND: Third-generation sequencing methods such as SMRT (Single Molecule, Real-Time) sequencing, developed by Pacific Biosciences, offer much longer read lengths than Next Generation Sequencing (NGS) methods. Hence, they are well suited for de novo sequencing and re-sequencing projects. Sequences generated for these purposes contain not only reads originating from the nuclear genome but also a significant number of reads originating from the organelles of the target organism. These reads are usually discarded, but they can also be used to assemble organellar replicons. The long read length helps resolve repetitive regions within organellar genomes, which can be problematic when only short-read data are used. Additionally, SMRT sequencing is less affected by GC-rich regions and by long stretches of the same base.
RESULTS: We describe a workflow for a de novo assembly of the sugar beet (Beta vulgaris ssp. vulgaris) chloroplast genome sequence based only on data from a SMRT sequencing dataset that targeted the nuclear genome. We show that the data obtained from such an experiment are sufficient to create a high-quality assembly that is more reliable than assemblies derived from, e.g., Illumina reads only. The chloroplast genome is especially challenging for de novo assembly because it contains two large inverted repeat (IR) regions. We also describe some limitations that still apply even though long reads are used for the assembly.
CONCLUSIONS: SMRT sequencing reads extracted from a dataset created for nuclear genome (re)sequencing can be used to obtain a high-quality de novo assembly of the chloroplast genome of the sequenced organism. Even with relatively low overall coverage of the nuclear genome, it is possible to collect more than enough reads to generate a high-quality assembly that outperforms short-read-based assemblies. However, even with long reads it is not always possible to reliably determine the order of elements in a chloroplast genome sequence, as we demonstrated with Fosmid End Sequences (FES) generated with Sanger technology. This limitation also applies to short-read sequencing data, where it is reached at a much earlier stage of finishing.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0726-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4573686
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4573686 2015-09-19 BMC Bioinformatics Methodology Article BioMed Central 2015-09-16 /pmc/articles/PMC4573686/ /pubmed/26377912 http://dx.doi.org/10.1186/s12859-015-0726-6 Text en © Stadermann et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SMRT sequencing only de novo assembly of the sugar beet (Beta vulgaris) chloroplast genome
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4573686/
https://www.ncbi.nlm.nih.gov/pubmed/26377912
http://dx.doi.org/10.1186/s12859-015-0726-6