De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping

BACKGROUND: It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome)....

Bibliographic Details
Main Authors: Olsen, Remi-Andre, Bunikis, Ignas, Tiukova, Ievgeniia, Holmberg, Kicki, Lötstedt, Britta, Pettersson, Olga Vinnere, Passoth, Volkmar, Käller, Max, Vezzi, Francesco
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4661999/
https://www.ncbi.nlm.nih.gov/pubmed/26617983
http://dx.doi.org/10.1186/s13742-015-0094-1
_version_ 1782403094038970368
author Olsen, Remi-Andre
Bunikis, Ignas
Tiukova, Ievgeniia
Holmberg, Kicki
Lötstedt, Britta
Pettersson, Olga Vinnere
Passoth, Volkmar
Käller, Max
Vezzi, Francesco
collection PubMed
description BACKGROUND: It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome). Obtaining high quality draft assemblies is extremely important in the case of yeast genomes to better characterise major events in their evolutionary history. The aim of this work is two-fold: on the one hand we want to show how combining different and somewhat complementary technologies is key to improving assembly quality and correctness, and on the other hand we present a de novo assembly pipeline we believe to be beneficial to core facility bioinformaticians. To demonstrate both the effectiveness of combining technologies and the simplicity of the pipeline, here we present the results obtained using the Dekkera bruxellensis genome. METHODS: In this work we used short-read Illumina data and long-read PacBio data combined with the extreme long-range information from OpGen optical maps in the task of de novo genome assembly and finishing. Moreover, we developed NouGAT, a semi-automated pipeline for read-preprocessing, de novo assembly and assembly evaluation, which was instrumental for this work. RESULTS: We obtained a high quality draft assembly of a yeast genome, resolved on a chromosomal level. Furthermore, this assembly was corrected for mis-assembly errors as demonstrated by resolving a large collapsed repeat and by receiving higher scores by assembly evaluation tools. With the inclusion of PacBio data we were able to fill about 5 % of the optical mapped genome not covered by the Illumina data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13742-015-0094-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4661999
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-46619992015-11-28 De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping Olsen, Remi-Andre Bunikis, Ignas Tiukova, Ievgeniia Holmberg, Kicki Lötstedt, Britta Pettersson, Olga Vinnere Passoth, Volkmar Käller, Max Vezzi, Francesco Gigascience Technical Note BACKGROUND: It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome). Obtaining high quality draft assemblies is extremely important in the case of yeast genomes to better characterise major events in their evolutionary history. The aim of this work is two-fold: on the one hand we want to show how combining different and somewhat complementary technologies is key to improving assembly quality and correctness, and on the other hand we present a de novo assembly pipeline we believe to be beneficial to core facility bioinformaticians. To demonstrate both the effectiveness of combining technologies and the simplicity of the pipeline, here we present the results obtained using the Dekkera bruxellensis genome. METHODS: In this work we used short-read Illumina data and long-read PacBio data combined with the extreme long-range information from OpGen optical maps in the task of de novo genome assembly and finishing. Moreover, we developed NouGAT, a semi-automated pipeline for read-preprocessing, de novo assembly and assembly evaluation, which was instrumental for this work. RESULTS: We obtained a high quality draft assembly of a yeast genome, resolved on a chromosomal level. Furthermore, this assembly was corrected for mis-assembly errors as demonstrated by resolving a large collapsed repeat and by receiving higher scores by assembly evaluation tools. 
With the inclusion of PacBio data we were able to fill about 5 % of the optical mapped genome not covered by the Illumina data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13742-015-0094-1) contains supplementary material, which is available to authorized users. BioMed Central 2015-11-26 /pmc/articles/PMC4661999/ /pubmed/26617983 http://dx.doi.org/10.1186/s13742-015-0094-1 Text en © Olsen et al. 2015 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4661999/
https://www.ncbi.nlm.nih.gov/pubmed/26617983
http://dx.doi.org/10.1186/s13742-015-0094-1