GFinisher: a new strategy to refine and finish bacterial genome assemblies

Despite advances in DNA sequencing technology that have increased the number and length of reads, reconstructing complete genome sequences, the so-called genome assembly, remains complex. Only 13% of prokaryotic genome sequencing projects have been completed. Draft genome se...

Bibliographic Details
Main authors: Guizelini, Dieval; Raittz, Roberto T.; Cruz, Leonardo M.; Souza, Emanuel M.; Steffens, Maria B. R.; Pedrosa, Fabio O.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5056350/
https://www.ncbi.nlm.nih.gov/pubmed/27721396
http://dx.doi.org/10.1038/srep34963
_version_ 1782458879726059520
author Guizelini, Dieval
Raittz, Roberto T.
Cruz, Leonardo M.
Souza, Emanuel M.
Steffens, Maria B. R.
Pedrosa, Fabio O.
collection PubMed
description Despite advances in DNA sequencing technology that have increased the number and length of reads, reconstructing complete genome sequences, the so-called genome assembly, remains complex. Only 13% of prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented into contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly of bacterial genomes. Biological patterns observed in genomic sequences, together with a priori information, can allow misassembled regions to be identified and the overall de novo genome assembly to be reorganized and improved. GFinisher first generates a Fuzzy GC skew graph for each contig in an assembly, then breaks the contigs at critical points in order to reassemble and close them using jFGap. The method was successfully applied to datasets from 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.
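The description mentions GC skew graphs computed per contig. As a rough illustration only, the sketch below computes the conventional windowed GC skew, (G - C) / (G + C), and lists the windows where its sign flips. It is not GFinisher's Fuzzy GC skew and not its rule for choosing critical points; the function names `gc_skew` and `sign_changes`, the window and step defaults, and the sign-flip heuristic are all assumptions made for this example.

```python
from typing import List


def gc_skew(seq: str, window: int = 1000, step: int = 1000) -> List[float]:
    """Return (G - C) / (G + C) for each window along seq.

    Plain GC skew, used here as a stand-in; the fuzzy variant
    described in the record is not reproduced.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    skews = []
    # Always evaluate at least one window, even for short contigs.
    for start in range(0, max(len(seq) - window + 1, 1), step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows without G or C get a neutral skew of 0.0.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def sign_changes(skews: List[float]) -> List[int]:
    """Return indices of windows whose skew sign differs from the
    previous non-zero window (hypothetical 'candidate point' heuristic)."""
    changes = []
    prev = 0.0
    for i, s in enumerate(skews):
        if s == 0.0:
            continue  # neutral windows carry no sign information
        if prev != 0.0 and (s > 0) != (prev > 0):
            changes.append(i)
        prev = s
    return changes
```

Calling `sign_changes(gc_skew(contig))` on a contig string yields window indices where the skew switches strand bias. In a real bacterial chromosome such switches are expected near the replication origin and terminus, so additional flips inside a contig may hint at a misjoin, though that interpretation is a heuristic rather than something this record states.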
format Online
Article
Text
id pubmed-5056350
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5056350 2016-10-19 GFinisher: a new strategy to refine and finish bacterial genome assemblies Guizelini, Dieval Raittz, Roberto T. Cruz, Leonardo M. Souza, Emanuel M. Steffens, Maria B. R. Pedrosa, Fabio O. Sci Rep Article Nature Publishing Group 2016-10-10 /pmc/articles/PMC5056350/ /pubmed/27721396 http://dx.doi.org/10.1038/srep34963 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title GFinisher: a new strategy to refine and finish bacterial genome assemblies
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5056350/
https://www.ncbi.nlm.nih.gov/pubmed/27721396
http://dx.doi.org/10.1038/srep34963