FGAP: an automated gap closing tool

BACKGROUND: The fast reduction of prices of DNA sequencing allowed rapid accumulation of genome data. However, the process of obtaining complete genome sequences is still very time consuming and labor demanding. In addition, data produced from various sequencing technologies or alternative assemblie...

Bibliographic Details
Main authors: Piro, Vitor C; Faoro, Helisson; Weiss, Vinicius A; Steffens, Maria BR; Pedrosa, Fabio O; Souza, Emanuel M; Raittz, Roberto T
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4091766/
https://www.ncbi.nlm.nih.gov/pubmed/24938749
http://dx.doi.org/10.1186/1756-0500-7-371
author Piro, Vitor C
Faoro, Helisson
Weiss, Vinicius A
Steffens, Maria BR
Pedrosa, Fabio O
Souza, Emanuel M
Raittz, Roberto T
collection PubMed
description BACKGROUND: The fast reduction of prices of DNA sequencing allowed rapid accumulation of genome data. However, the process of obtaining complete genome sequences is still very time consuming and labor demanding. In addition, data produced from various sequencing technologies or alternative assemblies remain underexplored to improve assembly of incomplete genome sequences. FINDINGS: We have developed FGAP, a tool for closing gaps of draft genome sequences that takes advantage of different datasets. FGAP uses BLAST to align multiple contigs against a draft genome assembly aiming to find sequences that overlap gaps. The algorithm selects the best sequence to fill and eliminate the gap. CONCLUSIONS: FGAP reduced the number of gaps by 78% in an E. coli draft genome assembly using two different sequencing technologies, Illumina and 454. Using PacBio long reads, 98% of gaps were solved. In human chromosome 14 assemblies, FGAP reduced the number of gaps by 35%. All the inserted sequences were validated with a reference genome using QUAST. The source code and a web tool are available at http://www.bioinfo.ufpr.br/fgap/.
format Online
Article
Text
id pubmed-4091766
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-40917662014-07-11 FGAP: an automated gap closing tool Piro, Vitor C Faoro, Helisson Weiss, Vinicius A Steffens, Maria BR Pedrosa, Fabio O Souza, Emanuel M Raittz, Roberto T BMC Res Notes Technical Note BACKGROUND: The fast reduction of prices of DNA sequencing allowed rapid accumulation of genome data. However, the process of obtaining complete genome sequences is still very time consuming and labor demanding. In addition, data produced from various sequencing technologies or alternative assemblies remain underexplored to improve assembly of incomplete genome sequences. FINDINGS: We have developed FGAP, a tool for closing gaps of draft genome sequences that takes advantage of different datasets. FGAP uses BLAST to align multiple contigs against a draft genome assembly aiming to find sequences that overlap gaps. The algorithm selects the best sequence to fill and eliminate the gap. CONCLUSIONS: FGAP reduced the number of gaps by 78% in an E. coli draft genome assembly using two different sequencing technologies, Illumina and 454. Using PacBio long reads, 98% of gaps were solved. In human chromosome 14 assemblies, FGAP reduced the number of gaps by 35%. All the inserted sequences were validated with a reference genome using QUAST. The source code and a web tool are available at http://www.bioinfo.ufpr.br/fgap/. BioMed Central 2014-06-18 /pmc/articles/PMC4091766/ /pubmed/24938749 http://dx.doi.org/10.1186/1756-0500-7-371 Text en Copyright © 2014 Piro et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. 
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title FGAP: an automated gap closing tool
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4091766/
https://www.ncbi.nlm.nih.gov/pubmed/24938749
http://dx.doi.org/10.1186/1756-0500-7-371