gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output
Unknown sequences, or gaps, are present in many published genomes across public databases. Gap filling is an important finishing step in de novo genome assembly, especially in large genomes. The gap filling problem is nontrivial and while there are many computational tools partially solving the prob...
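Main Authors: | Kammonen, Juhana I.; Smolander, Olli-Pekka; Paulin, Lars; Pereira, Pedro A. B.; Laine, Pia; Koskinen, Patrik; Jernvall, Jukka; Auvinen, Petri |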
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2019 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6733440/ https://www.ncbi.nlm.nih.gov/pubmed/31498807 http://dx.doi.org/10.1371/journal.pone.0216885 |
_version_ | 1783449983445368832 |
---|---|
author | Kammonen, Juhana I. Smolander, Olli-Pekka Paulin, Lars Pereira, Pedro A. B. Laine, Pia Koskinen, Patrik Jernvall, Jukka Auvinen, Petri |
author_facet | Kammonen, Juhana I. Smolander, Olli-Pekka Paulin, Lars Pereira, Pedro A. B. Laine, Pia Koskinen, Patrik Jernvall, Jukka Auvinen, Petri |
author_sort | Kammonen, Juhana I. |
collection | PubMed |
description | Unknown sequences, or gaps, are present in many published genomes across public databases. Gap filling is an important finishing step in de novo genome assembly, especially in large genomes. The gap filling problem is nontrivial and while there are many computational tools partially solving the problem, several have shortcomings as to the reliability and correctness of the output, i.e. the gap filled draft genome. SSPACE-LongRead is a scaffolding tool that utilizes long reads from multiple third-generation sequencing platforms in finding links between contigs and combining them. The long reads potentially contain sequence information to fill the gaps created in the scaffolding, but SSPACE-LongRead currently lacks this functionality. We present an automated pipeline called gapFinisher to process SSPACE-LongRead output to fill gaps after the scaffolding. gapFinisher is based on the controlled use of a previously published gap filling tool FGAP and works on all standard Linux/UNIX command lines. We compare the performance of gapFinisher against two other published gap filling tools PBJelly and GMcloser. We conclude that gapFinisher can fill gaps in draft genomes quickly and reliably. In addition, the serial design of gapFinisher makes it scale well from prokaryote genomes to larger genomes with no increase in the computational footprint. |
format | Online Article Text |
id | pubmed-6733440 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6733440 2019-09-20 gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output Kammonen, Juhana I. Smolander, Olli-Pekka Paulin, Lars Pereira, Pedro A. B. Laine, Pia Koskinen, Patrik Jernvall, Jukka Auvinen, Petri PLoS One Research Article Unknown sequences, or gaps, are present in many published genomes across public databases. Gap filling is an important finishing step in de novo genome assembly, especially in large genomes. The gap filling problem is nontrivial and while there are many computational tools partially solving the problem, several have shortcomings as to the reliability and correctness of the output, i.e. the gap filled draft genome. SSPACE-LongRead is a scaffolding tool that utilizes long reads from multiple third-generation sequencing platforms in finding links between contigs and combining them. The long reads potentially contain sequence information to fill the gaps created in the scaffolding, but SSPACE-LongRead currently lacks this functionality. We present an automated pipeline called gapFinisher to process SSPACE-LongRead output to fill gaps after the scaffolding. gapFinisher is based on the controlled use of a previously published gap filling tool FGAP and works on all standard Linux/UNIX command lines. We compare the performance of gapFinisher against two other published gap filling tools PBJelly and GMcloser. We conclude that gapFinisher can fill gaps in draft genomes quickly and reliably. In addition, the serial design of gapFinisher makes it scale well from prokaryote genomes to larger genomes with no increase in the computational footprint. 
Public Library of Science 2019-09-09 /pmc/articles/PMC6733440/ /pubmed/31498807 http://dx.doi.org/10.1371/journal.pone.0216885 Text en © 2019 Kammonen et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Kammonen, Juhana I. Smolander, Olli-Pekka Paulin, Lars Pereira, Pedro A. B. Laine, Pia Koskinen, Patrik Jernvall, Jukka Auvinen, Petri gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output |
title | gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output |
title_full | gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output |
title_fullStr | gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output |
title_full_unstemmed | gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output |
title_short | gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output |
title_sort | gapfinisher: a reliable gap filling pipeline for sspace-longread scaffolder output |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6733440/ https://www.ncbi.nlm.nih.gov/pubmed/31498807 http://dx.doi.org/10.1371/journal.pone.0216885 |
work_keys_str_mv | AT kammonenjuhanai gapfinisherareliablegapfillingpipelineforsspacelongreadscaffolderoutput AT smolanderollipekka gapfinisherareliablegapfillingpipelineforsspacelongreadscaffolderoutput AT paulinlars gapfinisherareliablegapfillingpipelineforsspacelongreadscaffolderoutput AT pereirapedroab gapfinisherareliablegapfillingpipelineforsspacelongreadscaffolderoutput AT lainepia gapfinisherareliablegapfillingpipelineforsspacelongreadscaffolderoutput AT koskinenpatrik gapfinisherareliablegapfillingpipelineforsspacelongreadscaffolderoutput AT jernvalljukka gapfinisherareliablegapfillingpipelineforsspacelongreadscaffolderoutput AT auvinenpetri gapfinisherareliablegapfillingpipelineforsspacelongreadscaffolderoutput |