MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data

BACKGROUND: Scaffolding is an essential step in the genome assembly process. Current methods based on large fragment paired-end reads or long reads allow an increase in contiguity but often lack consistency in repetitive regions, resulting in fragmented assemblies. Here, we describe a novel tool to...

Bibliographic Details
Main authors: Madoui, Mohammed-Amin, Dossat, Carole, d’Agata, Léo, van Oeveren, Jan, van der Vossen, Edwin, Aury, Jean-Marc
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4776351/
https://www.ncbi.nlm.nih.gov/pubmed/26936254
http://dx.doi.org/10.1186/s12859-016-0969-x
_version_ 1782419136638353408
author Madoui, Mohammed-Amin
Dossat, Carole
d’Agata, Léo
van Oeveren, Jan
van der Vossen, Edwin
Aury, Jean-Marc
author_facet Madoui, Mohammed-Amin
Dossat, Carole
d’Agata, Léo
van Oeveren, Jan
van der Vossen, Edwin
Aury, Jean-Marc
author_sort Madoui, Mohammed-Amin
collection PubMed
description BACKGROUND: Scaffolding is an essential step in the genome assembly process. Current methods based on large fragment paired-end reads or long reads allow an increase in contiguity but often lack consistency in repetitive regions, resulting in fragmented assemblies. Here, we describe a novel tool to link assemblies to a genome map to aid complex genome reconstruction by detecting assembly errors and allowing scaffold ordering and anchoring. RESULTS: We present MaGuS (map-guided scaffolding), a modular tool that uses a draft genome assembly, a Whole Genome Profiling™ (WGP) map, and high-throughput paired-end sequencing data to estimate the quality and to enhance the contiguity of an assembly. We generated several assemblies of the Arabidopsis genome using different scaffolding programs and applied MaGuS to select the best assembly using quality metrics. Then, we used MaGuS to perform map-guided scaffolding to increase contiguity by creating new scaffold links in low-covered and highly repetitive regions where other commonly used scaffolding methods lack consistency. CONCLUSIONS: MaGuS is a powerful reference-free evaluator of assembly quality and a WGP map-guided scaffolder that is freely available at https://github.com/institut-de-genomique/MaGuS. Its use can be extended to other high-throughput sequencing data (e.g., long-read data) and also to other map data (e.g., genetic maps) to improve the quality and the contiguity of large and complex genome assemblies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0969-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4776351
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-47763512016-03-04 MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data Madoui, Mohammed-Amin Dossat, Carole d’Agata, Léo van Oeveren, Jan van der Vossen, Edwin Aury, Jean-Marc BMC Bioinformatics Methodology Article BACKGROUND: Scaffolding is an essential step in the genome assembly process. Current methods based on large fragment paired-end reads or long reads allow an increase in contiguity but often lack consistency in repetitive regions, resulting in fragmented assemblies. Here, we describe a novel tool to link assemblies to a genome map to aid complex genome reconstruction by detecting assembly errors and allowing scaffold ordering and anchoring. RESULTS: We present MaGuS (map-guided scaffolding), a modular tool that uses a draft genome assembly, a Whole Genome Profiling™ (WGP) map, and high-throughput paired-end sequencing data to estimate the quality and to enhance the contiguity of an assembly. We generated several assemblies of the Arabidopsis genome using different scaffolding programs and applied MaGuS to select the best assembly using quality metrics. Then, we used MaGuS to perform map-guided scaffolding to increase contiguity by creating new scaffold links in low-covered and highly repetitive regions where other commonly used scaffolding methods lack consistency. CONCLUSIONS: MaGuS is a powerful reference-free evaluator of assembly quality and a WGP map-guided scaffolder that is freely available at https://github.com/institut-de-genomique/MaGuS. Its use can be extended to other high-throughput sequencing data (e.g., long-read data) and also to other map data (e.g., genetic maps) to improve the quality and the contiguity of large and complex genome assemblies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0969-x) contains supplementary material, which is available to authorized users. 
BioMed Central 2016-03-03 /pmc/articles/PMC4776351/ /pubmed/26936254 http://dx.doi.org/10.1186/s12859-016-0969-x Text en © Madoui et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Madoui, Mohammed-Amin
Dossat, Carole
d’Agata, Léo
van Oeveren, Jan
van der Vossen, Edwin
Aury, Jean-Marc
MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data
title MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data
title_full MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data
title_fullStr MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data
title_full_unstemmed MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data
title_short MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data
title_sort magus: a tool for quality assessment and scaffolding of genome assemblies with whole genome profiling™ data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4776351/
https://www.ncbi.nlm.nih.gov/pubmed/26936254
http://dx.doi.org/10.1186/s12859-016-0969-x
work_keys_str_mv AT madouimohammedamin magusatoolforqualityassessmentandscaffoldingofgenomeassemblieswithwholegenomeprofilingdata
AT dossatcarole magusatoolforqualityassessmentandscaffoldingofgenomeassemblieswithwholegenomeprofilingdata
AT dagataleo magusatoolforqualityassessmentandscaffoldingofgenomeassemblieswithwholegenomeprofilingdata
AT vanoeverenjan magusatoolforqualityassessmentandscaffoldingofgenomeassemblieswithwholegenomeprofilingdata
AT vandervossenedwin magusatoolforqualityassessmentandscaffoldingofgenomeassemblieswithwholegenomeprofilingdata
AT auryjeanmarc magusatoolforqualityassessmentandscaffoldingofgenomeassemblieswithwholegenomeprofilingdata