
MetaVelvet-SL: an extension of the Velvet assembler to a de novo metagenomic assembler utilizing supervised learning

The assembly of multiple genomes from mixed sequence reads is a bottleneck in metagenomic analysis. A single-genome assembly program (assembler) is not capable of resolving metagenome sequences, so assemblers designed specifically for metagenomics have been developed. MetaVelvet is an extension of t...


Bibliographic Details
Main authors: Afiahayati, Sato, Kengo, Sakakibara, Yasubumi
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4379979/
https://www.ncbi.nlm.nih.gov/pubmed/25431440
http://dx.doi.org/10.1093/dnares/dsu041
author Afiahayati,
Sato, Kengo
Sakakibara, Yasubumi
author_facet Afiahayati,
Sato, Kengo
Sakakibara, Yasubumi
author_sort Afiahayati,
collection PubMed
description The assembly of multiple genomes from mixed sequence reads is a bottleneck in metagenomic analysis. A single-genome assembly program (assembler) is not capable of resolving metagenome sequences, so assemblers designed specifically for metagenomics have been developed. MetaVelvet is an extension of the single-genome assembler Velvet. It has been proved to generate assemblies with higher N50 scores and higher quality than single-genome assemblers such as Velvet and SOAPdenovo when applied to metagenomic sequence reads and is frequently used in this research community. One important open problem for MetaVelvet is its low accuracy and sensitivity in detecting chimeric nodes in the assembly (de Bruijn) graph, which prevents the generation of longer contigs and scaffolds. We have tackled this problem of classifying chimeric nodes using supervised machine learning to significantly improve the performance of MetaVelvet and developed a new tool, called MetaVelvet-SL. A Support Vector Machine is used for learning the classification model based on 94 features extracted from candidate nodes. In extensive experiments, MetaVelvet-SL outperformed the original MetaVelvet and other state-of-the-art metagenomic assemblers, IDBA-UD, Ray Meta and Omega, to reconstruct accurate longer assemblies with higher N50 scores for both simulated data sets and real data sets of human gut microbial sequences.
format Online
Article
Text
id pubmed-4379979
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-43799792015-04-15 MetaVelvet-SL: an extension of the Velvet assembler to a de novo metagenomic assembler utilizing supervised learning Afiahayati, Sato, Kengo Sakakibara, Yasubumi DNA Res Full Papers The assembly of multiple genomes from mixed sequence reads is a bottleneck in metagenomic analysis. A single-genome assembly program (assembler) is not capable of resolving metagenome sequences, so assemblers designed specifically for metagenomics have been developed. MetaVelvet is an extension of the single-genome assembler Velvet. It has been proved to generate assemblies with higher N50 scores and higher quality than single-genome assemblers such as Velvet and SOAPdenovo when applied to metagenomic sequence reads and is frequently used in this research community. One important open problem for MetaVelvet is its low accuracy and sensitivity in detecting chimeric nodes in the assembly (de Bruijn) graph, which prevents the generation of longer contigs and scaffolds. We have tackled this problem of classifying chimeric nodes using supervised machine learning to significantly improve the performance of MetaVelvet and developed a new tool, called MetaVelvet-SL. A Support Vector Machine is used for learning the classification model based on 94 features extracted from candidate nodes. In extensive experiments, MetaVelvet-SL outperformed the original MetaVelvet and other state-of-the-art metagenomic assemblers, IDBA-UD, Ray Meta and Omega, to reconstruct accurate longer assemblies with higher N50 scores for both simulated data sets and real data sets of human gut microbial sequences. Oxford University Press 2015-02 2014-11-27 /pmc/articles/PMC4379979/ /pubmed/25431440 http://dx.doi.org/10.1093/dnares/dsu041 Text en © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Full Papers
Afiahayati,
Sato, Kengo
Sakakibara, Yasubumi
MetaVelvet-SL: an extension of the Velvet assembler to a de novo metagenomic assembler utilizing supervised learning
title MetaVelvet-SL: an extension of the Velvet assembler to a de novo metagenomic assembler utilizing supervised learning
topic Full Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4379979/
https://www.ncbi.nlm.nih.gov/pubmed/25431440
http://dx.doi.org/10.1093/dnares/dsu041