Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A

MOTIVATION: Understanding the ever-increasing number of metagenomic sequences accumulating in our databases calls for approaches that rapidly ‘explore’ the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full...

Bibliographic Details
Main authors: David, Laurent; Vicedomini, Riccardo; Richard, Hugues; Carbone, Alessandra
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7332565/
https://www.ncbi.nlm.nih.gov/pubmed/32330240
http://dx.doi.org/10.1093/bioinformatics/btaa272
author David, Laurent
Vicedomini, Riccardo
Richard, Hugues
Carbone, Alessandra
collection PubMed
description MOTIVATION: Understanding the ever-increasing number of metagenomic sequences accumulating in our databases calls for approaches that rapidly ‘explore’ the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full assembly. RESULTS: S3A is a fast and accurate domain-targeted assembler designed for rapid functional profiling. It is based on a novel construction and fast traversal of the Overlap-Layout-Consensus graph, designed to reconstruct coding regions from domain-annotated metagenomic sequence reads. S3A relies on high-quality domain annotation to assemble metagenomic sequences efficiently and on a new confidence measure for fast evaluation of overlapping reads. Its implementation is highly generic and can be applied to any type of annotation. On simulated data, S3A achieves accuracy similar to that of classical metagenomic assembly tools while enabling faster, sensitive profiling of domains of interest. When studying a few dozen functional domains, a typical scenario, S3A is up to an order of magnitude faster than general-purpose metagenomic assemblers, thus enabling the analysis of a larger number of datasets in the same amount of time. S3A opens new avenues for the fast exploration of the rapidly increasing number of metagenomic datasets of ever-increasing size. AVAILABILITY AND IMPLEMENTATION: S3A is available at http://www.lcqb.upmc.fr/S3A_ASSEMBLER/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7332565
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7332565 2020-07-13 Bioinformatics Original Papers Oxford University Press 2020-07 2020-04-24 /pmc/articles/PMC7332565/ /pubmed/32330240 http://dx.doi.org/10.1093/bioinformatics/btaa272 Text en © The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7332565/
https://www.ncbi.nlm.nih.gov/pubmed/32330240
http://dx.doi.org/10.1093/bioinformatics/btaa272