Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A
MOTIVATION: The understanding of the ever-increasing number of metagenomic sequences accumulating in our databases demands for approaches that rapidly ‘explore’ the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full...
Main authors: | David, Laurent; Vicedomini, Riccardo; Richard, Hugues; Carbone, Alessandra |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7332565/ https://www.ncbi.nlm.nih.gov/pubmed/32330240 http://dx.doi.org/10.1093/bioinformatics/btaa272 |
_version_ | 1783553553216831488 |
---|---|
author | David, Laurent Vicedomini, Riccardo Richard, Hugues Carbone, Alessandra |
author_facet | David, Laurent Vicedomini, Riccardo Richard, Hugues Carbone, Alessandra |
author_sort | David, Laurent |
collection | PubMed |
description | MOTIVATION: The understanding of the ever-increasing number of metagenomic sequences accumulating in our databases demands for approaches that rapidly ‘explore’ the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full assembly. RESULTS: S3A is a fast and accurate domain-targeted assembler designed for a rapid functional profiling. It is based on a novel construction and a fast traversal of the Overlap-Layout-Consensus graph, designed to reconstruct coding regions from domain annotated metagenomic sequence reads. S3A relies on high-quality domain annotation to efficiently assemble metagenomic sequences and on the design of a new confidence measure for a fast evaluation of overlapping reads. Its implementation is highly generic and can be applied to any arbitrary type of annotation. On simulated data, S3A achieves a level of accuracy similar to that of classical metagenomics assembly tools while permitting to conduct a faster and sensitive profiling on domains of interest. When studying a few dozens of functional domains—a typical scenario—S3A is up to an order of magnitude faster than general purpose metagenomic assemblers, thus enabling the analysis of a larger number of datasets in the same amount of time. S3A opens new avenues to the fast exploration of the rapidly increasing number of metagenomic datasets displaying an ever-increasing size. AVAILABILITY AND IMPLEMENTATION: S3A is available at http://www.lcqb.upmc.fr/S3A_ASSEMBLER/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-7332565 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-73325652020-07-13 Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A David, Laurent Vicedomini, Riccardo Richard, Hugues Carbone, Alessandra Bioinformatics Original Papers MOTIVATION: The understanding of the ever-increasing number of metagenomic sequences accumulating in our databases demands for approaches that rapidly ‘explore’ the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full assembly. RESULTS: S3A is a fast and accurate domain-targeted assembler designed for a rapid functional profiling. It is based on a novel construction and a fast traversal of the Overlap-Layout-Consensus graph, designed to reconstruct coding regions from domain annotated metagenomic sequence reads. S3A relies on high-quality domain annotation to efficiently assemble metagenomic sequences and on the design of a new confidence measure for a fast evaluation of overlapping reads. Its implementation is highly generic and can be applied to any arbitrary type of annotation. On simulated data, S3A achieves a level of accuracy similar to that of classical metagenomics assembly tools while permitting to conduct a faster and sensitive profiling on domains of interest. When studying a few dozens of functional domains—a typical scenario—S3A is up to an order of magnitude faster than general purpose metagenomic assemblers, thus enabling the analysis of a larger number of datasets in the same amount of time. S3A opens new avenues to the fast exploration of the rapidly increasing number of metagenomic datasets displaying an ever-increasing size. AVAILABILITY AND IMPLEMENTATION: S3A is available at http://www.lcqb.upmc.fr/S3A_ASSEMBLER/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. 
Oxford University Press 2020-07 2020-04-24 /pmc/articles/PMC7332565/ /pubmed/32330240 http://dx.doi.org/10.1093/bioinformatics/btaa272 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Original Papers David, Laurent Vicedomini, Riccardo Richard, Hugues Carbone, Alessandra Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A |
title | Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A |
title_full | Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A |
title_fullStr | Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A |
title_full_unstemmed | Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A |
title_short | Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A |
title_sort | targeted domain assembly for fast functional profiling of metagenomic datasets with s3a |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7332565/ https://www.ncbi.nlm.nih.gov/pubmed/32330240 http://dx.doi.org/10.1093/bioinformatics/btaa272 |
work_keys_str_mv | AT davidlaurent targeteddomainassemblyforfastfunctionalprofilingofmetagenomicdatasetswiths3a AT vicedominiriccardo targeteddomainassemblyforfastfunctionalprofilingofmetagenomicdatasetswiths3a AT richardhugues targeteddomainassemblyforfastfunctionalprofilingofmetagenomicdatasetswiths3a AT carbonealessandra targeteddomainassemblyforfastfunctionalprofilingofmetagenomicdatasetswiths3a |
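
The description above mentions an Overlap-Layout-Consensus graph built over domain-annotated reads. As a rough illustration of that general idea only, the following minimal sketch builds suffix-prefix overlap edges between reads that carry the same domain label and then merges them greedily. It is not S3A's algorithm or its confidence measure: the function names, the exact-match overlap test, the greedy traversal, and the domain labels (`PF1`, `PF2`) are all assumptions made for this example.

```python
from collections import defaultdict


def suffix_prefix_overlap(a, b, min_len):
    """Return the length of the longest suffix of `a` equal to a prefix of `b`.

    Only overlaps of at least `min_len` characters count; otherwise return 0.
    Exact matching is an assumption of this sketch (real reads contain errors).
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_overlap_graph(reads, min_len=3):
    """Build overlap edges between reads that share a domain annotation.

    `reads` is a list of (sequence, domain_label) pairs. The result maps
    (i, j) index pairs to the overlap length between read i and read j.
    """
    by_domain = defaultdict(list)
    for idx, (_, domain) in enumerate(reads):
        by_domain[domain].append(idx)

    edges = {}
    for indices in by_domain.values():
        for i in indices:
            for j in indices:
                if i == j:
                    continue
                k = suffix_prefix_overlap(reads[i][0], reads[j][0], min_len)
                if k:
                    edges[(i, j)] = k
    return edges


def greedy_contigs(reads, edges):
    """Greedy layout: start at reads with no incoming edge, follow the best overlap.

    Reads that lie only on cycles are never used as a start; a real
    assembler would need to handle them.
    """
    has_incoming = {j for (_, j) in edges}
    visited = set()
    contigs = []
    for start in range(len(reads)):
        if start in has_incoming or start in visited:
            continue
        seq, domain = reads[start]
        visited.add(start)
        current = start
        while True:
            candidates = [
                (k, j) for (i, j), k in edges.items()
                if i == current and j not in visited
            ]
            if not candidates:
                break
            k, nxt = max(candidates)   # longest overlap wins
            seq += reads[nxt][0][k:]   # append the non-overlapping tail
            visited.add(nxt)
            current = nxt
        contigs.append((domain, seq))
    return contigs
```

Restricting the all-against-all comparison to reads with the same domain label is what keeps the graph small in this toy version: two reads that overlap in sequence but carry different labels are never compared.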