
DATMA: Distributed AuTomatic Metagenomic Assembly and annotation framework

BACKGROUND: A prime objective in metagenomics is to classify DNA sequence fragments into taxonomic units. It usually requires several stages: read quality control, de novo assembly, contig annotation, gene prediction, etc. These stages need very efficient programs because of the number of reads fr...

Bibliographic Details
Main Authors: Benavides, Andres; Sanchez, Friman; Alzate, Juan F.; Cabarcas, Felipe
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7474881/
https://www.ncbi.nlm.nih.gov/pubmed/32953263
http://dx.doi.org/10.7717/peerj.9762
_version_ 1783579408695558144
author Benavides, Andres
Sanchez, Friman
Alzate, Juan F.
Cabarcas, Felipe
collection PubMed
description BACKGROUND: A prime objective in metagenomics is to classify DNA sequence fragments into taxonomic units. It usually requires several stages: read quality control, de novo assembly, contig annotation, gene prediction, etc. These stages need very efficient programs because of the number of reads from the projects. Furthermore, the complexity of metagenomes requires efficient and automatic tools that orchestrate the different stages. METHOD: DATMA is a pipeline for fast metagenomic analysis that orchestrates the following: sequencing quality control, 16S rRNA identification, read binning, de novo assembly and evaluation, gene prediction, and taxonomic annotation. Its distributed computing model can use multiple computing resources to reduce the analysis time. RESULTS: We used a controlled experiment to show DATMA's functionality and two pre-annotated metagenomes to compare its accuracy and speed against other metagenomic frameworks. Then, with DATMA, we recovered a draft genome of a novel Anaerolineaceae from a biosolid metagenome. CONCLUSIONS: DATMA is a bioinformatics tool that automatically analyzes complex metagenomes. It is faster than similar tools and, in some cases, it can extract genomes that the other tools do not. DATMA is freely available at https://github.com/andvides/DATMA.
format Online
Article
Text
id pubmed-7474881
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7474881 2020-09-18 DATMA: Distributed AuTomatic Metagenomic Assembly and annotation framework Benavides, Andres; Sanchez, Friman; Alzate, Juan F.; Cabarcas, Felipe PeerJ Bioinformatics PeerJ Inc. 2020-09-03 /pmc/articles/PMC7474881/ /pubmed/32953263 http://dx.doi.org/10.7717/peerj.9762 Text en © 2020 Benavides et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title DATMA: Distributed AuTomatic Metagenomic Assembly and annotation framework
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7474881/
https://www.ncbi.nlm.nih.gov/pubmed/32953263
http://dx.doi.org/10.7717/peerj.9762