DATMA: Distributed AuTomatic Metagenomic Assembly and annotation framework
BACKGROUND: A prime objective in metagenomics is to classify DNA sequence fragments into taxonomic units. It usually requires several stages: read quality control, de novo assembly, contig annotation, gene prediction, etc. These stages need very efficient programs because of the number of reads fr...
Main authors: | Benavides, Andres; Sanchez, Friman; Alzate, Juan F.; Cabarcas, Felipe |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2020 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7474881/ https://www.ncbi.nlm.nih.gov/pubmed/32953263 http://dx.doi.org/10.7717/peerj.9762 |
_version_ | 1783579408695558144 |
---|---|
author | Benavides, Andres; Sanchez, Friman; Alzate, Juan F.; Cabarcas, Felipe |
author_facet | Benavides, Andres; Sanchez, Friman; Alzate, Juan F.; Cabarcas, Felipe |
author_sort | Benavides, Andres |
collection | PubMed |
description | BACKGROUND: A prime objective in metagenomics is to classify DNA sequence fragments into taxonomic units. It usually requires several stages: read quality control, de novo assembly, contig annotation, gene prediction, etc. These stages need very efficient programs because of the number of reads that sequencing projects produce. Furthermore, the complexity of metagenomes requires efficient and automatic tools that orchestrate the different stages. METHOD: DATMA is a pipeline for fast metagenomic analysis that orchestrates the following: sequencing quality control, 16S rRNA identification, read binning, de novo assembly and evaluation, gene prediction, and taxonomic annotation. Its distributed computing model can use multiple computing resources to reduce the analysis time. RESULTS: We used a controlled experiment to demonstrate DATMA's functionality and two pre-annotated metagenomes to compare its accuracy and speed against other metagenomic frameworks. Then, with DATMA, we recovered a draft genome of a novel Anaerolineaceae from a biosolid metagenome. CONCLUSIONS: DATMA is a bioinformatics tool that automatically analyzes complex metagenomes. It is faster than similar tools and, in some cases, it can extract genomes that the other tools do not. DATMA is freely available at https://github.com/andvides/DATMA. |
format | Online Article Text |
id | pubmed-7474881 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7474881 2020-09-18 DATMA: Distributed AuTomatic Metagenomic Assembly and annotation framework Benavides, Andres; Sanchez, Friman; Alzate, Juan F.; Cabarcas, Felipe PeerJ Bioinformatics BACKGROUND: A prime objective in metagenomics is to classify DNA sequence fragments into taxonomic units. It usually requires several stages: read quality control, de novo assembly, contig annotation, gene prediction, etc. These stages need very efficient programs because of the number of reads that sequencing projects produce. Furthermore, the complexity of metagenomes requires efficient and automatic tools that orchestrate the different stages. METHOD: DATMA is a pipeline for fast metagenomic analysis that orchestrates the following: sequencing quality control, 16S rRNA identification, read binning, de novo assembly and evaluation, gene prediction, and taxonomic annotation. Its distributed computing model can use multiple computing resources to reduce the analysis time. RESULTS: We used a controlled experiment to demonstrate DATMA's functionality and two pre-annotated metagenomes to compare its accuracy and speed against other metagenomic frameworks. Then, with DATMA, we recovered a draft genome of a novel Anaerolineaceae from a biosolid metagenome. CONCLUSIONS: DATMA is a bioinformatics tool that automatically analyzes complex metagenomes. It is faster than similar tools and, in some cases, it can extract genomes that the other tools do not. DATMA is freely available at https://github.com/andvides/DATMA. PeerJ Inc. 2020-09-03 /pmc/articles/PMC7474881/ /pubmed/32953263 http://dx.doi.org/10.7717/peerj.9762 Text en ©2020 Benavides et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Benavides, Andres Sanchez, Friman Alzate, Juan F. Cabarcas, Felipe DATMA: Distributed AuTomatic Metagenomic Assembly and annotation framework |
title | DATMA: Distributed AuTomatic Metagenomic Assembly and annotation framework |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7474881/ https://www.ncbi.nlm.nih.gov/pubmed/32953263 http://dx.doi.org/10.7717/peerj.9762 |