DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data
BACKGROUND: Growing concerns about increasing rates of antibiotic resistance call for expanded and comprehensive global monitoring. Advancing methods for monitoring of environmental media (e.g., wastewater, agricultural waste, food, and water) is especially needed for identifying potential resources...
Main authors: | Arango-Argoty, Gustavo; Garner, Emily; Pruden, Amy; Heath, Lenwood S.; Vikesland, Peter; Zhang, Liqing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5796597/ https://www.ncbi.nlm.nih.gov/pubmed/29391044 http://dx.doi.org/10.1186/s40168-018-0401-z |
_version_ | 1783297530729070592 |
---|---|
author | Arango-Argoty, Gustavo Garner, Emily Pruden, Amy Heath, Lenwood S. Vikesland, Peter Zhang, Liqing |
author_sort | Arango-Argoty, Gustavo |
collection | PubMed |
description | BACKGROUND: Growing concerns about increasing rates of antibiotic resistance call for expanded and comprehensive global monitoring. Advancing methods for monitoring of environmental media (e.g., wastewater, agricultural waste, food, and water) is especially needed for identifying potential resources of novel antibiotic resistance genes (ARGs), hot spots for gene exchange, and as pathways for the spread of ARGs and human exposure. Next-generation sequencing now enables direct access and profiling of the total metagenomic DNA pool, where ARGs are typically identified or predicted based on the “best hits” of sequence searches against existing databases. Unfortunately, this approach produces a high rate of false negatives. To address such limitations, we propose here a deep learning approach, taking into account a dissimilarity matrix created using all known categories of ARGs. Two deep learning models, DeepARG-SS and DeepARG-LS, were constructed for short read sequences and full gene length sequences, respectively. RESULTS: Evaluation of the deep learning models over 30 antibiotic resistance categories demonstrates that the DeepARG models can predict ARGs with both high precision (> 0.97) and recall (> 0.90). The models displayed an advantage over the typical best hit approach, yielding consistently lower false negative rates and thus higher overall recall (> 0.9). As more data become available for under-represented ARG categories, the DeepARG models’ performance can be expected to be further enhanced due to the nature of the underlying neural networks. Our newly developed ARG database, DeepARG-DB, encompasses ARGs predicted with a high degree of confidence and extensive manual inspection, greatly expanding current ARG repositories. CONCLUSIONS: The deep learning models developed here offer more accurate antimicrobial resistance annotation relative to current bioinformatics practice. DeepARG does not require strict cutoffs, which enables identification of a much broader diversity of ARGs. The DeepARG models and database are available as a command line version and as a Web service at http://bench.cs.vt.edu/deeparg. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40168-018-0401-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5796597 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5796597 2018-02-12 Microbiome Software BioMed Central 2018-02-01 /pmc/articles/PMC5796597/ /pubmed/29391044 http://dx.doi.org/10.1186/s40168-018-0401-z Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Arango-Argoty, Gustavo Garner, Emily Pruden, Amy Heath, Lenwood S. Vikesland, Peter Zhang, Liqing DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data |
title | DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5796597/ https://www.ncbi.nlm.nih.gov/pubmed/29391044 http://dx.doi.org/10.1186/s40168-018-0401-z |
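
The description above reports precision and recall and attributes the higher recall to lower false negative rates. As a minimal sketch of that relationship, the snippet below computes both metrics from confusion counts. The function name `precision_recall` and all counts are hypothetical, chosen only for illustration; they are not taken from the article or from the DeepARG software.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Returns 0.0 for an undefined ratio."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for 100 true ARG sequences (illustration only).
# A strict similarity cutoff misses many true ARGs (high fn) ...
strict_cutoff = precision_recall(tp=60, fp=1, fn=40)
# ... while a more permissive predictor misses fewer (low fn).
permissive_model = precision_recall(tp=92, fp=2, fn=8)

print("strict cutoff    precision=%.3f recall=%.3f" % strict_cutoff)
print("permissive model precision=%.3f recall=%.3f" % permissive_model)
```

With these made-up counts, precision stays high in both cases, while recall rises as the number of false negatives falls, which is the effect the abstract describes.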