DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data

Bibliographic Details
Main authors: Arango-Argoty, Gustavo, Garner, Emily, Pruden, Amy, Heath, Lenwood S., Vikesland, Peter, Zhang, Liqing
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5796597/
https://www.ncbi.nlm.nih.gov/pubmed/29391044
http://dx.doi.org/10.1186/s40168-018-0401-z
author Arango-Argoty, Gustavo
Garner, Emily
Pruden, Amy
Heath, Lenwood S.
Vikesland, Peter
Zhang, Liqing
collection PubMed
description BACKGROUND: Growing concerns about increasing rates of antibiotic resistance call for expanded and comprehensive global monitoring. Advancing methods for monitoring environmental media (e.g., wastewater, agricultural waste, food, and water) is especially needed for identifying potential sources of novel antibiotic resistance genes (ARGs), hot spots for gene exchange, and pathways for the spread of ARGs and human exposure. Next-generation sequencing now enables direct access to and profiling of the total metagenomic DNA pool, where ARGs are typically identified or predicted based on the “best hits” of sequence searches against existing databases. Unfortunately, this approach produces a high rate of false negatives. To address this limitation, we propose a deep learning approach that takes into account a dissimilarity matrix created using all known categories of ARGs. Two deep learning models, DeepARG-SS and DeepARG-LS, were constructed for short read sequences and full gene length sequences, respectively.
RESULTS: Evaluation of the deep learning models over 30 antibiotic resistance categories demonstrates that the DeepARG models can predict ARGs with both high precision (> 0.97) and high recall (> 0.90). The models displayed an advantage over the typical best hit approach, yielding consistently lower false negative rates and thus higher overall recall (> 0.90). As more data become available for under-represented ARG categories, the performance of the DeepARG models can be expected to improve further, owing to the nature of the underlying neural networks. Our newly developed ARG database, DeepARG-DB, encompasses ARGs predicted with a high degree of confidence and extensive manual inspection, greatly expanding current ARG repositories.
CONCLUSIONS: The deep learning models developed here offer more accurate antimicrobial resistance annotation relative to current bioinformatics practice. DeepARG does not require strict cutoffs, which enables identification of a much broader diversity of ARGs. The DeepARG models and database are available as a command line version and as a Web service at http://bench.cs.vt.edu/deeparg.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40168-018-0401-z) contains supplementary material, which is available to authorized users.
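The abstract's link between false negatives and recall can be illustrated with a minimal sketch. The counts below are hypothetical and chosen only for illustration; they are not results from the article, and the function names are not part of DeepARG.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of sequences predicted as ARGs that really are ARGs."""
    total = tp + fp
    return tp / total if total else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of true ARG sequences that were actually detected."""
    total = tp + fn
    return tp / total if total else 0.0


# Hypothetical counts for 100 true ARG sequences (illustrative only).
# A strict best-hit cutoff misses divergent ARGs, so false negatives are high.
strict_best_hit = {"tp": 60, "fp": 1, "fn": 40}
# A classifier without strict cutoffs recovers more of them.
relaxed_model = {"tp": 92, "fp": 2, "fn": 8}

for name, c in [("strict best hit", strict_best_hit), ("relaxed model", relaxed_model)]:
    p = precision(c["tp"], c["fp"])
    r = recall(c["tp"], c["fn"])
    print(f"{name}: precision={p:.3f} recall={r:.3f}")
```

Because recall is TP / (TP + FN), any reduction in false negatives for the same set of true ARGs raises recall directly, while precision depends only on how many predictions are false positives.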
format Online
Article
Text
id pubmed-5796597
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5796597 2018-02-12 Microbiome Software BioMed Central 2018-02-01 /pmc/articles/PMC5796597/ /pubmed/29391044 http://dx.doi.org/10.1186/s40168-018-0401-z Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5796597/
https://www.ncbi.nlm.nih.gov/pubmed/29391044
http://dx.doi.org/10.1186/s40168-018-0401-z