Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection
BACKGROUND: Antimicrobial resistance (AMR) is a major threat to global public health because it makes standard treatments ineffective and contributes to the spread of infections. It is important to understand AMR’s biological mechanisms for the development of new drugs and more rapid and accurate cl...
Main Authors: | Shi, Jinhong; Yan, Yan; Links, Matthew G.; Li, Longhai; Dillon, Jo-Anne R.; Horsch, Michael; Kusalik, Anthony |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929425/ https://www.ncbi.nlm.nih.gov/pubmed/31874612 http://dx.doi.org/10.1186/s12859-019-3054-4 |
_version_ | 1783482697242378240 |
---|---|
author | Shi, Jinhong; Yan, Yan; Links, Matthew G.; Li, Longhai; Dillon, Jo-Anne R.; Horsch, Michael; Kusalik, Anthony |
author_sort | Shi, Jinhong |
collection | PubMed |
description | BACKGROUND: Antimicrobial resistance (AMR) is a major threat to global public health because it makes standard treatments ineffective and contributes to the spread of infections. It is important to understand AMR’s biological mechanisms for the development of new drugs and more rapid and accurate clinical diagnostics. The increasing availability of whole-genome SNP (single nucleotide polymorphism) information, obtained from whole-genome sequence data, along with AMR profiles provides an opportunity to use feature selection in machine learning to find AMR-associated mutations. This work describes the use of a supervised feature selection approach using deep neural networks to detect AMR-associated genetic factors from whole-genome SNP data. RESULTS: The proposed method, DNP-AAP (deep neural pursuit – average activation potential), was tested on a Neisseria gonorrhoeae dataset with paired whole-genome sequence data and resistance profiles to five commonly used antibiotics including penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime. The results show that DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae, and also provide a list of candidate genomic features (SNPs) that might lead to the discovery of novel AMR determinants. Logistic regression classifiers were built with the identified SNPs and the prediction AUCs (area under the curve) for penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime were 0.974, 0.969, 0.949, 0.994, and 0.976, respectively. CONCLUSIONS: DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae. It also provides a list of candidate genes and intergenic regions that might lead to novel AMR factor discovery. More generally, DNP-AAP can be applied to AMR analysis of any bacterial species with genomic variants and phenotype data. It can serve as a useful screening tool for microbiologists to generate genetic candidates for further lab experiments. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-3054-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6929425 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-69294252019-12-30 Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection Shi, Jinhong Yan, Yan Links, Matthew G. Li, Longhai Dillon, Jo-Anne R. Horsch, Michael Kusalik, Anthony BMC Bioinformatics Methodology BACKGROUND: Antimicrobial resistance (AMR) is a major threat to global public health because it makes standard treatments ineffective and contributes to the spread of infections. It is important to understand AMR’s biological mechanisms for the development of new drugs and more rapid and accurate clinical diagnostics. The increasing availability of whole-genome SNP (single nucleotide polymorphism) information, obtained from whole-genome sequence data, along with AMR profiles provides an opportunity to use feature selection in machine learning to find AMR-associated mutations. This work describes the use of a supervised feature selection approach using deep neural networks to detect AMR-associated genetic factors from whole-genome SNP data. RESULTS: The proposed method, DNP-AAP (deep neural pursuit – average activation potential), was tested on a Neisseria gonorrhoeae dataset with paired whole-genome sequence data and resistance profiles to five commonly used antibiotics including penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime. The results show that DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae, and also provide a list of candidate genomic features (SNPs) that might lead to the discovery of novel AMR determinants. Logistic regression classifiers were built with the identified SNPs and the prediction AUCs (area under the curve) for penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime were 0.974, 0.969, 0.949, 0.994, and 0.976, respectively. CONCLUSIONS: DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae. 
It also provides a list of candidate genes and intergenic regions that might lead to novel AMR factor discovery. More generally, DNP-AAP can be applied to AMR analysis of any bacterial species with genomic variants and phenotype data. It can serve as a useful screening tool for microbiologists to generate genetic candidates for further lab experiments. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-3054-4) contains supplementary material, which is available to authorized users. BioMed Central 2019-12-24 /pmc/articles/PMC6929425/ /pubmed/31874612 http://dx.doi.org/10.1186/s12859-019-3054-4 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Shi, Jinhong Yan, Yan Links, Matthew G. Li, Longhai Dillon, Jo-Anne R. Horsch, Michael Kusalik, Anthony Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection |
title | Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929425/ https://www.ncbi.nlm.nih.gov/pubmed/31874612 http://dx.doi.org/10.1186/s12859-019-3054-4 |