Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection

Bibliographic Details
Main authors: Shi, Jinhong, Yan, Yan, Links, Matthew G., Li, Longhai, Dillon, Jo-Anne R., Horsch, Michael, Kusalik, Anthony
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929425/
https://www.ncbi.nlm.nih.gov/pubmed/31874612
http://dx.doi.org/10.1186/s12859-019-3054-4
author Shi, Jinhong
Yan, Yan
Links, Matthew G.
Li, Longhai
Dillon, Jo-Anne R.
Horsch, Michael
Kusalik, Anthony
author_sort Shi, Jinhong
collection PubMed
description BACKGROUND: Antimicrobial resistance (AMR) is a major threat to global public health because it makes standard treatments ineffective and contributes to the spread of infections. It is important to understand AMR’s biological mechanisms for the development of new drugs and more rapid and accurate clinical diagnostics. The increasing availability of whole-genome SNP (single nucleotide polymorphism) information, obtained from whole-genome sequence data, along with AMR profiles provides an opportunity to use feature selection in machine learning to find AMR-associated mutations. This work describes the use of a supervised feature selection approach using deep neural networks to detect AMR-associated genetic factors from whole-genome SNP data.
RESULTS: The proposed method, DNP-AAP (deep neural pursuit – average activation potential), was tested on a Neisseria gonorrhoeae dataset with paired whole-genome sequence data and resistance profiles to five commonly used antibiotics including penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime. The results show that DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae, and also provide a list of candidate genomic features (SNPs) that might lead to the discovery of novel AMR determinants. Logistic regression classifiers were built with the identified SNPs and the prediction AUCs (area under the curve) for penicillin, tetracycline, azithromycin, ciprofloxacin, and cefixime were 0.974, 0.969, 0.949, 0.994, and 0.976, respectively.
CONCLUSIONS: DNP-AAP can effectively identify known AMR-associated genes in N. gonorrhoeae. It also provides a list of candidate genes and intergenic regions that might lead to novel AMR factor discovery. More generally, DNP-AAP can be applied to AMR analysis of any bacterial species with genomic variants and phenotype data. It can serve as a useful screening tool for microbiologists to generate genetic candidates for further lab experiments.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-3054-4) contains supplementary material, which is available to authorized users.
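The abstract reports AUCs for logistic regression classifiers built on the selected SNPs. As a rough illustration of that evaluation step only, the sketch below fits a plain logistic regression on a made-up binary SNP matrix and scores it with AUC. It does not reproduce DNP-AAP or the feature selection itself; the function names (`auc`, `fit_logistic`, `predict_scores`), the hyperparameters, and the toy data are all assumptions for illustration, not taken from the article.

```python
import math


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    Equals the fraction of (resistant, susceptible) pairs in which the
    resistant isolate gets the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the logistic loss; returns (weights, bias).

    lr and epochs are arbitrary illustrative defaults, not values from the paper.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_scores(X, w, b):
    """Predicted probability of resistance for each isolate."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


if __name__ == "__main__":
    # Invented toy data: rows are isolates, columns are three hypothetical
    # selected SNPs (1 = variant present); label 1 = resistant.
    train_X = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1],
               [0, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 0]]
    train_y = [1, 1, 1, 0, 0, 0, 1, 0]
    test_X = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
    test_y = [1, 0, 1, 0]

    w, b = fit_logistic(train_X, train_y)
    print("held-out AUC:", auc(test_y, predict_scores(test_X, w, b)))
```

In practice a library implementation (for example scikit-learn's `LogisticRegression` and `roc_auc_score`) would normally replace the hand-written functions; the pure-Python version is used here only to keep the sketch dependency-free.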
format Online
Article
Text
id pubmed-6929425
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6929425 2019-12-30 Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection Shi, Jinhong Yan, Yan Links, Matthew G. Li, Longhai Dillon, Jo-Anne R. Horsch, Michael Kusalik, Anthony BMC Bioinformatics Methodology BioMed Central 2019-12-24 /pmc/articles/PMC6929425/ /pubmed/31874612 http://dx.doi.org/10.1186/s12859-019-3054-4 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection
topic Methodology