An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction
Antimicrobial resistance (AMR) poses a threat to global public health. To mitigate the impacts of AMR, it is important to identify the molecular mechanisms of AMR and thereby determine optimal therapy as early as possible. Conventional machine learning-based drug-resistance analyses assume genetic v...
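Main authors: | Yang, Yang; Walker, Timothy M; Kouchaki, Samaneh; Wang, Chenyang; Peto, Timothy E A; Crook, Derrick W; Clifton, David A |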
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Problem Solving Protocol |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8575050/ https://www.ncbi.nlm.nih.gov/pubmed/34414415 http://dx.doi.org/10.1093/bib/bbab299 |
_version_ | 1784595608070258688 |
---|---|
author | Yang, Yang Walker, Timothy M Kouchaki, Samaneh Wang, Chenyang Peto, Timothy E A Crook, Derrick W Clifton, David A |
author_sort | Yang, Yang |
collection | PubMed |
description | Antimicrobial resistance (AMR) poses a threat to global public health. To mitigate the impacts of AMR, it is important to identify the molecular mechanisms of AMR and thereby determine optimal therapy as early as possible. Conventional machine learning-based drug-resistance analyses assume genetic variations to be homogeneous, thus not distinguishing between coding and intergenic sequences. In this study, we represent genetic data from Mycobacterium tuberculosis as a graph, and then adopt a deep graph learning method—heterogeneous graph attention network (‘HGAT–AMR’)—to predict anti-tuberculosis (TB) drug resistance. The HGAT–AMR model is able to accommodate incomplete phenotypic profiles, as well as provide ‘attention scores’ of genes and single nucleotide polymorphisms (SNPs) both at a population level and for individual samples. These scores encode the inputs, which the model is ‘paying attention to’ in making its drug resistance predictions. The results show that the proposed model generated the best area under the receiver operating characteristic (AUROC) for isoniazid and rifampicin (98.53 and 99.10%), the best sensitivity for three first-line drugs (94.91% for isoniazid, 96.60% for ethambutol and 90.63% for pyrazinamide), and maintained performance when the data were associated with incomplete phenotypes (i.e. for those isolates for which phenotypic data for some drugs were missing). We also demonstrate that the model successfully identifies genes and SNPs associated with drug resistance, mitigating the impact of resistance profile while considering particular drug resistance, which is consistent with domain knowledge. |
format | Online Article Text |
id | pubmed-8575050 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-85750502021-11-09 An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction Yang, Yang Walker, Timothy M Kouchaki, Samaneh Wang, Chenyang Peto, Timothy E A Crook, Derrick W Clifton, David A Brief Bioinform Problem Solving Protocol Antimicrobial resistance (AMR) poses a threat to global public health. To mitigate the impacts of AMR, it is important to identify the molecular mechanisms of AMR and thereby determine optimal therapy as early as possible. Conventional machine learning-based drug-resistance analyses assume genetic variations to be homogeneous, thus not distinguishing between coding and intergenic sequences. In this study, we represent genetic data from Mycobacterium tuberculosis as a graph, and then adopt a deep graph learning method—heterogeneous graph attention network (‘HGAT–AMR’)—to predict anti-tuberculosis (TB) drug resistance. The HGAT–AMR model is able to accommodate incomplete phenotypic profiles, as well as provide ‘attention scores’ of genes and single nucleotide polymorphisms (SNPs) both at a population level and for individual samples. These scores encode the inputs, which the model is ‘paying attention to’ in making its drug resistance predictions. The results show that the proposed model generated the best area under the receiver operating characteristic (AUROC) for isoniazid and rifampicin (98.53 and 99.10%), the best sensitivity for three first-line drugs (94.91% for isoniazid, 96.60% for ethambutol and 90.63% for pyrazinamide), and maintained performance when the data were associated with incomplete phenotypes (i.e. for those isolates for which phenotypic data for some drugs were missing). We also demonstrate that the model successfully identifies genes and SNPs associated with drug resistance, mitigating the impact of resistance profile while considering particular drug resistance, which is consistent with domain knowledge. 
Oxford University Press 2021-08-20 /pmc/articles/PMC8575050/ /pubmed/34414415 http://dx.doi.org/10.1093/bib/bbab299 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Problem Solving Protocol Yang, Yang Walker, Timothy M Kouchaki, Samaneh Wang, Chenyang Peto, Timothy E A Crook, Derrick W Clifton, David A An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction |
title | An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction |
title_sort | end-to-end heterogeneous graph attention network for mycobacterium tuberculosis drug-resistance prediction |
topic | Problem Solving Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8575050/ https://www.ncbi.nlm.nih.gov/pubmed/34414415 http://dx.doi.org/10.1093/bib/bbab299 |