An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction

Antimicrobial resistance (AMR) poses a threat to global public health. To mitigate the impacts of AMR, it is important to identify the molecular mechanisms of AMR and thereby determine optimal therapy as early as possible. Conventional machine learning-based drug-resistance analyses assume genetic v...

Bibliographic Details
Main authors: Yang, Yang; Walker, Timothy M; Kouchaki, Samaneh; Wang, Chenyang; Peto, Timothy E A; Crook, Derrick W; Clifton, David A
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8575050/
https://www.ncbi.nlm.nih.gov/pubmed/34414415
http://dx.doi.org/10.1093/bib/bbab299
author Yang, Yang
Walker, Timothy M
Kouchaki, Samaneh
Wang, Chenyang
Peto, Timothy E A
Crook, Derrick W
Clifton, David A
collection PubMed
description Antimicrobial resistance (AMR) poses a threat to global public health. To mitigate the impacts of AMR, it is important to identify the molecular mechanisms of AMR and thereby determine optimal therapy as early as possible. Conventional machine learning-based drug-resistance analyses assume genetic variations to be homogeneous and therefore do not distinguish between coding and intergenic sequences. In this study, we represent genetic data from Mycobacterium tuberculosis as a graph and then adopt a deep graph learning method, a heterogeneous graph attention network (‘HGAT–AMR’), to predict anti-tuberculosis (TB) drug resistance. The HGAT–AMR model is able to accommodate incomplete phenotypic profiles and to provide ‘attention scores’ for genes and single nucleotide polymorphisms (SNPs), both at a population level and for individual samples. These scores indicate which inputs the model is ‘paying attention to’ in making its drug-resistance predictions. The results show that the proposed model achieved the best area under the receiver operating characteristic curve (AUROC) for isoniazid and rifampicin (98.53% and 99.10%, respectively), the best sensitivity for three first-line drugs (94.91% for isoniazid, 96.60% for ethambutol and 90.63% for pyrazinamide), and maintained performance when the data were associated with incomplete phenotypes (i.e. for those isolates for which phenotypic data for some drugs were missing). We also demonstrate that the model successfully identifies genes and SNPs associated with drug resistance, mitigating the impact of resistance profile while considering particular drug resistance, which is consistent with domain knowledge.
format Online
Article
Text
id pubmed-8575050
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8575050 2021-11-09 An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction Yang, Yang; Walker, Timothy M; Kouchaki, Samaneh; Wang, Chenyang; Peto, Timothy E A; Crook, Derrick W; Clifton, David A Brief Bioinform Problem Solving Protocol
Oxford University Press 2021-08-20 /pmc/articles/PMC8575050/ /pubmed/34414415 http://dx.doi.org/10.1093/bib/bbab299 Text en © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction
topic Problem Solving Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8575050/
https://www.ncbi.nlm.nih.gov/pubmed/34414415
http://dx.doi.org/10.1093/bib/bbab299