A modified decision tree approach to improve the prediction and mutation discovery for drug resistance in Mycobacterium tuberculosis

BACKGROUND: Drug resistant Mycobacterium tuberculosis is complicating the effective treatment and control of tuberculosis disease (TB). With the adoption of whole genome sequencing as a diagnostic tool, machine learning approaches are being employed to predict M. tuberculosis resistance and identify...

Bibliographic Details
Main authors: Deelder, Wouter; Napier, Gary; Campino, Susana; Palla, Luigi; Phelan, Jody; Clark, Taane G.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8753810/
https://www.ncbi.nlm.nih.gov/pubmed/35016609
http://dx.doi.org/10.1186/s12864-022-08291-4
_version_ 1784632147208830976
author Deelder, Wouter
Napier, Gary
Campino, Susana
Palla, Luigi
Phelan, Jody
Clark, Taane G.
collection PubMed
description BACKGROUND: Drug-resistant Mycobacterium tuberculosis is complicating the effective treatment and control of tuberculosis disease (TB). With the adoption of whole genome sequencing as a diagnostic tool, machine learning approaches are being employed to predict M. tuberculosis resistance and identify underlying genetic mutations. However, machine learning approaches can overfit and fail to identify causal mutations if they are applied out of the box and not adapted to the disease-specific context. We introduce a machine learning approach that is customized to the TB setting, which extracts a library of genomic variants re-occurring across individual studies to improve genotypic profiling. RESULTS: We developed a customized decision tree approach, called Treesist-TB, that performs TB drug resistance prediction by extracting and evaluating genomic variants across multiple studies. The application of Treesist-TB to rifampicin (RIF), isoniazid (INH) and ethambutol (EMB), drugs for which resistance mutations are known, demonstrated a level of predictive accuracy similar to the widely used TB-Profiler tool (Treesist-TB vs. TB-Profiler tool: RIF 97.5% vs. 97.6%; INH 96.8% vs. 96.5%; EMB 96.8% vs. 95.8%). Application of Treesist-TB to less well understood second-line drugs of interest, ethionamide (ETH), cycloserine (CYS) and para-aminosalicylic acid (PAS), led to the identification of new variants (52, 6 and 11, respectively), with a high number absent from the TB-Profiler library (45, 4, and 6, respectively). Thereby, Treesist-TB had improved predictive sensitivity (Treesist-TB vs. TB-Profiler tool: PAS 64.3% vs. 38.8%; CYS 45.3% vs. 30.7%; ETH 72.1% vs. 71.1%). CONCLUSION: Our work reinforces the utility of machine learning for drug resistance prediction, while highlighting the need to customize approaches to the disease-specific context. Through applying a modified decision learning approach (Treesist-TB) across a range of anti-TB drugs, we identified plausible resistance-encoding genomic variants with high predictive ability, whilst potentially overcoming the overfitting challenges that can affect standard machine learning applications. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08291-4.
format Online
Article
Text
id pubmed-8753810
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8753810 2022-01-12 A modified decision tree approach to improve the prediction and mutation discovery for drug resistance in Mycobacterium tuberculosis Deelder, Wouter Napier, Gary Campino, Susana Palla, Luigi Phelan, Jody Clark, Taane G. BMC Genomics Research BioMed Central 2022-01-11 /pmc/articles/PMC8753810/ /pubmed/35016609 http://dx.doi.org/10.1186/s12864-022-08291-4 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A modified decision tree approach to improve the prediction and mutation discovery for drug resistance in Mycobacterium tuberculosis
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8753810/
https://www.ncbi.nlm.nih.gov/pubmed/35016609
http://dx.doi.org/10.1186/s12864-022-08291-4