
Predicting Complete Remission of Acute Myeloid Leukemia: Machine Learning Applied to Gene Expression

Machine learning (ML) is a useful tool for advancing our understanding of the patterns and significance of biomedical data. Given the growing trend toward applying ML techniques in precision medicine, here we present an ML technique that predicts the likelihood of complete remission (CR) in p...


Bibliographic Details
Main authors: Gal, Ophir, Auslander, Noam, Fan, Yu, Meerzaman, Daoud
Format: Online Article Text
Language: English
Published: SAGE Publications 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6423478/
https://www.ncbi.nlm.nih.gov/pubmed/30911218
http://dx.doi.org/10.1177/1176935119835544
author Gal, Ophir
Auslander, Noam
Fan, Yu
Meerzaman, Daoud
collection PubMed
description Machine learning (ML) is a useful tool for advancing our understanding of the patterns and significance of biomedical data. Given the growing trend toward applying ML techniques in precision medicine, here we present an ML technique that predicts the likelihood of complete remission (CR) in patients diagnosed with acute myeloid leukemia (AML). In this study, we explored whether ML algorithms designed to analyze gene-expression patterns obtained through RNA sequencing (RNA-seq) can accurately predict the likelihood of CR in pediatric AML patients who have received induction therapy. We employed tests of statistical significance to determine which genes were differentially expressed between samples from patients who achieved CR after 2 courses of treatment and samples from patients who did not benefit. We tuned classifier hyperparameters to optimize performance and used multiple methods to guide both feature selection and the assessment of algorithm performance. To identify the model that performed best within the context of this study, we plotted receiver operating characteristic (ROC) curves. Using the top 75 genes from the k-nearest neighbors (K-NN) model (K = 27) yielded the best area-under-the-curve (AUC) score that we obtained: 0.84. When we finally evaluated the previously unseen test data set, the top 50 genes yielded the best AUC, 0.81. Pathway enrichment analysis of these 50 genes showed that the guanosine diphosphate fucose (GDP-fucose) biosynthesis pathway was the most significant, with an adjusted P value of .0092, which may suggest a vital role for N-glycosylation in AML.
format Online
Article
Text
id pubmed-6423478
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-6423478 2019-03-25 Predicting Complete Remission of Acute Myeloid Leukemia: Machine Learning Applied to Gene Expression Gal, Ophir Auslander, Noam Fan, Yu Meerzaman, Daoud Cancer Inform Short Report
SAGE Publications 2019-03-15 /pmc/articles/PMC6423478/ /pubmed/30911218 http://dx.doi.org/10.1177/1176935119835544 Text en © The Author(s) 2019 http://www.creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Predicting Complete Remission of Acute Myeloid Leukemia: Machine Learning Applied to Gene Expression
topic Short Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6423478/
https://www.ncbi.nlm.nih.gov/pubmed/30911218
http://dx.doi.org/10.1177/1176935119835544
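The workflow summarized in the abstract (rank genes by differential expression, fit a K-NN classifier on the top-ranked genes, score held-out samples by ROC AUC) can be illustrated with a minimal sketch. This is not the authors' code. Only K = 27 comes from the abstract; the choice of Welch's t statistic for ranking, the synthetic expression matrix, and every function name below are assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples (assumed test; the abstract names none)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def top_genes(X, y, n_top):
    """Rank gene indices by |t| between outcome groups and keep the n_top best."""
    scores = []
    for g in range(len(X[0])):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == 0]
        scores.append((abs(welch_t(pos, neg)), g))
    return [g for _, g in sorted(scores, reverse=True)[:n_top]]


def knn_score(train_X, train_y, x, k):
    """Fraction of the k nearest training samples (Euclidean distance) labelled 1."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return sum(label for _, label in nearest) / k


def roc_auc(labels, scores):
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_data(n_samples, n_genes, n_informative, rng):
    """Synthetic stand-in for an expression matrix: the first n_informative
    genes are shifted upward in label-1 samples, the rest are pure noise."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        X.append([
            rng.gauss(1.0 if (label == 1 and g < n_informative) else 0.0, 1.0)
            for g in range(n_genes)
        ])
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_data(n_samples=120, n_genes=200, n_informative=10, rng=rng)
train_X, train_y = X[:80], y[:80]
test_X, test_y = X[80:], y[80:]

# Feature selection sees the training split only, so the test split stays unseen.
genes = top_genes(train_X, train_y, n_top=10)


def project(row):
    """Keep only the selected genes of one sample."""
    return [row[g] for g in genes]


train_P = [project(r) for r in train_X]
scores = [knn_score(train_P, train_y, project(r), k=27) for r in test_X]
auc = roc_auc(test_y, scores)
print(f"selected genes: {sorted(genes)}")
print(f"held-out AUC: {auc:.2f}")
```

Selecting genes on the training split alone mirrors the abstract's distinction between the score obtained during model selection and the score on the previously unseen test set. The numbers this sketch prints describe the synthetic data only and say nothing about the AUC values reported in the record.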