Leukemia Prediction Using Sparse Logistic Regression

We describe a supervised prediction method for diagnosis of acute myeloid leukemia (AML) from patient samples based on flow cytometry measurements. We use a data driven approach with machine learning methods to train a computational model that takes in flow cytometry measurements from a single patie...

Bibliographic Details
Main Authors: Manninen, Tapio; Huttunen, Heikki; Ruusuvuori, Pekka; Nykter, Matti
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3758279/
https://www.ncbi.nlm.nih.gov/pubmed/24023658
http://dx.doi.org/10.1371/journal.pone.0072932
_version_ 1782282331104477184
author Manninen, Tapio
Huttunen, Heikki
Ruusuvuori, Pekka
Nykter, Matti
author_facet Manninen, Tapio
Huttunen, Heikki
Ruusuvuori, Pekka
Nykter, Matti
author_sort Manninen, Tapio
collection PubMed
description We describe a supervised prediction method for diagnosis of acute myeloid leukemia (AML) from patient samples based on flow cytometry measurements. We use a data-driven approach with machine learning methods to train a computational model that takes in flow cytometry measurements from a single patient and gives a confidence score of the patient being AML-positive. Our solution is based on an L1-regularized logistic regression model that aggregates AML test statistics calculated from individual test tubes with different cell populations and fluorescent markers. The model construction is entirely data-driven, and no prior biological knowledge is used. The described solution achieved 100% classification accuracy in the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukaemia Challenge against a gold standard consisting of 20 AML-positive and 160 healthy patients. Here we perform a more extensive validation of the prediction model's performance and further improve and simplify our original method, showing that statistically equal results can be obtained by using simple average marker intensities as features in the logistic regression model. In addition to the logistic regression-based model, we also present other classification models and compare their performance quantitatively. The key benefit of our prediction method compared to other solutions with similar performance is that our model uses only a small fraction of the flow cytometry measurements, making our solution highly economical.
format Online
Article
Text
id pubmed-3758279
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3758279 2013-09-10 Leukemia Prediction Using Sparse Logistic Regression Manninen, Tapio Huttunen, Heikki Ruusuvuori, Pekka Nykter, Matti PLoS One Research Article
Public Library of Science 2013-08-30 /pmc/articles/PMC3758279/ /pubmed/24023658 http://dx.doi.org/10.1371/journal.pone.0072932 Text en © 2013 Manninen et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Manninen, Tapio
Huttunen, Heikki
Ruusuvuori, Pekka
Nykter, Matti
Leukemia Prediction Using Sparse Logistic Regression
title Leukemia Prediction Using Sparse Logistic Regression
title_full Leukemia Prediction Using Sparse Logistic Regression
title_fullStr Leukemia Prediction Using Sparse Logistic Regression
title_full_unstemmed Leukemia Prediction Using Sparse Logistic Regression
title_short Leukemia Prediction Using Sparse Logistic Regression
title_sort leukemia prediction using sparse logistic regression
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3758279/
https://www.ncbi.nlm.nih.gov/pubmed/24023658
http://dx.doi.org/10.1371/journal.pone.0072932
work_keys_str_mv AT manninentapio leukemiapredictionusingsparselogisticregression
AT huttunenheikki leukemiapredictionusingsparselogisticregression
AT ruusuvuoripekka leukemiapredictionusingsparselogisticregression
AT nyktermatti leukemiapredictionusingsparselogisticregression