
Variable selection methods for developing a biomarker panel for prediction of dengue hemorrhagic fever

BACKGROUND: The choice of selection methods to identify important variables for binary classification modeling is critical to producing stable models that are interpretable, generate accurate predictions, and have minimal bias. This work is motivated by data on clinical and laboratory features of...


Bibliographic Details
Main authors: Ju, Hyunsu; Brasier, Allan R
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3846812/
https://www.ncbi.nlm.nih.gov/pubmed/24025735
http://dx.doi.org/10.1186/1756-0500-6-365
author Ju, Hyunsu
Brasier, Allan R
collection PubMed
description BACKGROUND: The choice of selection methods to identify important variables for binary classification modeling is critical to producing stable models that are interpretable, generate accurate predictions, and have minimal bias. This work is motivated by data on clinical and laboratory features of severe dengue infections (dengue hemorrhagic fever, DHF) obtained from 51 individuals enrolled in a prospective observational study of acute human dengue infections. RESULTS: We carried out a comprehensive performance comparison of several classification models for DHF on the dengue data set. We compared variable selection results from Multivariate Adaptive Regression Splines, Learning Ensemble, Random Forest, Bayesian Model Averaging, Stochastic Search Variable Selection, and Generalized Regularized Logistic Regression. Model averaging methods (bagging, boosting and ensemble learners) have higher accuracy, but the generalized regularized regression model has the highest predictive power because the linearity assumptions of the candidate predictors are strongly satisfied according to deviance chi-square tests. Bootstrapping was used to evaluate the predictive regression coefficients of the regularized regression model. CONCLUSIONS: Feature reduction methods introduce inherent biases and are therefore data-type dependent. We propose that these limitations can be overcome using an exhaustive approach for searching the feature space. Using this approach, our results suggest that IL-10, platelet count and lymphocyte count are the major features for predicting DHF on the basis of blood chemistries and cytokine measurements.
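The abstract mentions bootstrapping the coefficients of a regularized logistic regression model. As a rough illustration of that general idea, and not the authors' code or data, the following sketch fits an L1-penalized logistic regression by proximal gradient descent on synthetic data and reports how often each feature keeps a nonzero coefficient across bootstrap resamples. Everything here is an assumption made for illustration: the function names (`fit_l1_logistic`, `bootstrap_selection_frequency`, `make_synthetic_data`), the penalty strength `lam`, the step size, and the synthetic data in which only feature 0 carries signal.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=150):
    """Fit L1-penalized logistic regression by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        # Gradient step on the log-loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def bootstrap_selection_frequency(X, y, n_boot=20, lam=0.1, seed=0):
    """Fraction of bootstrap fits in which each coefficient is nonzero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        w, _ = fit_l1_logistic([X[i] for i in idx], [y[i] for i in idx], lam=lam)
        for j in range(p):
            if w[j] != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


def make_synthetic_data(n=60, p=5, seed=1):
    """Synthetic data (not the dengue data): only feature 0 drives the label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [1 if row[0] + 0.5 * rng.gauss(0, 1) > 0 else 0 for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for j, freq in enumerate(bootstrap_selection_frequency(X, y)):
        print(f"feature {j}: nonzero in {freq:.0%} of bootstrap fits")
```

A feature that stays nonzero in most resamples is a more stable candidate than one that appears only occasionally. In practice an established implementation of penalized regression would normally replace the hand-written solver used here.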
format Online
Article
Text
id pubmed-3846812
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3846812 2013-12-07 Variable selection methods for developing a biomarker panel for prediction of dengue hemorrhagic fever Ju, Hyunsu Brasier, Allan R BMC Res Notes Research Article
BioMed Central 2013-09-11 /pmc/articles/PMC3846812/ /pubmed/24025735 http://dx.doi.org/10.1186/1756-0500-6-365 Text en Copyright © 2013 Ju and Brasier; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Variable selection methods for developing a biomarker panel for prediction of dengue hemorrhagic fever
topic Research Article