Diagnostics for Statistical Variable Selection Methods for Prediction of Peptic Ulcer Disease in Helicobacter pylori Infection
BACKGROUND: The development of accurate classification models depends upon the methods used to identify the most relevant variables. The aim of this article is to evaluate variable selection methods to identify important variables in predicting a binary response using nonlinear statistical models. O...
Main authors: | Ju, Hyunsu; Brasier, Allan R; Kurosky, Alexander; Xu, Bo; Reyes, Victor E; Graham, David Y |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4132894/ https://www.ncbi.nlm.nih.gov/pubmed/25132738 http://dx.doi.org/10.4172/jpb.1000308 |
_version_ | 1782330680375508992 |
---|---|
author | Ju, Hyunsu Brasier, Allan R Kurosky, Alexander Xu, Bo Reyes, Victor E Graham, David Y |
author_facet | Ju, Hyunsu Brasier, Allan R Kurosky, Alexander Xu, Bo Reyes, Victor E Graham, David Y |
author_sort | Ju, Hyunsu |
collection | PubMed |
description | BACKGROUND: The development of accurate classification models depends upon the methods used to identify the most relevant variables. The aim of this article is to evaluate variable selection methods to identify important variables in predicting a binary response using nonlinear statistical models. Our goals in model selection include producing non-overfitting stable models that are interpretable, that generate accurate predictions and have minimum bias. This work was motivated by data on clinical and laboratory features of Helicobacter pylori infections obtained from 60 individuals enrolled in a prospective observational study. RESULTS: We carried out a comprehensive performance comparison of several nonlinear classification models over the H. pylori data set. We compared variable selection results by Multivariate Adaptive Regression Splines (MARS), Logistic Regression with regularization, Generalized Additive Models (GAMs) and Bayesian Variable Selection in GAMs. We found that the MARS model approach has the highest predictive power because the nonlinearity assumptions of candidate predictors are strongly satisfied, a finding demonstrated via deviance chi-square testing procedures in GAMs. CONCLUSIONS: Our results suggest that the physiological free amino acids citrulline, histidine, lysine and arginine are the major features for predicting H. pylori peptic ulcer disease on the basis of amino acid profiling. |
format | Online Article Text |
id | pubmed-4132894 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
record_format | MEDLINE/PubMed |
spelling | pubmed-4132894 2014-08-14 Diagnostics for Statistical Variable Selection Methods for Prediction of Peptic Ulcer Disease in Helicobacter pylori Infection Ju, Hyunsu Brasier, Allan R Kurosky, Alexander Xu, Bo Reyes, Victor E Graham, David Y J Proteomics Bioinform Article BACKGROUND: The development of accurate classification models depends upon the methods used to identify the most relevant variables. The aim of this article is to evaluate variable selection methods to identify important variables in predicting a binary response using nonlinear statistical models. Our goals in model selection include producing non-overfitting stable models that are interpretable, that generate accurate predictions and have minimum bias. This work was motivated by data on clinical and laboratory features of Helicobacter pylori infections obtained from 60 individuals enrolled in a prospective observational study. RESULTS: We carried out a comprehensive performance comparison of several nonlinear classification models over the H. pylori data set. We compared variable selection results by Multivariate Adaptive Regression Splines (MARS), Logistic Regression with regularization, Generalized Additive Models (GAMs) and Bayesian Variable Selection in GAMs. We found that the MARS model approach has the highest predictive power because the nonlinearity assumptions of candidate predictors are strongly satisfied, a finding demonstrated via deviance chi-square testing procedures in GAMs. CONCLUSIONS: Our results suggest that the physiological free amino acids citrulline, histidine, lysine and arginine are the major features for predicting H. pylori peptic ulcer disease on the basis of amino acid profiling. 2014-03-28 2014-04-01 /pmc/articles/PMC4132894/ /pubmed/25132738 http://dx.doi.org/10.4172/jpb.1000308 Text en Copyright: © 2014 Ju H, et al. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Article Ju, Hyunsu Brasier, Allan R Kurosky, Alexander Xu, Bo Reyes, Victor E Graham, David Y Diagnostics for Statistical Variable Selection Methods for Prediction of Peptic Ulcer Disease in Helicobacter pylori Infection |
title | Diagnostics for Statistical Variable Selection Methods for Prediction of Peptic Ulcer Disease in Helicobacter pylori Infection |
title_full | Diagnostics for Statistical Variable Selection Methods for Prediction of Peptic Ulcer Disease in Helicobacter pylori Infection |
title_fullStr | Diagnostics for Statistical Variable Selection Methods for Prediction of Peptic Ulcer Disease in Helicobacter pylori Infection |
title_full_unstemmed | Diagnostics for Statistical Variable Selection Methods for Prediction of Peptic Ulcer Disease in Helicobacter pylori Infection |
title_short | Diagnostics for Statistical Variable Selection Methods for Prediction of Peptic Ulcer Disease in Helicobacter pylori Infection |
title_sort | diagnostics for statistical variable selection methods for prediction of peptic ulcer disease in helicobacter pylori infection |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4132894/ https://www.ncbi.nlm.nih.gov/pubmed/25132738 http://dx.doi.org/10.4172/jpb.1000308 |
work_keys_str_mv | AT juhyunsu diagnosticsforstatisticalvariableselectionmethodsforpredictionofpepticulcerdiseaseinhelicobacterpyloriinfection AT brasierallanr diagnosticsforstatisticalvariableselectionmethodsforpredictionofpepticulcerdiseaseinhelicobacterpyloriinfection AT kuroskyalexander diagnosticsforstatisticalvariableselectionmethodsforpredictionofpepticulcerdiseaseinhelicobacterpyloriinfection AT xubo diagnosticsforstatisticalvariableselectionmethodsforpredictionofpepticulcerdiseaseinhelicobacterpyloriinfection AT reyesvictore diagnosticsforstatisticalvariableselectionmethodsforpredictionofpepticulcerdiseaseinhelicobacterpyloriinfection AT grahamdavidy diagnosticsforstatisticalvariableselectionmethodsforpredictionofpepticulcerdiseaseinhelicobacterpyloriinfection |