Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes

BACKGROUND: We present a potentially useful alternative approach based on support vector machine (SVM) techniques to classify persons with and without common diseases. We illustrate the method to detect persons with diabetes and pre-diabetes in a cross-sectional representative sample of the U.S. pop...

Bibliographic Details
Main Authors: Yu, Wei; Liu, Tiebin; Valdez, Rodolfo; Gwinn, Marta; Khoury, Muin J
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2850872/
https://www.ncbi.nlm.nih.gov/pubmed/20307319
http://dx.doi.org/10.1186/1472-6947-10-16
author Yu, Wei
Liu, Tiebin
Valdez, Rodolfo
Gwinn, Marta
Khoury, Muin J
collection PubMed
description BACKGROUND: We present a potentially useful alternative approach based on support vector machine (SVM) techniques to classify persons with and without common diseases. We illustrate the method by using it to detect persons with diabetes and pre-diabetes in a cross-sectional representative sample of the U.S. population. METHODS: We used data from the 1999-2004 National Health and Nutrition Examination Survey (NHANES) to develop and validate SVM models for two classification schemes: Classification Scheme I (diagnosed or undiagnosed diabetes vs. pre-diabetes or no diabetes) and Classification Scheme II (undiagnosed diabetes or pre-diabetes vs. no diabetes). The SVM models were used to select the sets of variables that yielded the best classification of individuals into these diabetes categories. RESULTS: For Classification Scheme I, the set of diabetes-related variables with the best classification performance included family history, age, race and ethnicity, weight, height, waist circumference, body mass index (BMI), and hypertension. For Classification Scheme II, two additional variables, sex and physical activity, were included. The discriminative abilities of the SVM models for Classification Schemes I and II, measured by the area under the receiver operating characteristic (ROC) curve, were 83.5% and 73.2%, respectively. A web-based tool, Diabetes Classifier, was developed to demonstrate a user-friendly application that allows individual or group assessment with a configurable, user-defined threshold. CONCLUSIONS: Support vector machine modeling is a promising classification approach for detecting persons with common diseases such as diabetes and pre-diabetes in the population. This approach should be further explored in other complex diseases using common variables.
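The abstract reports discriminative ability as the area under the ROC curve and mentions a user-defined threshold. The sketch below illustrates those two ideas only. It is not the authors' implementation and trains no SVM: the function names, labels, and scores are all hypothetical, and the scores stand in for the decision values of any classifier.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classify(scores: Sequence[float], threshold: float) -> List[int]:
    """Label a case positive (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def sensitivity_specificity(
    labels: Sequence[int], predicted: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and predictions.

    Assumes both classes are present in `labels`.
    """
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example data: 1 = has the condition, 0 = does not.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.7]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, classify(scores, threshold=0.5))
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The AUC does not depend on any threshold, whereas sensitivity and specificity do. On this toy data, lowering the threshold from 0.5 to 0.35 raises sensitivity and leaves specificity unchanged; on real data the two usually trade off against each other.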
format Text
id pubmed-2850872
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2850872 2010-04-08 BMC Med Inform Decis Mak Research Article BioMed Central 2010-03-22 /pmc/articles/PMC2850872/ /pubmed/20307319 http://dx.doi.org/10.1186/1472-6947-10-16 Text en Copyright ©2010 Yu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes
topic Research Article