Diabetes classification model based on boosting algorithms

BACKGROUND: Diabetes mellitus is a common and complicated chronic lifelong disease. Hence, it is of high clinical significance to find the most relevant clinical indexes and to perform efficient computer-aided pre-diagnoses and diagnoses. RESULTS: Non-parametric statistical testing is performed on h...

Bibliographic Details
Main Authors:	Chen, Peihua, Pan, Chuandi
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5872396/ https://www.ncbi.nlm.nih.gov/pubmed/29587624 http://dx.doi.org/10.1186/s12859-018-2090-9

_version_	1783309827497263104
author	Chen, Peihua Pan, Chuandi
author_facet	Chen, Peihua Pan, Chuandi
author_sort	Chen, Peihua
collection	PubMed
description	BACKGROUND: Diabetes mellitus is a common and complicated chronic lifelong disease. Hence, it is of high clinical significance to find the most relevant clinical indexes and to perform efficient computer-aided pre-diagnoses and diagnoses. RESULTS: Non-parametric statistical testing is performed on hundreds of medical measurement index results between diabetic and non-diabetic populations. Two common boosting algorithms, Adaboost.M1 and LogitBoost, are selected to establish a machine model for diabetes diagnosis based on these clinical test data, involving a total of 35,669 individuals. The machine classification models built by these two algorithms have very good classification ability. Here, the LogitBoost classification model is slightly better than the Adaboost.M1 classification model. The overall accuracy of the LogitBoost classification model reached 95.30% when using 10-fold cross validation. The true positive, true negative, false positive, and false negative rates of the binary classification model were 0.921, 0.969, 0.031, and 0.079, respectively, and the area under the receiver operating characteristic curve reached 0.99. CONCLUSIONS: The boosting algorithms show excellent performance for the diabetes classification models based on clinical medical data. The coefficient matrix of the original data is a sparse matrix, because some of the test results were missing, including some that were directly related to disease diagnosis. Therefore, the model is robust and has a degree of pre-diagnosis function. In the process of selecting the preferred test items, the most statistically significant discriminating factors between the diabetic and general populations were obtained and can be used as reference risk factors for diabetes mellitus.
format	Online Article Text
id	pubmed-5872396
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5872396 2018-04-02 Diabetes classification model based on boosting algorithms Chen, Peihua Pan, Chuandi BMC Bioinformatics Research
BioMed Central 2018-03-27 /pmc/articles/PMC5872396/ /pubmed/29587624 http://dx.doi.org/10.1186/s12859-018-2090-9 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Chen, Peihua Pan, Chuandi Diabetes classification model based on boosting algorithms
title	Diabetes classification model based on boosting algorithms
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5872396/ https://www.ncbi.nlm.nih.gov/pubmed/29587624 http://dx.doi.org/10.1186/s12859-018-2090-9
work_keys_str_mv	AT chenpeihua diabetesclassificationmodelbasedonboostingalgorithms AT panchuandi diabetesclassificationmodelbasedonboostingalgorithms
