
Ensemble Learning Models Based on Noninvasive Features for Type 2 Diabetes Screening: Model Development and Validation

BACKGROUND: Early diabetes screening can effectively reduce the burden of disease. However, natural population–based screening projects require a large number of resources. With the emergence and development of machine learning, researchers have started to pursue more flexible and efficient methods...


Bibliographic Details
Main authors:	Yang, Tianzhou; Zhang, Li; Yi, Liwei; Feng, Huawei; Li, Shimeng; Chen, Haoyu; Zhu, Junfeng; Zhao, Jian; Zeng, Yingyue; Liu, Hongsheng
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2020
Subjects:	Original Paper
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7333074/ https://www.ncbi.nlm.nih.gov/pubmed/32554386 http://dx.doi.org/10.2196/15431

_version_	1783553672661172224
author	Yang, Tianzhou Zhang, Li Yi, Liwei Feng, Huawei Li, Shimeng Chen, Haoyu Zhu, Junfeng Zhao, Jian Zeng, Yingyue Liu, Hongsheng
collection	PubMed
description	BACKGROUND: Early diabetes screening can effectively reduce the burden of disease. However, natural population–based screening projects require a large number of resources. With the emergence and development of machine learning, researchers have started to pursue more flexible and efficient methods to screen or predict type 2 diabetes.
OBJECTIVE: The aim of this study was to build prediction models based on the ensemble learning method for diabetes screening to further improve the health status of the population in a noninvasive and inexpensive manner.
METHODS: The dataset for building and evaluating the diabetes prediction model was extracted from the National Health and Nutrition Examination Survey from 2011-2016. After data cleaning and feature selection, the dataset was split into a training set (80%, 2011-2014), a test set (20%, 2011-2014), and a validation set (2015-2016). Three simple machine learning methods (linear discriminant analysis, support vector machine, and random forest) and easy ensemble methods were used to build diabetes prediction models. The performance of the models was evaluated through 5-fold cross-validation and external validation. The DeLong test (2-sided) was used to test the performance differences between the models.
RESULTS: We selected 8057 observations and 12 attributes from the database. In the 5-fold cross-validation, the three simple methods yielded highly predictive performance models with areas under the curve (AUCs) over 0.800, wherein the ensemble methods significantly outperformed the simple methods. When we evaluated the models in the test set and validation set, the same trends were observed. The ensemble model of linear discriminant analysis yielded the best performance, with an AUC of 0.849, an accuracy of 0.730, a sensitivity of 0.819, and a specificity of 0.709 in the validation set.
CONCLUSIONS: This study indicates that efficient screening using machine learning methods with noninvasive tests can be applied to a large population and achieve the objective of secondary prevention.
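The abstract reports accuracy, sensitivity, and specificity for the validation set. As a minimal sketch of how these three screening metrics relate, the Python below computes them from binary labels and then uses the identity accuracy = prevalence × sensitivity + (1 − prevalence) × specificity to back out the prevalence implied by a set of reported values. The function names and the toy labels are hypothetical illustrations, not taken from the paper, and the implied prevalence is only a rough consistency check on rounded figures; it is not a value stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    sensitivity: float  # true positive rate
    specificity: float  # true negative rate
    accuracy: float     # fraction of all cases classified correctly


def screening_metrics(y_true, y_pred):
    """Compute metrics from binary labels (1 = diabetes, 0 = no diabetes)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return ScreeningMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / len(pairs),
    )


def implied_prevalence(accuracy, sensitivity, specificity):
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p."""
    return (accuracy - specificity) / (sensitivity - specificity)


# Hypothetical toy labels, for illustration only (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
toy = screening_metrics(y_true, y_pred)

# Rough check using the rounded validation-set values quoted in the abstract.
p = implied_prevalence(accuracy=0.730, sensitivity=0.819, specificity=0.709)
```

Because the inputs are rounded to three decimals, `p` should be read as an approximate figure at best.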
format	Online Article Text
id	pubmed-7333074
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-7333074 2020-07-06 JMIR Med Inform Original Paper JMIR Publications 2020-06-18 /pmc/articles/PMC7333074/ /pubmed/32554386 http://dx.doi.org/10.2196/15431 Text en ©Tianzhou Yang, Li Zhang, Liwei Yi, Huawei Feng, Shimeng Li, Haoyu Chen, Junfeng Zhu, Jian Zhao, Yingyue Zeng, Hongsheng Liu. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 18.06.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title	Ensemble Learning Models Based on Noninvasive Features for Type 2 Diabetes Screening: Model Development and Validation
topic	Original Paper