Predicting Diabetes Mellitus With Machine Learning Techniques

Diabetes mellitus is a chronic disease characterized by hyperglycemia, and it can cause many complications. Given the growing morbidity in recent years, the number of diabetic patients worldwide is projected to reach 642 million by 2040, meaning that one in ten adults will have diabetes....

Bibliographic Details
Main Authors: Zou, Quan; Qu, Kaiyang; Luo, Yamei; Yin, Dehui; Ju, Ying; Tang, Hua
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6232260/
https://www.ncbi.nlm.nih.gov/pubmed/30459809
http://dx.doi.org/10.3389/fgene.2018.00515
author Zou, Quan
Qu, Kaiyang
Luo, Yamei
Yin, Dehui
Ju, Ying
Tang, Hua
collection PubMed
description Diabetes mellitus is a chronic disease characterized by hyperglycemia, and it can cause many complications. Given the growing morbidity in recent years, the number of diabetic patients worldwide is projected to reach 642 million by 2040, meaning that one in ten adults will have diabetes. This alarming figure demands attention. With its rapid development, machine learning has been applied to many aspects of medical health. In this study, we used decision tree, random forest, and neural network models to predict diabetes mellitus. The dataset consists of hospital physical examination data from Luzhou, China, and contains 14 attributes. Five-fold cross-validation was used to evaluate the models. To verify the general applicability of the methods, we selected the better-performing methods for independent test experiments. We randomly selected the data of 68994 healthy people and of diabetic patients, respectively, as the training set. Because the data were imbalanced, we drew random samples five times and report the average of these five experiments. We used principal component analysis (PCA) and minimum redundancy maximum relevance (mRMR) to reduce the dimensionality. The results showed that prediction with random forest reached the highest accuracy (ACC = 0.8084) when all the attributes were used.
format Online
Article
Text
id pubmed-6232260
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6232260 2018-11-20 Predicting Diabetes Mellitus With Machine Learning Techniques Zou, Quan Qu, Kaiyang Luo, Yamei Yin, Dehui Ju, Ying Tang, Hua Front Genet Genetics Frontiers Media S.A. 2018-11-06 /pmc/articles/PMC6232260/ /pubmed/30459809 http://dx.doi.org/10.3389/fgene.2018.00515 Text en Copyright © 2018 Zou, Qu, Luo, Yin, Ju and Tang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Predicting Diabetes Mellitus With Machine Learning Techniques
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6232260/
https://www.ncbi.nlm.nih.gov/pubmed/30459809
http://dx.doi.org/10.3389/fgene.2018.00515
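
The evaluation protocol summarized in the abstract (five random draws to offset class imbalance, five-fold cross validation, results averaged) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: it reads "randomly extracted 5 times data" as repeated random undersampling of the larger class, it substitutes a simple nearest-centroid classifier for the decision tree, random forest, and neural network models, and the function names, synthetic two-feature data, and class sizes are all hypothetical.

```python
import random
from statistics import mean


def balanced_sample(majority, minority, rng):
    """Randomly undersample the larger class to the size of the smaller one."""
    return rng.sample(majority, len(minority)) + list(minority)


def fit_centroids(rows):
    """Stand-in model (NOT the paper's random forest): per-class feature means."""
    centroids = {}
    for label in {y for _, y in rows}:
        members = [x for x, y in rows if y == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


def cross_val_accuracy(rows, rng, k=5):
    """Mean accuracy over k folds built from one shuffled index list."""
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [rows[i] for i in indices if i not in held_out]
        test = [rows[i] for i in fold]
        model = fit_centroids(train)
        correct = sum(1 for x, y in test if predict(model, x) == y)
        scores.append(correct / len(test))
    return mean(scores)


def repeated_balanced_cv(healthy, diabetic, repeats=5, seed=0):
    """Average k-fold accuracy over several independent balanced draws."""
    rng = random.Random(seed)
    majority, minority = sorted([healthy, diabetic], key=len, reverse=True)
    runs = [
        cross_val_accuracy(balanced_sample(majority, minority, rng), rng)
        for _ in range(repeats)
    ]
    return mean(runs)


# Hypothetical synthetic data: two features per record, label 0 = healthy,
# label 1 = diabetic. The real dataset has 14 attributes and far more records.
data_rng = random.Random(42)
healthy = [
    ([data_rng.gauss(5.0, 1.0), data_rng.gauss(22.0, 2.0)], 0) for _ in range(300)
]
diabetic = [
    ([data_rng.gauss(8.0, 1.0), data_rng.gauss(27.0, 2.0)], 1) for _ in range(60)
]

acc = repeated_balanced_cv(healthy, diabetic)
print(f"mean accuracy over 5 balanced draws: {acc:.4f}")
```

Averaging over several draws reduces the dependence of the reported accuracy on which majority-class records happen to be selected in any single draw.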