Integrated statistical and machine learning analysis provides insight into key influencing symptoms for distinguishing early‐onset type 2 diabetes

BACKGROUND: Being able to predict with confidence the early onset of type 2 diabetes from a suite of signs and symptoms (features) displayed by potential sufferers is desirable to commence treatment promptly. Late or inconclusive diagnosis can result in more serious health consequences for sufferers...

Bibliographic Details
Main Author: Wood, David A.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9676132/
https://www.ncbi.nlm.nih.gov/pubmed/36420178
http://dx.doi.org/10.1002/cdt3.39
collection PubMed
description BACKGROUND: Being able to predict with confidence the early onset of type 2 diabetes from a suite of signs and symptoms (features) displayed by potential sufferers is desirable to commence treatment promptly. Late or inconclusive diagnosis can result in more serious health consequences for sufferers and higher costs for health care services in the long run. METHODS: A novel integrated methodology is proposed involving correlation, statistical analysis, machine learning, multi‐K‐fold cross‐validation, and confusion matrices to provide a reliable classification of diabetes‐positive and ‐negative individuals from a substantial suite of features. The method also identifies the relative influence of each feature on the diabetes diagnosis and highlights the most important ones. Ten statistical and machine learning methods are utilized to conduct the analysis. RESULTS: A published data set involving 520 individuals (Sylhet Diabetes Hospital, Bangladesh) is modeled revealing that a support vector classifier generates the most accurate early‐onset type 2 diabetes status predictions with just 11 misclassifications (2.1% error). Polydipsia and polyuria are among the most influential features, whereas obesity and age are assigned low weights by the prediction models. CONCLUSION: The proposed methodology can rapidly predict early‐onset type 2 diabetes with high confidence while providing valuable insight into the key influential features involved in such predictions.
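The abstract reports its headline result as a count of misclassifications and an error percentage. As a minimal sketch of how such a figure relates to a confusion matrix, the Python below tallies the four confusion-matrix cells for a binary classifier and checks the reported arithmetic (11 errors in 520 records). The function names and the six toy labels are hypothetical illustrations, not data or code from the article, and the article does not state here how the 11 errors divide between false positives and false negatives.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="Positive"):
    """Tally TP, FN, FP and TN for a binary classification."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive:
            counts["TP" if p == positive else "FN"] += 1
        else:
            counts["FP" if p == positive else "TN"] += 1
    return counts


def error_rate(misclassified, total):
    """Fraction of records that were assigned the wrong class."""
    return misclassified / total


# Hypothetical toy labels, used only to exercise the tally.
actual = ["Positive", "Positive", "Negative", "Negative", "Positive", "Negative"]
predicted = ["Positive", "Negative", "Negative", "Positive", "Positive", "Negative"]

cm = confusion_counts(actual, predicted)
toy_errors = cm["FP"] + cm["FN"]  # misclassifications are the off-diagonal cells

# Figures quoted in the abstract: 11 misclassifications among 520 individuals.
reported = error_rate(11, 520)
print(f"{reported:.1%}")  # 2.1%
```

Summing the off-diagonal cells (false positives plus false negatives) and dividing by the number of records gives the error rate, which is how 11 errors in 520 cases rounds to the quoted 2.1%.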
id pubmed-9676132
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9676132 2022-11-22 Integrated statistical and machine learning analysis provides insight into key influencing symptoms for distinguishing early‐onset type 2 diabetes Wood, David A. Chronic Dis Transl Med Original Articles John Wiley and Sons Inc. 2022-07-31 /pmc/articles/PMC9676132/ /pubmed/36420178 http://dx.doi.org/10.1002/cdt3.39 Text en © 2022 The Authors. Chronic Diseases and Translational Medicine published by John Wiley & Sons, Ltd on behalf of Chinese Medical Association.
This is an open access article under the terms of the CC BY-NC-ND 4.0 License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
topic Original Articles