Machine Learning-Based Diagnosis and Ranking of Risk Factors for Diabetic Retinopathy in Population-Based Studies from South India

This paper discusses the importance of investigating DR using machine learning and a computational method to rank DR risk factors by importance using different machine learning models. The dataset was collected from four large population-based studies conducted in India between 2001 and 2010 on the...

Bibliographic Details
Main Authors: Vyas, Abhishek; Raman, Sundaresan; Sen, Sagnik; Ramasamy, Kim; Rajalakshmi, Ramachandran; Mohan, Viswanathan; Raman, Rajiv
Format: Online Article Text
Language: English
Published: MDPI 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10297706/
https://www.ncbi.nlm.nih.gov/pubmed/37370980
http://dx.doi.org/10.3390/diagnostics13122084
_version_ 1785063945281732608
author Vyas, Abhishek
Raman, Sundaresan
Sen, Sagnik
Ramasamy, Kim
Rajalakshmi, Ramachandran
Mohan, Viswanathan
Raman, Rajiv
collection PubMed
description This paper discusses the importance of investigating DR using machine learning and a computational method to rank DR risk factors by importance using different machine learning models. The dataset was collected from four large population-based studies conducted in India between 2001 and 2010 on the prevalence of DR and its risk factors. We deployed different machine learning models on the dataset to rank the importance of the variables (risk factors). The study uses a t-test and Shapley additive explanations (SHAP) to rank the risk factors. Then, it uses five machine learning models (K-Nearest Neighbor, Decision Tree, Support Vector Machines, Logistic Regression, and Naive Bayes) to identify the unimportant risk factors based on the area under the curve criterion to predict DR. To determine the overall significance of risk variables, a weighted average of each classifier’s importance is used. The ranking of risk variables is provided to machine learning models. To construct a model for DR prediction, the combination of risk factors with the highest AUC is chosen. The results show that the risk factors glycosylated hemoglobin and systolic blood pressure were present in the top three risk factors for DR in all five machine learning models when the t-test was used for ranking. Furthermore, the risk factors, namely, systolic blood pressure and history of hypertension, were present in the top five risk factors for DR in all the machine learning models when SHAP was used for ranking. Finally, when an ensemble of the five machine learning models was employed, independently with both the t-test and SHAP, systolic blood pressure and diabetes mellitus duration were present in the top four risk factors for diabetic retinopathy. Decision Tree and K-Nearest Neighbor resulted in the highest AUCs of 0.79 (t-test) and 0.77 (SHAP). Moreover, K-Nearest Neighbor predicted DR with 82.6% (t-test) and 78.3% (SHAP) accuracy.
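The rank-then-select idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Welch's t-statistic for the ranking, a plain sum of z-scored features as a stand-in for the five classifiers, in-sample AUC, and purely synthetic data. The feature names, the helper names (welch_t, rank_by_t_test, auc, best_top_k), and every number below are hypothetical.

```python
import math
import random
from statistics import mean, stdev, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_by_t_test(columns, labels):
    """Rank feature names by |t| between the label-1 and label-0 groups."""
    scores = {}
    for name, col in columns.items():
        pos = [v for v, y in zip(col, labels) if y == 1]
        neg = [v for v, y in zip(col, labels) if y == 0]
        scores[name] = abs(welch_t(pos, neg))
    return sorted(columns, key=lambda n: scores[n], reverse=True)


def auc(scores, labels):
    """AUC: chance that a positive case outscores a negative one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_top_k(columns, labels, ranking):
    """Add ranked features one at a time; return (k, auc) for the best prefix.

    The score is a sum of z-scored features, a simple stand-in (assumption)
    for a trained classifier. AUC is computed in-sample for brevity.
    """
    best = (0, 0.0)
    total = [0.0] * len(labels)
    for k, name in enumerate(ranking, start=1):
        col = columns[name]
        mu, sd = mean(col), stdev(col)
        total = [t + (v - mu) / sd for t, v in zip(total, col)]
        current = auc(total, labels)
        if current > best[1]:
            best = (k, current)
    return best


# Synthetic data only: 200 cases with label 1 and 200 with label 0.
rng = random.Random(0)
labels = [i % 2 for i in range(400)]
columns = {
    "hba1c": [rng.gauss(7.0 + 2.0 * y, 1.0) for y in labels],
    "systolic_bp": [rng.gauss(125.0 + 15.0 * y, 15.0) for y in labels],
    "noise": [rng.gauss(0.0, 1.0) for _ in labels],
}

ranking = rank_by_t_test(columns, labels)
best_k, best_auc = best_top_k(columns, labels, ranking)
print(ranking, best_k, round(best_auc, 3))
```

A real analysis would replace the summed z-score with fitted classifiers and estimate AUC on held-out data or by cross-validation.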
format Online
Article
Text
id pubmed-10297706
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10297706 2023-06-28 Machine Learning-Based Diagnosis and Ranking of Risk Factors for Diabetic Retinopathy in Population-Based Studies from South India Vyas, Abhishek Raman, Sundaresan Sen, Sagnik Ramasamy, Kim Rajalakshmi, Ramachandran Mohan, Viswanathan Raman, Rajiv Diagnostics (Basel) Article MDPI 2023-06-16 /pmc/articles/PMC10297706/ /pubmed/37370980 http://dx.doi.org/10.3390/diagnostics13122084 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Machine Learning-Based Diagnosis and Ranking of Risk Factors for Diabetic Retinopathy in Population-Based Studies from South India
topic Article