Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms

BACKGROUND: The risk factors for diabetic retinopathy (DR) have been investigated extensively in past studies, but it remains unknown which risk factors are more strongly associated with DR than others. If we can detect the DR-related risk factors more accurately, we can then exercise early prevention str...

Bibliographic Details
Main authors:	Tsao, Hsin-Yi; Chan, Pei-Ying; Su, Emily Chia-Yu
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6101083/ https://www.ncbi.nlm.nih.gov/pubmed/30367589 http://dx.doi.org/10.1186/s12859-018-2277-0

_version_	1783348984699420672
author	Tsao, Hsin-Yi Chan, Pei-Ying Su, Emily Chia-Yu
author_facet	Tsao, Hsin-Yi Chan, Pei-Ying Su, Emily Chia-Yu
author_sort	Tsao, Hsin-Yi
collection	PubMed
description	BACKGROUND: The risk factors for diabetic retinopathy (DR) have been investigated extensively in past studies, but it remains unknown which risk factors are more strongly associated with DR than others. If we can detect the DR-related risk factors more accurately, we can then exercise early prevention strategies for diabetic retinopathy in the highest-risk population. The purpose of this study is to build a prediction model for DR in type 2 diabetes mellitus using data mining techniques including support vector machines, decision trees, artificial neural networks, and logistic regression. RESULTS: Experimental results demonstrated that support vector machines performed better than the other machine learning algorithms, achieving 79.5% accuracy and an area under the receiver operating characteristic curve of 0.839 using percentage split (i.e., data set divided into 80% as training and 20% as test). Evaluated by a three-way data split scheme (i.e., data set divided into 60% as training, 20% as validation, and 20% as independent test), our method obtained slightly lower performance compared to percentage split, which suggests that three-way data split is a better way to evaluate the real performance and prevent overestimation. Moreover, we applied approaches proposed in previous studies to our data set, and our prediction performance exceeded that of the previous studies in most evaluation measures. This lends support to our assumption that appropriate machine learning algorithms combined with discriminative clinical features can effectively detect diabetic retinopathy. CONCLUSIONS: Our method identifies use of insulin and duration of diabetes as novel interpretable features to assist with clinical decisions in identifying the high-risk populations for diabetic retinopathy. If the duration of diabetes mellitus (DM) increases by 1 year, the odds of having DR increase by 9.3%. The odds of having DR increase 3.561-fold for patients who use insulin compared to patients who do not use insulin. Our results can be used to facilitate development of clinical decision support systems for clinical practice in the future.
format	Online Article Text
id	pubmed-6101083
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6101083 2018-08-27 Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms Tsao, Hsin-Yi Chan, Pei-Ying Su, Emily Chia-Yu BMC Bioinformatics Research BACKGROUND: The risk factors for diabetic retinopathy (DR) have been investigated extensively in past studies, but it remains unknown which risk factors are more strongly associated with DR than others. If we can detect the DR-related risk factors more accurately, we can then exercise early prevention strategies for diabetic retinopathy in the highest-risk population. The purpose of this study is to build a prediction model for DR in type 2 diabetes mellitus using data mining techniques including support vector machines, decision trees, artificial neural networks, and logistic regression. RESULTS: Experimental results demonstrated that support vector machines performed better than the other machine learning algorithms, achieving 79.5% accuracy and an area under the receiver operating characteristic curve of 0.839 using percentage split (i.e., data set divided into 80% as training and 20% as test). Evaluated by a three-way data split scheme (i.e., data set divided into 60% as training, 20% as validation, and 20% as independent test), our method obtained slightly lower performance compared to percentage split, which suggests that three-way data split is a better way to evaluate the real performance and prevent overestimation. Moreover, we applied approaches proposed in previous studies to our data set, and our prediction performance exceeded that of the previous studies in most evaluation measures. This lends support to our assumption that appropriate machine learning algorithms combined with discriminative clinical features can effectively detect diabetic retinopathy. CONCLUSIONS: Our method identifies use of insulin and duration of diabetes as novel interpretable features to assist with clinical decisions in identifying the high-risk populations for diabetic retinopathy. If the duration of diabetes mellitus (DM) increases by 1 year, the odds of having DR increase by 9.3%. The odds of having DR increase 3.561-fold for patients who use insulin compared to patients who do not use insulin. Our results can be used to facilitate development of clinical decision support systems for clinical practice in the future. BioMed Central 2018-08-13 /pmc/articles/PMC6101083/ /pubmed/30367589 http://dx.doi.org/10.1186/s12859-018-2277-0 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Tsao, Hsin-Yi Chan, Pei-Ying Su, Emily Chia-Yu Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms
title	Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms
title_full	Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms
title_fullStr	Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms
title_full_unstemmed	Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms
title_short	Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms
title_sort	predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6101083/ https://www.ncbi.nlm.nih.gov/pubmed/30367589 http://dx.doi.org/10.1186/s12859-018-2277-0
work_keys_str_mv	AT tsaohsinyi predictingdiabeticretinopathyandidentifyinginterpretablebiomedicalfeaturesusingmachinelearningalgorithms AT chanpeiying predictingdiabeticretinopathyandidentifyinginterpretablebiomedicalfeaturesusingmachinelearningalgorithms AT suemilychiayu predictingdiabeticretinopathyandidentifyinginterpretablebiomedicalfeaturesusingmachinelearningalgorithms
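
The abstract in this record reports two odds-ratio findings: a 9.3% increase per additional year of diabetes and a 3.561-fold figure for insulin use. As a rough illustration of how such figures translate into probabilities, the sketch below assumes they correspond to logistic-regression odds ratios of 1.093 per year and 3.561 for insulin use, and that the effects act multiplicatively on the odds. The baseline probability of 0.20 and all function names are hypothetical and do not come from the article.

```python
# Minimal sketch: turning odds ratios into probabilities.
# Assumptions (not from the article): the reported figures are
# logistic-regression odds ratios, and the baseline risk is made up.

OR_PER_YEAR = 1.093  # assumed reading of "increased by 9.3%" per year
OR_INSULIN = 3.561   # assumed reading of "3.561 times" for insulin use


def probability_to_odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert non-negative odds back into a probability."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


def apply_odds_ratio(baseline_probability: float,
                     odds_ratio: float,
                     units: float = 1.0) -> float:
    """Scale the baseline odds by odds_ratio ** units and return a probability."""
    odds = probability_to_odds(baseline_probability) * odds_ratio ** units
    return odds_to_probability(odds)


if __name__ == "__main__":
    baseline = 0.20  # hypothetical baseline probability of DR
    print("baseline:", baseline)
    print("after 10 more years:", apply_odds_ratio(baseline, OR_PER_YEAR, 10))
    print("insulin user:", apply_odds_ratio(baseline, OR_INSULIN))
```

Note that an odds ratio multiplies odds, not probabilities, so the change in probability depends on the baseline chosen.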

Similar Items