Identification of risk factors for patients with diabetes: diabetic polyneuropathy case study

BACKGROUND: Methods of data mining and analytics can be efficiently applied in medicine to develop models that use patient-specific data to predict the development of diabetic polyneuropathy. However, there is room for improvement in the accuracy of predictive models. Existing studies of diabetes po...

Bibliographic Details
Main authors: Metsker, Oleg; Magoev, Kirill; Yakovlev, Alexey; Yanishevskiy, Stanislav; Kopanitsa, Georgy; Kovalchuk, Sergey; Krzhizhanovskaya, Valeria V.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7444272/
https://www.ncbi.nlm.nih.gov/pubmed/32831065
http://dx.doi.org/10.1186/s12911-020-01215-w
author Metsker, Oleg
Magoev, Kirill
Yakovlev, Alexey
Yanishevskiy, Stanislav
Kopanitsa, Georgy
Kovalchuk, Sergey
Krzhizhanovskaya, Valeria V.
collection PubMed
description BACKGROUND: Methods of data mining and analytics can be efficiently applied in medicine to develop models that use patient-specific data to predict the development of diabetic polyneuropathy. However, there is room for improvement in the accuracy of predictive models. Existing studies of diabetic polyneuropathy considered a limited number of predictors in one study to enable a comparison of the efficiency of different machine learning methods with different predictors to find the most efficient one. The purpose of this study is to implement machine learning methods for identifying the risk of diabetic polyneuropathy based on structured electronic medical records collected in databases of medical information systems.
METHODS: For the purposes of our study, we developed a structured procedure for predictive modelling, which includes data extraction and preprocessing, model adjustment and performance assessment, selection of the best models and interpretation of results. The dataset contained a total of 238,590 laboratory records. Each record contained 27 laboratory tests, age, gender and the presence of retinopathy or nephropathy. The records included information about 5846 patients with diabetes. Diagnosis served as the source of the target class values for classification.
RESULTS: Including two expressions, namely “nephropathy” and “retinopathy”, improved performance, achieving up to 79.82% precision, 81.52% recall, 80.64% F1 score, 82.61% accuracy, and 89.88% AUC using the neural network classifier. Additionally, different models showed different results in terms of interpretation significance: random forest confirmed that the most important risk factor for polyneuropathy is an increased neutrophil level, which indicates the presence of inflammation in the body. Linear models showed linear dependencies of the presence of polyneuropathy on blood glucose levels, which is consistent with the clinical importance of blood glucose control.
CONCLUSION: The choice of model will vary depending on whether one needs to identify pathophysiological mechanisms for a prospective study or to identify early or late predictors. In comparison with previous studies, our research makes a comprehensive comparison of different decisions using a large and well-structured dataset applied to different decision support tasks.
format Online
Article
Text
id pubmed-7444272
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7444272 2020-08-26 BMC Med Inform Decis Mak Research Article BioMed Central 2020-08-24 /pmc/articles/PMC7444272/ /pubmed/32831065 http://dx.doi.org/10.1186/s12911-020-01215-w Text en
© The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Identification of risk factors for patients with diabetes: diabetic polyneuropathy case study
topic Research Article