Interpretable machine learning analysis to identify risk factors for diabetes using the anonymous living census data of Japan
PURPOSE: Diabetes mellitus causes various problems in daily life. Even in the era of big data, some risk factors for diabetes may remain unidentified. To identify new risk factors for diabetes in the big data society and explore further efficient use of big data, the non-objective-oriented census d...
Main Authors: | Jiang, Pei, Suzuki, Hiroyuki, Obi, Takashi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Berlin Heidelberg 2023 |
Subjects: | Original Paper |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876749/ https://www.ncbi.nlm.nih.gov/pubmed/36718178 http://dx.doi.org/10.1007/s12553-023-00730-w |
_version_ | 1784878231514513408 |
---|---|
author | Jiang, Pei Suzuki, Hiroyuki Obi, Takashi |
author_facet | Jiang, Pei Suzuki, Hiroyuki Obi, Takashi |
author_sort | Jiang, Pei |
collection | PubMed |
description | PURPOSE: Diabetes mellitus causes various problems in daily life. Even in the era of big data, some risk factors for diabetes may remain unidentified. To identify new risk factors for diabetes in the big data society and explore further efficient use of big data, the non-objective-oriented census data from the Japanese Citizen’s Survey of Living Conditions were analyzed using interpretable machine learning methods. METHODS: Seven interpretable machine learning methods were used to analyze Japanese citizens’ census data. First, logistic analysis was used to analyze the risk factors of diabetes from 19 selected initial elements. Then, linear analysis, linear discriminant analysis, Hayashi’s quantification analysis method 2, random forest, XGBoost, and SHAP methods were used to re-check the results and compare the contributions of the factors. Finally, the relationships among the factors were analyzed. RESULTS: Four new risk factors (the number of family members, insurance type, public pension type, and health awareness level) were identified as risk factors for diabetes mellitus for the first time, while another 11 risk factors were reconfirmed in this analysis. In particular, the insurance type and health awareness level factors contributed more to diabetes than hypertension, hyperlipidemia, and stress in some interpretable models. We also found that work years was identified as a risk factor for diabetes because it has a high correlation coefficient with the risk factor of age. CONCLUSIONS: New risk factors for diabetes mellitus were identified from Japan's non-objective-oriented anonymous census data using interpretable machine learning models. The newly identified risk factors suggest possible new policies for preventing diabetes. Moreover, our analysis confirms that big data can help us find useful knowledge in today's society. Our study also paves the way for identifying more risk factors and promoting the efficient use of big data. |
format | Online Article Text |
id | pubmed-9876749 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-9876749 2023-01-26 Interpretable machine learning analysis to identify risk factors for diabetes using the anonymous living census data of Japan Jiang, Pei Suzuki, Hiroyuki Obi, Takashi Health Technol (Berl) Original Paper Springer Berlin Heidelberg 2023-01-26 2023 /pmc/articles/PMC9876749/ /pubmed/36718178 http://dx.doi.org/10.1007/s12553-023-00730-w Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | Interpretable machine learning analysis to identify risk factors for diabetes using the anonymous living census data of Japan |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876749/ https://www.ncbi.nlm.nih.gov/pubmed/36718178 http://dx.doi.org/10.1007/s12553-023-00730-w |
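The description above notes that work years appeared as a risk factor because of its high coefficient with age. The following is a minimal sketch of that effect, not the authors' code or data: it uses only the Python standard library and fully synthetic values. All names (`age`, `work_years`, `has_diabetes`, `fit_logistic_1d`) and all numeric settings are hypothetical choices made for illustration. The outcome is simulated to depend on age alone, yet a one-variable logistic model on work years still yields an odds ratio above 1.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def standardize(xs):
    """Rescale a sequence to zero mean and unit variance."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return [(x - m) / s for x in xs]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_1d(xs, ys, lr=0.5, steps=300):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by batch gradient descent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# Synthetic cohort (assumption): work years track age closely.
rng = random.Random(0)
age = [rng.uniform(25, 75) for _ in range(1000)]
work_years = [max(0.0, a - 22 + rng.gauss(0, 3)) for a in age]

# Synthetic outcome (assumption): risk depends on age only.
has_diabetes = [
    1 if rng.random() < sigmoid(-2.0 + 0.06 * (a - 50)) else 0 for a in age
]

# Step 1: how strongly are the two factors correlated?
r = pearson(age, work_years)

# Step 2: logistic model using work years alone (standardized feature).
b0, b1 = fit_logistic_1d(standardize(work_years), has_diabetes)
odds_ratio = math.exp(b1)  # odds ratio per one standard deviation

print(f"correlation(age, work_years) = {r:.3f}")
print(f"odds ratio per SD of work_years = {odds_ratio:.2f}")
```

A check like this only shows that a correlated proxy can inherit an apparent effect; separating the two factors in real data would need a joint model or the factor-relationship analysis the description mentions.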