Prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: A machine learning analysis of population-based 10-year prospective cohort study

BACKGROUND: Previous work on predicting type 2 diabetes by integrating clinical and genetic factors has mostly focused on the Western population. In this study, we use genome-wide polygenic risk score (gPRS) and serum metabolite data for type 2 diabetes risk prediction in the Asian population. METHO...

Bibliographic Details
Main Authors:	Hahn, Seok-Ju; Kim, Suhyeon; Choi, Young Sik; Lee, Junghye; Kang, Jihun
Format:	Online Article Text
Language:	English
Published:	Elsevier 2022
Subjects:	Articles
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9713286/ https://www.ncbi.nlm.nih.gov/pubmed/36462406 http://dx.doi.org/10.1016/j.ebiom.2022.104383

_version_	1784841986143944704
author	Hahn, Seok-Ju; Kim, Suhyeon; Choi, Young Sik; Lee, Junghye; Kang, Jihun
author_facet	Hahn, Seok-Ju; Kim, Suhyeon; Choi, Young Sik; Lee, Junghye; Kang, Jihun
author_sort	Hahn, Seok-Ju
collection	PubMed
description	BACKGROUND: Previous work on predicting type 2 diabetes by integrating clinical and genetic factors has mostly focused on the Western population. In this study, we use genome-wide polygenic risk score (gPRS) and serum metabolite data for type 2 diabetes risk prediction in the Asian population. METHODS: Data of 1425 participants from the Korean Genome and Epidemiology Study (KoGES) Ansan-Ansung cohort were used in this study. For gPRS analysis, genotypic and clinical information from KoGES health examinee (n = 58,701) and KoGES cardiovascular disease association (n = 8105) sub-cohorts were included. Linkage disequilibrium analysis identified 239,062 genetic variants that were used to determine the gPRS, while the metabolites were selected using the Boruta algorithm. We used bootstrapped cross-validation to evaluate logistic regression and random forest (RF)-based machine learning models. Finally, associations of gPRS and selected metabolites with the values of homeostatic model assessment of beta-cell function (HOMA-B) and insulin resistance (HOMA-IR) were further estimated. FINDINGS: During the follow-up period (8.3 ± 2.8 years), 331 participants (23.2%) were diagnosed with type 2 diabetes. The areas under the curves of the RF-based models were 0.844, 0.876, and 0.883 for the model using only demographic and clinical factors, model including the gPRS, and model with both gPRS and metabolites, respectively. Incorporation of additional parameters in the latter two models improved the classification by 11.7% and 4.2% respectively. While gPRS was significantly associated with HOMA-B value, most metabolites had a significant association with HOMA-IR value. INTERPRETATION: Incorporating both gPRS and metabolite data led to enhanced type 2 diabetes risk prediction by capturing distinct etiologies of type 2 diabetes development. An RF-based model using clinical factors, gPRS, and metabolites predicted type 2 diabetes risk more accurately than the logistic regression-based model. FUNDING: This work was supported by the National Research Foundation of Korea (NRF, funder ID 10.13039/501100003725) grant funded by the Korean government (MEST) (No. 2019M3E5D1A02070863 and 2022R1C1C1005458). This work was also supported by the 2020 Research Fund (1.200098.01) of UNIST (Ulsan National Institute of Science & Technology).
format	Online Article Text
id	pubmed-9713286
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-9713286 2022-12-02 Prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: A machine learning analysis of population-based 10-year prospective cohort study Hahn, Seok-Ju; Kim, Suhyeon; Choi, Young Sik; Lee, Junghye; Kang, Jihun eBioMedicine Articles Elsevier 2022-11-30 /pmc/articles/PMC9713286/ /pubmed/36462406 http://dx.doi.org/10.1016/j.ebiom.2022.104383 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle	Articles Hahn, Seok-Ju Kim, Suhyeon Choi, Young Sik Lee, Junghye Kang, Jihun Prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: A machine learning analysis of population-based 10-year prospective cohort study
title	Prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: A machine learning analysis of population-based 10-year prospective cohort study
title_full	Prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: A machine learning analysis of population-based 10-year prospective cohort study
title_fullStr	Prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: A machine learning analysis of population-based 10-year prospective cohort study
title_full_unstemmed	Prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: A machine learning analysis of population-based 10-year prospective cohort study
title_short	Prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: A machine learning analysis of population-based 10-year prospective cohort study
title_sort	prediction of type 2 diabetes using genome-wide polygenic risk score and metabolic profiles: a machine learning analysis of population-based 10-year prospective cohort study
topic	Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9713286/ https://www.ncbi.nlm.nih.gov/pubmed/36462406 http://dx.doi.org/10.1016/j.ebiom.2022.104383
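
The description above mentions a polygenic risk score built from many genetic variants and models compared by area under the curve (AUC). As a rough illustration only, the sketch below assumes the common textbook form of a polygenic score (a weighted sum of risk-allele dosages) and a plain bootstrap of the AUC. It is not the study's pipeline: the record does not give the authors' scoring formula, and this simple bootstrap is a stand-in for, not a reproduction of, their bootstrapped cross-validation. All function names, weights, and data here are hypothetical.

```python
import random


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 per variant).

    Assumed textbook definition; the weights are hypothetical effect sizes.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(scores, labels):
    """AUC as the chance that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


def bootstrap_auc(scores, labels, n_resamples=200, seed=0):
    """Mean AUC over bootstrap resamples drawn with replacement.

    Resamples that contain only cases or only controls are redrawn,
    because the AUC is undefined for them.
    """
    rng = random.Random(seed)
    indices = range(len(scores))
    estimates = []
    while len(estimates) < n_resamples:
        sample = [rng.choice(indices) for _ in indices]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < len(ys):
            estimates.append(auc([scores[i] for i in sample], ys))
    return sum(estimates) / len(estimates)


# Toy data: three variants, four people (hypothetical values).
toy_weights = [0.2, -0.1, 0.5]
toy_dosages = [[2, 0, 1], [1, 1, 1], [0, 2, 1], [0, 1, 0]]
toy_labels = [1, 0, 1, 0]
toy_scores = [polygenic_risk_score(d, toy_weights) for d in toy_dosages]
toy_auc = auc(toy_scores, toy_labels)
toy_boot = bootstrap_auc(toy_scores, toy_labels, n_resamples=50)
```

The pairwise AUC is quadratic in sample size, which is fine for a toy example; a rank-based formula or an established statistics library would be the usual choice for a cohort of this size.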