Evaluating machine learning-powered classification algorithms which utilize variants in the GCKR gene to predict metabolic syndrome: Tehran Cardio-metabolic Genetics Study
BACKGROUND: Metabolic syndrome (MetS) is a prevalent multifactorial disorder that can increase the risk of developing diabetes, cardiovascular diseases, and cancer. We aimed to compare different machine learning classification methods in predicting metabolic syndrome status as well as identifying in...
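Main authors: | Akbarzadeh, Mahdi; Alipour, Nadia; Moheimani, Hamed; Zahedi, Asieh Sadat; Hosseini-Esfahani, Firoozeh; Lanjanian, Hossein; Azizi, Fereidoun; Daneshpour, Maryam S. |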
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8994379/ https://www.ncbi.nlm.nih.gov/pubmed/35397593 http://dx.doi.org/10.1186/s12967-022-03349-z |
_version_ | 1784684095788285952 |
---|---|
author | Akbarzadeh, Mahdi; Alipour, Nadia; Moheimani, Hamed; Zahedi, Asieh Sadat; Hosseini-Esfahani, Firoozeh; Lanjanian, Hossein; Azizi, Fereidoun; Daneshpour, Maryam S. |
author_sort | Akbarzadeh, Mahdi |
collection | PubMed |
description | BACKGROUND: Metabolic syndrome (MetS) is a prevalent multifactorial disorder that can increase the risk of developing diabetes, cardiovascular diseases, and cancer. We aimed to compare different machine learning classification methods in predicting metabolic syndrome status and in identifying influential genetic or environmental risk factors. METHODS: This candidate gene study was conducted on 4756 eligible participants from the Tehran Cardio-metabolic Genetic Study (TCGS). We compared predictive models using logistic regression (LR), random forest (RF), decision tree (DT), support vector machines (SVM), and discriminant analyses. Demographic and clinical features, as well as variables for common GCKR gene polymorphisms, were included in the models. We used tenfold cross-validation repeated 10 times to evaluate model performance. RESULTS: 50.6% of participants had MetS. According to LR, MetS was significantly associated with age, gender, schooling years, BMI, physical activity, rs780094, and rs780093 (P < 0.05). RF showed the best overall performance (AUC-ROC = 0.804, AUC-PR = 0.776, and accuracy = 0.743) and indicated BMI, physical activity, and age to be the most influential model features. According to the DT, a person with BMI < 24 and physical activity < 8.8 has a 4% chance of MetS. In contrast, a person with BMI ≥ 25, physical activity < 2.7, and age ≥ 33 has a 77% probability of having MetS. CONCLUSION: Our findings indicated that, on average, machine learning models outperformed conventional statistical approaches for patient classification. These well-performing models may be used to develop future support systems that use a variety of data sources to identify persons at high risk of developing MetS. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12967-022-03349-z. |
format | Online Article Text |
id | pubmed-8994379 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8994379 2022-04-10 J Transl Med Research BioMed Central 2022-04-09 /pmc/articles/PMC8994379/ /pubmed/35397593 http://dx.doi.org/10.1186/s12967-022-03349-z Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Evaluating machine learning-powered classification algorithms which utilize variants in the GCKR gene to predict metabolic syndrome: Tehran Cardio-metabolic Genetics Study |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8994379/ https://www.ncbi.nlm.nih.gov/pubmed/35397593 http://dx.doi.org/10.1186/s12967-022-03349-z |
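
The abstract reports that models were scored with tenfold cross-validation repeated 10 times, using accuracy and AUC-ROC among the metrics. The sketch below illustrates that evaluation loop only; it is not the authors' code. Everything in it is an assumption made for illustration: the stratified splitting, the synthetic BMI-like feature produced by `make_toy_data`, and the deliberately trivial midpoint-threshold classifier that stands in for LR, RF, DT, or SVM.

```python
import random
from statistics import mean


def repeated_stratified_kfold(labels, k=10, repeats=10, seed=0):
    """Yield (train_idx, test_idx) for `repeats` reshuffled stratified k-fold splits.

    Stratification is an assumption of this sketch; the abstract does not say
    whether the folds were stratified.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        for cls in classes:
            members = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(members)
            # Deal each class round-robin so every fold keeps the class balance.
            for position, i in enumerate(members):
                folds[position % k].append(i)
        for held_out in range(k):
            test = folds[held_out]
            train = [i for f in range(k) if f != held_out for i in folds[f]]
            yield train, test


def auc_roc(labels, scores):
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one sample of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_midpoint(xs, ys):
    """Hypothetical stand-in model: threshold halfway between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(pos) + mean(neg)) / 2


def evaluate(xs, ys, k=10, repeats=10, seed=0):
    """Average accuracy and AUC-ROC over all k * repeats held-out folds."""
    accuracies, aucs = [], []
    for train, test in repeated_stratified_kfold(ys, k, repeats, seed):
        threshold = fit_midpoint([xs[i] for i in train], [ys[i] for i in train])
        y_test = [ys[i] for i in test]
        scores = [xs[i] - threshold for i in test]
        predictions = [int(s > 0) for s in scores]
        accuracies.append(mean([int(p == y) for p, y in zip(predictions, y_test)]))
        aucs.append(auc_roc(y_test, scores))
    return {"accuracy": mean(accuracies), "auc_roc": mean(aucs), "n_folds": len(aucs)}


def make_toy_data(n=200, seed=42):
    """Synthetic, balanced data with one BMI-like feature (not the TCGS data)."""
    rng = random.Random(seed)
    ys = [i % 2 for i in range(n)]
    xs = [rng.gauss(28.0 if y else 24.0, 3.0) for y in ys]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_toy_data()
    print(evaluate(xs, ys))
```

Averaging over all 100 held-out folds, rather than over a single tenfold pass, reduces the dependence of the reported metrics on one particular random partition of the participants.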