Evaluating machine learning-powered classification algorithms which utilize variants in the GCKR gene to predict metabolic syndrome: Tehran Cardio-metabolic Genetics Study

BACKGROUND: Metabolic syndrome (MetS) is a prevalent multifactorial disorder that can increase the risk of developing diabetes, cardiovascular diseases, and cancer. We aimed to compare different machine learning classification methods in predicting metabolic syndrome status as well as identifying in...

Bibliographic Details
Main authors: Akbarzadeh, Mahdi, Alipour, Nadia, Moheimani, Hamed, Zahedi, Asieh Sadat, Hosseini-Esfahani, Firoozeh, Lanjanian, Hossein, Azizi, Fereidoun, Daneshpour, Maryam S.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8994379/
https://www.ncbi.nlm.nih.gov/pubmed/35397593
http://dx.doi.org/10.1186/s12967-022-03349-z
author Akbarzadeh, Mahdi
Alipour, Nadia
Moheimani, Hamed
Zahedi, Asieh Sadat
Hosseini-Esfahani, Firoozeh
Lanjanian, Hossein
Azizi, Fereidoun
Daneshpour, Maryam S.
author_sort Akbarzadeh, Mahdi
collection PubMed
description BACKGROUND: Metabolic syndrome (MetS) is a prevalent multifactorial disorder that can increase the risk of developing diabetes, cardiovascular diseases, and cancer. We aimed to compare different machine learning classification methods for predicting metabolic syndrome status and for identifying influential genetic or environmental risk factors.
METHODS: This candidate gene study was conducted on 4756 eligible participants from the Tehran Cardio-metabolic Genetic Study (TCGS). We compared predictive models using logistic regression (LR), random forest (RF), decision tree (DT), support vector machines (SVM), and discriminant analyses. Demographic and clinical features, as well as variables describing common GCKR gene polymorphisms, were included in the models. We evaluated model performance with tenfold cross-validation repeated 10 times.
RESULTS: MetS was present in 50.6% of participants. According to LR, MetS was significantly associated with age, gender, schooling years, BMI, physical activity, rs780094, and rs780093 (P < 0.05). RF showed the best overall performance (AUC-ROC = 0.804, AUC-PR = 0.776, and accuracy = 0.743) and identified BMI, physical activity, and age as the most influential model features. According to the DT, a person with BMI < 24 and physical activity < 8.8 has a 4% chance of MetS. In contrast, a person with BMI ≥ 25, physical activity < 2.7, and age ≥ 33 has a 77% probability of MetS.
CONCLUSION: Our findings indicated that, on average, machine learning models outperformed conventional statistical approaches for patient classification. These well-performing models may be used to develop future support systems that use a variety of data sources to identify persons at high risk of developing MetS.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12967-022-03349-z.
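The evaluation scheme named in the abstract (tenfold cross-validation repeated 10 times, scored by AUC-ROC) can be illustrated with a minimal sketch. This is not the authors' code: the function names `repeated_kfold`, `auc_roc`, and `demo`, the synthetic one-feature data, and the trivial "direction of class means" model are all assumptions made for illustration, and the splits here are not stratified, which the study may or may not have used.

```python
import random
from statistics import mean


def repeated_kfold(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (train, test) index lists for n_repeats reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repeat
        folds = [order[k::n_splits] for k in range(n_splits)]
        for k, test in enumerate(folds):
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def demo():
    # Synthetic, made-up data: one feature that is higher on average in class 1.
    rng = random.Random(42)
    y = [i % 2 for i in range(200)]
    x = [rng.gauss(1.0 if label else 0.0, 1.0) for label in y]
    aucs = []
    for train, test in repeated_kfold(len(y)):
        test_labels = [y[i] for i in test]
        if len(set(test_labels)) < 2:
            continue  # AUC is undefined for a single-class test fold
        # "Fit" on the training fold only: learn which direction separates classes.
        m1 = mean(x[i] for i in train if y[i] == 1)
        m0 = mean(x[i] for i in train if y[i] == 0)
        direction = 1.0 if m1 >= m0 else -1.0
        aucs.append(auc_roc(test_labels, [direction * x[i] for i in test]))
    return mean(aucs)  # average over all evaluated folds


if __name__ == "__main__":
    print(f"mean AUC-ROC over folds: {demo():.3f}")
```

Averaging over 10 reshuffled repeats reduces the dependence of the reported score on any single random partition; in practice a library routine with stratified folds would likely be preferred over this hand-rolled splitter.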
format Online
Article
Text
id pubmed-8994379
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8994379 2022-04-10 J Transl Med Research BioMed Central 2022-04-09 /pmc/articles/PMC8994379/ /pubmed/35397593 http://dx.doi.org/10.1186/s12967-022-03349-z Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Evaluating machine learning-powered classification algorithms which utilize variants in the GCKR gene to predict metabolic syndrome: Tehran Cardio-metabolic Genetics Study
topic Research