Machine Learning Approaches for Predicting Hypertension and Its Associated Factors Using Population-Level Data From Three South Asian Countries

BACKGROUND: Hypertension is the most common modifiable risk factor for cardiovascular diseases in South Asia. Machine learning (ML) models have been shown to outperform statistical methods in clinical risk prediction, but studies using ML to predict hypertension at the population level are...

Bibliographic Details
Main Authors: Islam, Sheikh Mohammed Shariful, Talukder, Ashis, Awal, Md. Abdul, Siddiqui, Md. Muhammad Umer, Ahamad, Md. Martuza, Ahammed, Benojir, Rawal, Lal B., Alizadehsani, Roohallah, Abawajy, Jemal, Laranjo, Liliana, Chow, Clara K., Maddison, Ralph
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9008259/
https://www.ncbi.nlm.nih.gov/pubmed/35433854
http://dx.doi.org/10.3389/fcvm.2022.839379
author Islam, Sheikh Mohammed Shariful
Talukder, Ashis
Awal, Md. Abdul
Siddiqui, Md. Muhammad Umer
Ahamad, Md. Martuza
Ahammed, Benojir
Rawal, Lal B.
Alizadehsani, Roohallah
Abawajy, Jemal
Laranjo, Liliana
Chow, Clara K.
Maddison, Ralph
collection PubMed
description BACKGROUND: Hypertension is the most common modifiable risk factor for cardiovascular diseases in South Asia. Machine learning (ML) models have been shown to outperform statistical methods in clinical risk prediction, but studies using ML to predict hypertension at the population level are lacking. This study applied ML approaches to a dataset from three South Asian countries to predict hypertension and its associated factors, and compared the models' performance. METHODS: We conducted a retrospective study using ML analyses to detect hypertension in population-based surveys. We created a single dataset by harmonizing individual-level data from the most recent nationally representative Demographic and Health Surveys in Bangladesh, Nepal, and India. The variables included blood pressure (BP), sociodemographic and economic factors, height, weight, hemoglobin, and random blood glucose. Hypertension was defined based on JNC-7 criteria. We applied six common ML-based classifiers to predict hypertension and its risk factors: decision tree (DT), random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), logistic regression (LR), and linear discriminant analysis (LDA). RESULTS: Of the 818,603 participants, 82,748 (10.11%) had hypertension. ML models showed that age and BMI were significant factors for hypertension. Ever having BP measured, education, taking medicine to lower BP, and doctor's perception of high BP were also significant, but less so than age and BMI. XGBoost, GBM, LR, and LDA showed the highest accuracy in predicting hypertension (90%), while RF and DT achieved 89% and 83%, respectively. DT achieved a precision of 91%, and the rest achieved 90%. XGBoost, GBM, LR, and LDA achieved a recall of 100%, RF 99%, and DT 90%. For F1-score, XGBoost, GBM, LR, and LDA scored 95%, RF 94%, and DT 90%. All the algorithms had small log loss values (<6%). CONCLUSION: ML models performed well in predicting hypertension and its associated factors in South Asians. When deployed on an open-source platform, these models are scalable to millions of people and might help individuals self-screen for hypertension at an early stage. Future studies incorporating biochemical markers are needed to improve the ML algorithms and evaluate them in real life.
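The definitions and figures in the description can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: it assumes the standard JNC-7 threshold for hypertension (systolic BP of at least 140 mmHg or diastolic BP of at least 90 mmHg) and the usual harmonic-mean definition of F1. The function names `is_hypertensive` and `f1_score` are hypothetical, and the precision and recall inputs are the rounded percentages quoted in the description, so the recomputed F1 values can differ slightly from the published ones.

```python
def is_hypertensive(sbp_mmhg: float, dbp_mmhg: float) -> bool:
    """Assumed JNC-7 cut-off: SBP >= 140 mmHg or DBP >= 90 mmHg."""
    return sbp_mmhg >= 140 or dbp_mmhg >= 90


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Prevalence from the counts quoted in the description.
prevalence = 82_748 / 818_603

# (precision, recall) pairs as quoted in the description, already rounded.
reported = {
    "XGBoost/GBM/LR/LDA": (0.90, 1.00),
    "RF": (0.90, 0.99),
    "DT": (0.91, 0.90),
}
f1_by_model = {name: f1_score(p, r) for name, (p, r) in reported.items()}

print(f"prevalence: {prevalence:.2%}")
for name, f1 in f1_by_model.items():
    print(f"{name}: F1 = {f1:.3f}")
```

Under these assumptions the prevalence reproduces the reported 10.11%, and the recomputed F1 values (about 0.947, 0.943, and 0.905) are in line with the reported 95%, 94%, and 90%.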
format Online
Article
Text
id pubmed-9008259
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9008259 2022-04-15 Front Cardiovasc Med Cardiovascular Medicine Frontiers Media S.A. 2022-03-31 /pmc/articles/PMC9008259/ /pubmed/35433854 http://dx.doi.org/10.3389/fcvm.2022.839379 Text en Copyright © 2022 Islam, Talukder, Awal, Siddiqui, Ahamad, Ahammed, Rawal, Alizadehsani, Abawajy, Laranjo, Chow and Maddison. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Machine Learning Approaches for Predicting Hypertension and Its Associated Factors Using Population-Level Data From Three South Asian Countries
topic Cardiovascular Medicine