
Exploration of Black Boxes of Supervised Machine Learning Models: A Demonstration on Development of Predictive Heart Risk Score

Machine learning (ML) often provides applicable high-performance models to facilitate decision-makers in various fields. However, this high performance is achieved at the expense of the interpretability of these models, which has been criticized by practitioners and has become a significant hindranc...


Bibliographic Details
Main authors: Sajid, Mirza Rizwan, Khan, Arshad Ali, Albar, Haitham M., Muhammad, Noryanti, Sami, Waqas, Bukhari, Syed Ahmad Chan, Wajahat, Iram
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9119773/
https://www.ncbi.nlm.nih.gov/pubmed/35602638
http://dx.doi.org/10.1155/2022/5475313
author Sajid, Mirza Rizwan
Khan, Arshad Ali
Albar, Haitham M.
Muhammad, Noryanti
Sami, Waqas
Bukhari, Syed Ahmad Chan
Wajahat, Iram
collection PubMed
description Machine learning (ML) often provides applicable high-performance models to facilitate decision-makers in various fields. However, this high performance is achieved at the expense of the interpretability of these models, which has been criticized by practitioners and has become a significant hindrance in their application. Therefore, in highly sensitive decisions, black boxes of ML models are not recommended. We proposed a novel methodology that uses complex supervised ML models and transforms them into simple, interpretable, transparent statistical models. This methodology is like stacking ensemble ML in which the best ML models are used as a base learner to compute relative feature weights. The index of these weights is further used as a single covariate in the simple logistic regression model to estimate the likelihood of an event. We tested this methodology on the primary dataset related to cardiovascular diseases (CVDs), the leading cause of mortalities in recent times. Therefore, early risk assessment is an important dimension that can potentially reduce the burden of CVDs and their related mortality through accurate but interpretable risk prediction models. We developed an artificial neural network and support vector machines based on ML models and transformed them into a simple statistical model and heart risk scores. These simplified models were found transparent, reliable, valid, interpretable, and approximate in predictions. The findings of this study suggest that complex supervised ML models can be efficiently transformed into simple statistical models that can also be validated.
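The weighting-and-index step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature weights, the binary risk-factor data, and the function names (`risk_index`, `fit_simple_logistic`, `predict_risk`) are hypothetical, and the weights are hard-coded here, whereas the abstract says they are computed from the best ML models (an artificial neural network and support vector machines). The sketch only shows how a weighted index can serve as the single covariate of a simple logistic regression.

```python
import math


def risk_index(features, weights):
    """Collapse one subject's features into a single weighted index."""
    return sum(w * x for w, x in zip(weights, features))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_simple_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(b0 + b1 * x) by batch gradient ascent
    on the mean log-likelihood. One covariate only."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def predict_risk(b0, b1, index):
    """Estimated probability of the event for a given index value."""
    return sigmoid(b0 + b1 * index)


# ASSUMPTION: illustrative relative feature weights, not values from the paper.
weights = [0.5, 0.3, 0.2]

# ASSUMPTION: made-up binary risk factors (rows) and event labels.
subjects = [
    [0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
    [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1],
]
labels = [0, 0, 0, 1, 1, 1, 0, 1]

# Step 1: one index value per subject.
indices = [risk_index(s, weights) for s in subjects]

# Step 2: simple logistic regression with the index as its only covariate.
b0, b1 = fit_simple_logistic(indices, labels)

low = predict_risk(b0, b1, 0.0)
high = predict_risk(b0, b1, 1.0)
print(f"intercept={b0:.3f} slope={b1:.3f} risk(0.0)={low:.3f} risk(1.0)={high:.3f}")
```

Because the fitted model has only an intercept and one slope, the estimated risk for a new subject follows from the index value alone, which is what keeps the simplified model easy to inspect.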
format Online
Article
Text
id pubmed-9119773
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9119773 2022-05-20 Exploration of Black Boxes of Supervised Machine Learning Models: A Demonstration on Development of Predictive Heart Risk Score Sajid, Mirza Rizwan Khan, Arshad Ali Albar, Haitham M. Muhammad, Noryanti Sami, Waqas Bukhari, Syed Ahmad Chan Wajahat, Iram Comput Intell Neurosci Research Article
Hindawi 2022-05-12 /pmc/articles/PMC9119773/ /pubmed/35602638 http://dx.doi.org/10.1155/2022/5475313 Text en Copyright © 2022 Mirza Rizwan Sajid et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Exploration of Black Boxes of Supervised Machine Learning Models: A Demonstration on Development of Predictive Heart Risk Score
topic Research Article