IHCP: interpretable hepatitis C prediction system based on black-box machine learning models
BACKGROUND: Hepatitis C is a prevalent disease that poses a high risk to the human liver. Early diagnosis of hepatitis C is crucial for treatment and prognosis. Therefore, developing an effective medical decision system is essential. In recent years, many computational methods have been proposed to...
Main Authors: | Fan, Yongxian; Lu, Xiqian; Sun, Guicong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10481489/ https://www.ncbi.nlm.nih.gov/pubmed/37674125 http://dx.doi.org/10.1186/s12859-023-05456-0 |
_version_ | 1785101985664466944 |
---|---|
author | Fan, Yongxian Lu, Xiqian Sun, Guicong |
author_facet | Fan, Yongxian Lu, Xiqian Sun, Guicong |
author_sort | Fan, Yongxian |
collection | PubMed |
description | BACKGROUND: Hepatitis C is a prevalent disease that poses a high risk to the human liver. Early diagnosis of hepatitis C is crucial for treatment and prognosis. Therefore, developing an effective medical decision system is essential. In recent years, many computational methods have been proposed to identify hepatitis C patients. Although existing hepatitis prediction models have achieved good results in terms of accuracy, most of them are black-box models and cannot gain the trust of doctors and patients in clinical practice. As a result, this study aims to use various Machine Learning (ML) models to predict whether a patient has hepatitis C, while also using explainable models to elucidate the prediction process of the ML models, thus making the prediction process more transparent. RESULT: We conducted a study on the prediction of hepatitis C based on serological testing and provided comprehensive explanations for the prediction process. Throughout the experiment, we modeled the benchmark dataset, and evaluated model performance using fivefold cross-validation and independent testing experiments. After evaluating three types of black-box machine learning models, Random Forest (RF), Support Vector Machine (SVM), and AdaBoost, we adopted Bayesian-optimized RF as the classification algorithm. In terms of model interpretation, in addition to using common SHapley Additive exPlanations (SHAP) to provide global explanations for the model, we also utilized the Local Interpretable Model-Agnostic Explanations with stability (LIME_stability) to provide local explanations for the model. CONCLUSION: Both the fivefold cross-validation and independent testing show that our proposed method significantly outperforms the state-of-the-art method. IHCP maintains excellent model interpretability while obtaining excellent predictive performance. This helps uncover potential predictive patterns of the model and enables clinicians to better understand the model's decision-making process. |
format | Online Article Text |
id | pubmed-10481489 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10481489 2023-09-07 IHCP: interpretable hepatitis C prediction system based on black-box machine learning models Fan, Yongxian Lu, Xiqian Sun, Guicong BMC Bioinformatics Research BACKGROUND: Hepatitis C is a prevalent disease that poses a high risk to the human liver. Early diagnosis of hepatitis C is crucial for treatment and prognosis. Therefore, developing an effective medical decision system is essential. In recent years, many computational methods have been proposed to identify hepatitis C patients. Although existing hepatitis prediction models have achieved good results in terms of accuracy, most of them are black-box models and cannot gain the trust of doctors and patients in clinical practice. As a result, this study aims to use various Machine Learning (ML) models to predict whether a patient has hepatitis C, while also using explainable models to elucidate the prediction process of the ML models, thus making the prediction process more transparent. RESULT: We conducted a study on the prediction of hepatitis C based on serological testing and provided comprehensive explanations for the prediction process. Throughout the experiment, we modeled the benchmark dataset, and evaluated model performance using fivefold cross-validation and independent testing experiments. After evaluating three types of black-box machine learning models, Random Forest (RF), Support Vector Machine (SVM), and AdaBoost, we adopted Bayesian-optimized RF as the classification algorithm. In terms of model interpretation, in addition to using common SHapley Additive exPlanations (SHAP) to provide global explanations for the model, we also utilized the Local Interpretable Model-Agnostic Explanations with stability (LIME_stability) to provide local explanations for the model. CONCLUSION: Both the fivefold cross-validation and independent testing show that our proposed method significantly outperforms the state-of-the-art method. IHCP maintains excellent model interpretability while obtaining excellent predictive performance. This helps uncover potential predictive patterns of the model and enables clinicians to better understand the model's decision-making process. BioMed Central 2023-09-06 /pmc/articles/PMC10481489/ /pubmed/37674125 http://dx.doi.org/10.1186/s12859-023-05456-0 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Fan, Yongxian Lu, Xiqian Sun, Guicong IHCP: interpretable hepatitis C prediction system based on black-box machine learning models |
title | IHCP: interpretable hepatitis C prediction system based on black-box machine learning models |
title_full | IHCP: interpretable hepatitis C prediction system based on black-box machine learning models |
title_fullStr | IHCP: interpretable hepatitis C prediction system based on black-box machine learning models |
title_full_unstemmed | IHCP: interpretable hepatitis C prediction system based on black-box machine learning models |
title_short | IHCP: interpretable hepatitis C prediction system based on black-box machine learning models |
title_sort | ihcp: interpretable hepatitis c prediction system based on black-box machine learning models |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10481489/ https://www.ncbi.nlm.nih.gov/pubmed/37674125 http://dx.doi.org/10.1186/s12859-023-05456-0 |
work_keys_str_mv | AT fanyongxian ihcpinterpretablehepatitiscpredictionsystembasedonblackboxmachinelearningmodels AT luxiqian ihcpinterpretablehepatitiscpredictionsystembasedonblackboxmachinelearningmodels AT sunguicong ihcpinterpretablehepatitiscpredictionsystembasedonblackboxmachinelearningmodels |