
IHCP: interpretable hepatitis C prediction system based on black-box machine learning models

BACKGROUND: Hepatitis C is a prevalent disease that poses a high risk to the human liver. Early diagnosis of hepatitis C is crucial for treatment and prognosis. Therefore, developing an effective medical decision system is essential. In recent years, many computational methods have been proposed to...


Bibliographic Details
Main authors: Fan, Yongxian; Lu, Xiqian; Sun, Guicong
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10481489/
https://www.ncbi.nlm.nih.gov/pubmed/37674125
http://dx.doi.org/10.1186/s12859-023-05456-0
author Fan, Yongxian
Lu, Xiqian
Sun, Guicong
collection PubMed
description BACKGROUND: Hepatitis C is a prevalent disease that poses a high risk to the human liver. Early diagnosis of hepatitis C is crucial for treatment and prognosis. Therefore, developing an effective medical decision system is essential. In recent years, many computational methods have been proposed to identify hepatitis C patients. Although existing hepatitis prediction models have achieved good results in terms of accuracy, most of them are black-box models and cannot gain the trust of doctors and patients in clinical practice. As a result, this study aims to use various Machine Learning (ML) models to predict whether a patient has hepatitis C, while also using explainable models to elucidate the prediction process of the ML models, thus making the prediction process more transparent. RESULT: We conducted a study on the prediction of hepatitis C based on serological testing and provided comprehensive explanations for the prediction process. Throughout the experiment, we modeled the benchmark dataset and evaluated model performance using fivefold cross-validation and independent testing experiments. After evaluating three types of black-box machine learning models, Random Forest (RF), Support Vector Machine (SVM), and AdaBoost, we adopted Bayesian-optimized RF as the classification algorithm. In terms of model interpretation, in addition to using the common SHapley Additive exPlanations (SHAP) to provide global explanations for the model, we also utilized Local Interpretable Model-Agnostic Explanations with stability (LIME_stability) to provide local explanations for the model. CONCLUSION: Both the fivefold cross-validation and independent testing show that our proposed method significantly outperforms the state-of-the-art method. IHCP maintains excellent model interpretability while obtaining excellent predictive performance. This helps uncover potential predictive patterns of the model and enables clinicians to better understand the model's decision-making process.
format Online
Article
Text
id pubmed-10481489
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10481489 2023-09-07
IHCP: interpretable hepatitis C prediction system based on black-box machine learning models
Fan, Yongxian
Lu, Xiqian
Sun, Guicong
BMC Bioinformatics
Research
BioMed Central 2023-09-06
/pmc/articles/PMC10481489/
/pubmed/37674125
http://dx.doi.org/10.1186/s12859-023-05456-0
Text en
© The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title IHCP: interpretable hepatitis C prediction system based on black-box machine learning models
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10481489/
https://www.ncbi.nlm.nih.gov/pubmed/37674125
http://dx.doi.org/10.1186/s12859-023-05456-0