Probability calibration-based prediction of recurrence rate in patients with diffuse large B-cell lymphoma
BACKGROUND: Although many patients receive good prognoses with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment. Statistical or computational intelligent models are powerful tools for assessing prognoses; however, many cannot generate accurate risk...
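Main Authors: | Fan, Shuanglong; Zhao, Zhiqiang; Zhang, Yanbo; Yu, Hongmei; Zheng, Chuchu; Huang, Xueqian; Yang, Zhenhuan; Xing, Meng; Lu, Qing; Luo, Yanhong |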
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8362168/ https://www.ncbi.nlm.nih.gov/pubmed/34389029 http://dx.doi.org/10.1186/s13040-021-00272-9 |
_version_ | 1783738104916475904 |
---|---|
author | Fan, Shuanglong Zhao, Zhiqiang Zhang, Yanbo Yu, Hongmei Zheng, Chuchu Huang, Xueqian Yang, Zhenhuan Xing, Meng Lu, Qing Luo, Yanhong |
author_sort | Fan, Shuanglong |
collection | PubMed |
description | BACKGROUND: Although many patients receive good prognoses with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment. Statistical or computational intelligent models are powerful tools for assessing prognoses; however, many cannot generate accurate risk (probability) estimates. Thus, probability calibration-based versions of traditional machine learning algorithms are developed in this paper to predict the risk of relapse in patients with DLBCL. METHODS: Five machine learning algorithms were assessed, namely, naïve Bayes (NB), logistic regression (LR), random forest (RF), support vector machine (SVM) and feedforward neural network (FFNN), and three methods were used to develop probability calibration-based versions of each of the above algorithms, namely, Platt scaling (Platt), isotonic regression (IsoReg) and shape-restricted polynomial regression (RPR). Performance comparisons were based on the average results of the stratified hold-out test, which was repeated 500 times. We used the AUC to evaluate the discrimination ability (i.e., classification ability) of the model and assessed the model calibration (i.e., risk prediction accuracy) using the H-L goodness-of-fit test, ECE, MCE and BS. RESULTS: Sex, stage, IPI, KPS, GCB, CD10 and rituximab were significant factors predicting the 3-year recurrence rate of patients with DLBCL. For the 5 uncalibrated algorithms, the LR (ECE = 8.517, MCE = 20.100, BS = 0.188) and FFNN (ECE = 8.238, MCE = 20.150, BS = 0.184) models were well-calibrated. The errors of the initial risk estimate of the NB (ECE = 15.711, MCE = 34.350, BS = 0.212), RF (ECE = 12.740, MCE = 27.200, BS = 0.201) and SVM (ECE = 9.872, MCE = 23.800, BS = 0.194) models were large. With probability calibration, the biased NB, RF and SVM models were well-corrected. The calibration errors of the LR and FFNN models were not further improved regardless of the probability calibration method. Among the 3 calibration methods, RPR achieved the best calibration for both the RF and SVM models. The power of IsoReg was not obvious for the NB, RF or SVM models. CONCLUSIONS: Although these algorithms all have good classification ability, several cannot generate accurate risk estimates. Probability calibration is an effective method of improving the accuracy of these poorly calibrated algorithms. Our risk model of DLBCL demonstrates good discrimination and calibration ability and has the potential to help clinicians make optimal therapeutic decisions to achieve precision medicine. |
format | Online Article Text |
id | pubmed-8362168 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8362168 2021-08-17 BioData Min Research BioMed Central 2021-08-13 /pmc/articles/PMC8362168/ /pubmed/34389029 http://dx.doi.org/10.1186/s13040-021-00272-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Probability calibration-based prediction of recurrence rate in patients with diffuse large B-cell lymphoma |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8362168/ https://www.ncbi.nlm.nih.gov/pubmed/34389029 http://dx.doi.org/10.1186/s13040-021-00272-9 |
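
The calibration measures named in the description (ECE, MCE and BS) can be illustrated with a minimal sketch. It is not the authors' code: the equal-width binning, the default of 10 bins, and the function names `brier_score` and `calibration_errors` are assumptions made for illustration. The sketch returns fractions between 0 and 1, whereas the ECE and MCE values quoted in the description appear to be on a percentage scale, which is also an assumption.

```python
from typing import List, Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Brier score (BS): mean squared gap between predicted risk and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_errors(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> Tuple[float, float]:
    """Return (ECE, MCE) computed over equal-width probability bins.

    Equal-width binning is an assumption of this sketch; the record does
    not say how the paper formed its bins.
    """
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))

    n = len(y_true)
    ece = 0.0
    mce = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        observed = sum(y for y, _ in members) / len(members)   # event rate
        predicted = sum(p for _, p in members) / len(members)  # mean risk
        gap = abs(observed - predicted)
        ece += (len(members) / n) * gap  # weighted average of the gaps
        mce = max(mce, gap)              # worst gap over all bins
    return ece, mce


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    outcomes = [0, 0, 1, 1]
    risks = [0.1, 0.3, 0.7, 0.9]
    print(brier_score(outcomes, risks))
    print(calibration_errors(outcomes, risks, n_bins=2))
```

A well-calibrated model yields small gaps in every bin, so ECE and MCE are both low; a model with good discrimination but distorted risk estimates can still show large gaps, which is the situation the description reports for the uncalibrated NB, RF and SVM models.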