Probability calibration-based prediction of recurrence rate in patients with diffuse large B-cell lymphoma

Bibliographic Details
Main Authors: Fan, Shuanglong, Zhao, Zhiqiang, Zhang, Yanbo, Yu, Hongmei, Zheng, Chuchu, Huang, Xueqian, Yang, Zhenhuan, Xing, Meng, Lu, Qing, Luo, Yanhong
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8362168/
https://www.ncbi.nlm.nih.gov/pubmed/34389029
http://dx.doi.org/10.1186/s13040-021-00272-9
_version_ 1783738104916475904
author Fan, Shuanglong
Zhao, Zhiqiang
Zhang, Yanbo
Yu, Hongmei
Zheng, Chuchu
Huang, Xueqian
Yang, Zhenhuan
Xing, Meng
Lu, Qing
Luo, Yanhong
author_sort Fan, Shuanglong
collection PubMed
description BACKGROUND: Although many patients have a good prognosis with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment. Statistical and computational intelligence models are powerful tools for assessing prognosis; however, many cannot generate accurate risk (probability) estimates. Thus, probability calibration-based versions of traditional machine learning algorithms are developed in this paper to predict the risk of relapse in patients with DLBCL. METHODS: Five machine learning algorithms were assessed: naïve Bayes (NB), logistic regression (LR), random forest (RF), support vector machine (SVM) and feedforward neural network (FFNN). Three methods were used to develop probability calibration-based versions of each algorithm: Platt scaling (Platt), isotonic regression (IsoReg) and shape-restricted polynomial regression (RPR). Performance comparisons were based on the average results of a stratified hold-out test repeated 500 times. We used the area under the receiver operating characteristic curve (AUC) to evaluate the discrimination (i.e., classification) ability of each model and assessed calibration (i.e., risk prediction accuracy) using the Hosmer-Lemeshow (H-L) goodness-of-fit test, expected calibration error (ECE), maximum calibration error (MCE) and Brier score (BS). RESULTS: Sex, stage, International Prognostic Index (IPI), Karnofsky Performance Status (KPS), GCB, CD10 and rituximab were significant factors predicting the 3-year recurrence rate of patients with DLBCL. Among the 5 uncalibrated algorithms, the LR (ECE = 8.517, MCE = 20.100, BS = 0.188) and FFNN (ECE = 8.238, MCE = 20.150, BS = 0.184) models were well calibrated. The initial risk estimates of the NB (ECE = 15.711, MCE = 34.350, BS = 0.212), RF (ECE = 12.740, MCE = 27.200, BS = 0.201) and SVM (ECE = 9.872, MCE = 23.800, BS = 0.194) models had large errors. With probability calibration, the biased NB, RF and SVM models were well corrected. The calibration errors of the LR and FFNN models were not further improved by any of the probability calibration methods. Among the 3 calibration methods, RPR achieved the best calibration for both the RF and SVM models. The benefit of IsoReg was not evident for the NB, RF or SVM models. CONCLUSIONS: Although these algorithms all have good classification ability, several cannot generate accurate risk estimates. Probability calibration is an effective method of improving the accuracy of these poorly calibrated algorithms. Our risk model of DLBCL demonstrates good discrimination and calibration ability and has the potential to help clinicians make optimal therapeutic decisions to achieve precision medicine.
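The calibration measures named in the abstract (ECE, MCE and BS) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `brier_score` and `calibration_errors`, the choice of 10 equal-width probability bins, and the return of errors as fractions in [0, 1] are all assumptions made here for illustration. The record does not state the binning scheme or the scale of the reported values.

```python
from typing import List, Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_errors(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> Tuple[float, float]:
    """Return (ECE, MCE) as fractions, using equal-width bins (an assumption)."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))

    n = len(y_true)
    ece, mce = 0.0, 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        observed = sum(y for y, _ in members) / len(members)   # event rate
        predicted = sum(p for _, p in members) / len(members)  # mean risk
        gap = abs(observed - predicted)
        ece += gap * len(members) / n  # gap weighted by bin size
        mce = max(mce, gap)            # worst bin
    return ece, mce


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    outcomes = [0, 0, 1, 1]
    risks = [0.1, 0.3, 0.7, 0.9]
    print(brier_score(outcomes, risks))
    print(calibration_errors(outcomes, risks))
```

ECE averages the gap between predicted and observed risk across bins, weighted by bin size, while MCE reports the worst bin. A model can therefore have a modest ECE and still have a large MCE if one risk stratum is badly miscalibrated.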
format Online
Article
Text
id pubmed-8362168
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8362168 2021-08-17 Probability calibration-based prediction of recurrence rate in patients with diffuse large B-cell lymphoma Fan, Shuanglong Zhao, Zhiqiang Zhang, Yanbo Yu, Hongmei Zheng, Chuchu Huang, Xueqian Yang, Zhenhuan Xing, Meng Lu, Qing Luo, Yanhong BioData Min Research BioMed Central 2021-08-13 /pmc/articles/PMC8362168/ /pubmed/34389029 http://dx.doi.org/10.1186/s13040-021-00272-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Probability calibration-based prediction of recurrence rate in patients with diffuse large B-cell lymphoma
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8362168/
https://www.ncbi.nlm.nih.gov/pubmed/34389029
http://dx.doi.org/10.1186/s13040-021-00272-9