
Probability calibration-based prediction of recurrence rate in patients with diffuse large B-cell lymphoma

BACKGROUND: Although many patients receive good prognoses with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment. Statistical or computational intelligent models are powerful tools for assessing prognoses; however, many cannot generate accurate risk...


Bibliographic Details
Main Authors:	Fan, Shuanglong; Zhao, Zhiqiang; Zhang, Yanbo; Yu, Hongmei; Zheng, Chuchu; Huang, Xueqian; Yang, Zhenhuan; Xing, Meng; Lu, Qing; Luo, Yanhong
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2021
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8362168/ https://www.ncbi.nlm.nih.gov/pubmed/34389029 http://dx.doi.org/10.1186/s13040-021-00272-9

collection	PubMed
description	BACKGROUND: Although many patients receive good prognoses with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment. Statistical or computational intelligent models are powerful tools for assessing prognoses; however, many cannot generate accurate risk (probability) estimates. Thus, probability calibration-based versions of traditional machine learning algorithms are developed in this paper to predict the risk of relapse in patients with DLBCL. METHODS: Five machine learning algorithms were assessed, namely, naïve Bayes (NB), logistic regression (LR), random forest (RF), support vector machine (SVM) and feedforward neural network (FFNN), and three methods were used to develop probability calibration-based versions of each of the above algorithms, namely, Platt scaling (Platt), isotonic regression (IsoReg) and shape-restricted polynomial regression (RPR). Performance comparisons were based on the average results of the stratified hold-out test, which was repeated 500 times. We used the AUC to evaluate the discrimination ability (i.e., classification ability) of the model and assessed the model calibration (i.e., risk prediction accuracy) using the H-L goodness-of-fit test, ECE, MCE and BS. RESULTS: Sex, stage, IPI, KPS, GCB, CD10 and rituximab were significant factors predicting the 3-year recurrence rate of patients with DLBCL. For the 5 uncalibrated algorithms, the LR (ECE = 8.517, MCE = 20.100, BS = 0.188) and FFNN (ECE = 8.238, MCE = 20.150, BS = 0.184) models were well-calibrated. The errors of the initial risk estimate of the NB (ECE = 15.711, MCE = 34.350, BS = 0.212), RF (ECE = 12.740, MCE = 27.200, BS = 0.201) and SVM (ECE = 9.872, MCE = 23.800, BS = 0.194) models were large. With probability calibration, the biased NB, RF and SVM models were well-corrected. The calibration errors of the LR and FFNN models were not further improved regardless of the probability calibration method. 
Among the 3 calibration methods, RPR achieved the best calibration for both the RF and SVM models. The power of IsoReg was not obvious for the NB, RF or SVM models. CONCLUSIONS: Although these algorithms all have good classification ability, several cannot generate accurate risk estimates. Probability calibration is an effective method of improving the accuracy of these poorly calibrated algorithms. Our risk model of DLBCL demonstrates good discrimination and calibration ability and has the potential to help clinicians make optimal therapeutic decisions to achieve precision medicine.
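The calibration measures named in the description (ECE, MCE and BS) can be illustrated with a minimal sketch. The record does not state the binning scheme the authors used, so the 10 equal-width bins here are an assumption, and the function names `brier_score` and `calibration_errors` are hypothetical. The sketch returns errors as fractions in [0, 1]; the ECE and MCE values quoted above appear to be on a percentage scale, which is also an assumption.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Brier score (BS): mean squared gap between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def calibration_errors(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> Tuple[float, float]:
    """Return (ECE, MCE) over equal-width probability bins.

    ECE is the sample-weighted mean, and MCE the maximum, of the absolute gap
    between mean predicted risk and observed event rate within each bin.
    The bin count is an assumption, not taken from the paper.
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    mce = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_pred = sum(p for p, _ in members) / len(members)
        event_rate = sum(y for _, y in members) / len(members)
        gap = abs(mean_pred - event_rate)
        ece += (len(members) / n) * gap
        mce = max(mce, gap)
    return ece, mce


if __name__ == "__main__":
    # Toy data, not from the study: four patients with predicted relapse risks.
    risks = [0.1, 0.1, 0.9, 0.9]
    relapsed = [0, 0, 1, 1]
    ece, mce = calibration_errors(risks, relapsed)
    print(f"ECE={ece:.3f} MCE={mce:.3f} BS={brier_score(risks, relapsed):.3f}")
```

Lower values indicate better calibration on all three measures. A model can rank patients well (high AUC) and still score poorly here, which is the gap that Platt scaling, isotonic regression and RPR are meant to close.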
format	Online Article Text
id	pubmed-8362168
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-8362168 2021-08-17 BioData Min Research BioMed Central 2021-08-13 /pmc/articles/PMC8362168/ /pubmed/34389029 http://dx.doi.org/10.1186/s13040-021-00272-9 Text en © The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Probability calibration-based prediction of recurrence rate in patients with diffuse large B-cell lymphoma
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8362168/ https://www.ncbi.nlm.nih.gov/pubmed/34389029 http://dx.doi.org/10.1186/s13040-021-00272-9

Similar Items