Novel machine learning models outperform risk scores in predicting hepatocellular carcinoma in patients with chronic viral hepatitis

BACKGROUND & AIMS: Accurate hepatocellular carcinoma (HCC) risk prediction facilitates appropriate surveillance strategy and reduces cancer mortality. We aimed to derive and validate novel machine learning models to predict HCC in a territory-wide cohort of patients with chronic viral hepatitis...

Bibliographic Details
Main Authors: Wong, Grace Lai-Hung, Hui, Vicki Wing-Ki, Tan, Qingxiong, Xu, Jingwen, Lee, Hye Won, Yip, Terry Cheuk-Fung, Yang, Baoyao, Tse, Yee-Kit, Yin, Chong, Lyu, Fei, Lai, Jimmy Che-To, Lui, Grace Chung-Yan, Chan, Henry Lik-Yuen, Yuen, Pong-Chi, Wong, Vincent Wai-Sun
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8844233/
https://www.ncbi.nlm.nih.gov/pubmed/35198928
http://dx.doi.org/10.1016/j.jhepr.2022.100441
author Wong, Grace Lai-Hung
Hui, Vicki Wing-Ki
Tan, Qingxiong
Xu, Jingwen
Lee, Hye Won
Yip, Terry Cheuk-Fung
Yang, Baoyao
Tse, Yee-Kit
Yin, Chong
Lyu, Fei
Lai, Jimmy Che-To
Lui, Grace Chung-Yan
Chan, Henry Lik-Yuen
Yuen, Pong-Chi
Wong, Vincent Wai-Sun
author_sort Wong, Grace Lai-Hung
collection PubMed
description BACKGROUND & AIMS: Accurate hepatocellular carcinoma (HCC) risk prediction facilitates appropriate surveillance strategy and reduces cancer mortality. We aimed to derive and validate novel machine learning models to predict HCC in a territory-wide cohort of patients with chronic viral hepatitis (CVH) using data from the Hospital Authority Data Collaboration Lab (HADCL).
METHODS: This was a territory-wide, retrospective, observational, cohort study of patients with CVH in Hong Kong in 2000–2018 identified from HADCL based on viral markers, diagnosis codes, and antiviral treatment for chronic hepatitis B and/or C. The cohort was randomly split into training and validation cohorts in a 7:3 ratio. Five popular machine learning methods, namely, logistic regression, ridge regression, AdaBoost, decision tree, and random forest, were performed and compared to find the best prediction model.
RESULTS: A total of 124,006 patients with CVH with complete data were included to build the models. In the training cohort (n = 86,804; 6,821 HCC), ridge regression (area under the receiver operating characteristic curve [AUROC] 0.842), decision tree (0.952), and random forest (0.992) performed the best. In the validation cohort (n = 37,202; 2,875 HCC), ridge regression (AUROC 0.844) and random forest (0.837) maintained their accuracy, which was significantly higher than those of HCC risk scores: CU-HCC (0.672), GAG-HCC (0.745), REACH-B (0.671), PAGE-B (0.748), and REAL-B (0.712) scores. The low cut-off (0.07) of HCC ridge score (HCC-RS) achieved 90.0% sensitivity and 98.6% negative predictive value (NPV) in the validation cohort. The high cut-off (0.15) of HCC-RS achieved high specificity (90.0%) and NPV (95.6%); 31.1% of patients remained indeterminate.
CONCLUSIONS: HCC-RS from the ridge regression machine learning model accurately predicted HCC in patients with CVH. These machine learning models may be developed as built-in functional keys or calculators in electronic health systems to reduce cancer mortality.
LAY SUMMARY: Novel machine learning models generated accurate risk scores for hepatocellular carcinoma (HCC) in patients with chronic viral hepatitis. HCC ridge score was consistently more accurate than existing HCC risk scores. These models may be incorporated into electronic medical health systems to develop appropriate cancer surveillance strategies and reduce cancer death.
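The dual cut-off rule described in the abstract (low cut-off 0.07, high cut-off 0.15, with patients in between left indeterminate) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `triage` and `sensitivity_and_npv` are hypothetical, the handling of scores exactly equal to a cut-off is an assumption the abstract does not specify, and the scores and labels are made-up toy values, not study data.

```python
from typing import Sequence, Tuple

LOW_CUTOFF = 0.07   # low cut-off reported in the abstract
HIGH_CUTOFF = 0.15  # high cut-off reported in the abstract


def triage(score: float, low: float = LOW_CUTOFF, high: float = HIGH_CUTOFF) -> str:
    """Map a predicted HCC probability to a risk band.

    Assumption: scores below `low` are low risk, scores at or above
    `high` are high risk, and everything in between is indeterminate.
    """
    if score < low:
        return "low"
    if score >= high:
        return "high"
    return "indeterminate"


def sensitivity_and_npv(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Compute sensitivity and NPV, treating score >= cutoff as a positive test.

    `labels` holds 1 for patients who developed HCC and 0 otherwise.
    """
    tp = fn = tn = 0
    for score, label in zip(scores, labels):
        positive = score >= cutoff
        if label == 1 and positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif not positive:
            tn += 1
    sensitivity = tp / (tp + fn)
    npv = tn / (tn + fn)
    return sensitivity, npv


# Toy data for illustration only (not from the study).
toy_scores = [0.02, 0.05, 0.06, 0.09, 0.12, 0.20, 0.30, 0.04]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

bands = [triage(s) for s in toy_scores]
sens, npv = sensitivity_and_npv(toy_scores, toy_labels, LOW_CUTOFF)
print(bands)
print(f"sensitivity={sens:.2f}, NPV={npv:.2f}")
```

Using two cut-offs rather than one trades coverage for confidence: the low cut-off is chosen for high sensitivity (to rule out), the high cut-off for high specificity (to rule in), and the patients who fall between them are the indeterminate group the abstract reports.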
format Online
Article
Text
id pubmed-8844233
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8844233 2022-02-22 JHEP Rep Research Article Elsevier 2022-01-22 /pmc/articles/PMC8844233/ /pubmed/35198928 http://dx.doi.org/10.1016/j.jhepr.2022.100441 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Novel machine learning models outperform risk scores in predicting hepatocellular carcinoma in patients with chronic viral hepatitis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8844233/
https://www.ncbi.nlm.nih.gov/pubmed/35198928
http://dx.doi.org/10.1016/j.jhepr.2022.100441