Novel machine learning models outperform risk scores in predicting hepatocellular carcinoma in patients with chronic viral hepatitis
BACKGROUND & AIMS: Accurate hepatocellular carcinoma (HCC) risk prediction facilitates appropriate surveillance strategy and reduces cancer mortality. We aimed to derive and validate novel machine learning models to predict HCC in a territory-wide cohort of patients with chronic viral hepatitis...
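Main Authors: | Wong, Grace Lai-Hung; Hui, Vicki Wing-Ki; Tan, Qingxiong; Xu, Jingwen; Lee, Hye Won; Yip, Terry Cheuk-Fung; Yang, Baoyao; Tse, Yee-Kit; Yin, Chong; Lyu, Fei; Lai, Jimmy Che-To; Lui, Grace Chung-Yan; Chan, Henry Lik-Yuen; Yuen, Pong-Chi; Wong, Vincent Wai-Sun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8844233/ https://www.ncbi.nlm.nih.gov/pubmed/35198928 http://dx.doi.org/10.1016/j.jhepr.2022.100441 |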
_version_ | 1784651432834629632 |
---|---|
author | Wong, Grace Lai-Hung Hui, Vicki Wing-Ki Tan, Qingxiong Xu, Jingwen Lee, Hye Won Yip, Terry Cheuk-Fung Yang, Baoyao Tse, Yee-Kit Yin, Chong Lyu, Fei Lai, Jimmy Che-To Lui, Grace Chung-Yan Chan, Henry Lik-Yuen Yuen, Pong-Chi Wong, Vincent Wai-Sun |
author_sort | Wong, Grace Lai-Hung |
collection | PubMed |
description | BACKGROUND & AIMS: Accurate hepatocellular carcinoma (HCC) risk prediction facilitates appropriate surveillance strategy and reduces cancer mortality. We aimed to derive and validate novel machine learning models to predict HCC in a territory-wide cohort of patients with chronic viral hepatitis (CVH) using data from the Hospital Authority Data Collaboration Lab (HADCL). METHODS: This was a territory-wide, retrospective, observational, cohort study of patients with CVH in Hong Kong in 2000–2018 identified from HADCL based on viral markers, diagnosis codes, and antiviral treatment for chronic hepatitis B and/or C. The cohort was randomly split into training and validation cohorts in a 7:3 ratio. Five popular machine learning methods, namely, logistic regression, ridge regression, AdaBoost, decision tree, and random forest, were performed and compared to find the best prediction model. RESULTS: A total of 124,006 patients with CVH with complete data were included to build the models. In the training cohort (n = 86,804; 6,821 HCC), ridge regression (area under the receiver operating characteristic curve [AUROC] 0.842), decision tree (0.952), and random forest (0.992) performed the best. In the validation cohort (n = 37,202; 2,875 HCC), ridge regression (AUROC 0.844) and random forest (0.837) maintained their accuracy, which was significantly higher than those of HCC risk scores: CU-HCC (0.672), GAG-HCC (0.745), REACH-B (0.671), PAGE-B (0.748), and REAL-B (0.712) scores. The low cut-off (0.07) of HCC ridge score (HCC-RS) achieved 90.0% sensitivity and 98.6% negative predictive value (NPV) in the validation cohort. The high cut-off (0.15) of HCC-RS achieved high specificity (90.0%) and NPV (95.6%); 31.1% of patients remained indeterminate. CONCLUSIONS: HCC-RS from the ridge regression machine learning model accurately predicted HCC in patients with CVH. These machine learning models may be developed as built-in functional keys or calculators in electronic health systems to reduce cancer mortality. LAY SUMMARY: Novel machine learning models generated accurate risk scores for hepatocellular carcinoma (HCC) in patients with chronic viral hepatitis. HCC ridge score was consistently more accurate than existing HCC risk scores. These models may be incorporated into electronic medical health systems to develop appropriate cancer surveillance strategies and reduce cancer death. |
format | Online Article Text |
id | pubmed-8844233 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-8844233 2022-02-22 Novel machine learning models outperform risk scores in predicting hepatocellular carcinoma in patients with chronic viral hepatitis Wong, Grace Lai-Hung Hui, Vicki Wing-Ki Tan, Qingxiong Xu, Jingwen Lee, Hye Won Yip, Terry Cheuk-Fung Yang, Baoyao Tse, Yee-Kit Yin, Chong Lyu, Fei Lai, Jimmy Che-To Lui, Grace Chung-Yan Chan, Henry Lik-Yuen Yuen, Pong-Chi Wong, Vincent Wai-Sun JHEP Rep Research Article BACKGROUND & AIMS: Accurate hepatocellular carcinoma (HCC) risk prediction facilitates appropriate surveillance strategy and reduces cancer mortality. We aimed to derive and validate novel machine learning models to predict HCC in a territory-wide cohort of patients with chronic viral hepatitis (CVH) using data from the Hospital Authority Data Collaboration Lab (HADCL). METHODS: This was a territory-wide, retrospective, observational, cohort study of patients with CVH in Hong Kong in 2000–2018 identified from HADCL based on viral markers, diagnosis codes, and antiviral treatment for chronic hepatitis B and/or C. The cohort was randomly split into training and validation cohorts in a 7:3 ratio. Five popular machine learning methods, namely, logistic regression, ridge regression, AdaBoost, decision tree, and random forest, were performed and compared to find the best prediction model. RESULTS: A total of 124,006 patients with CVH with complete data were included to build the models. In the training cohort (n = 86,804; 6,821 HCC), ridge regression (area under the receiver operating characteristic curve [AUROC] 0.842), decision tree (0.952), and random forest (0.992) performed the best. In the validation cohort (n = 37,202; 2,875 HCC), ridge regression (AUROC 0.844) and random forest (0.837) maintained their accuracy, which was significantly higher than those of HCC risk scores: CU-HCC (0.672), GAG-HCC (0.745), REACH-B (0.671), PAGE-B (0.748), and REAL-B (0.712) scores. The low cut-off (0.07) of HCC ridge score (HCC-RS) achieved 90.0% sensitivity and 98.6% negative predictive value (NPV) in the validation cohort. The high cut-off (0.15) of HCC-RS achieved high specificity (90.0%) and NPV (95.6%); 31.1% of patients remained indeterminate. CONCLUSIONS: HCC-RS from the ridge regression machine learning model accurately predicted HCC in patients with CVH. These machine learning models may be developed as built-in functional keys or calculators in electronic health systems to reduce cancer mortality. LAY SUMMARY: Novel machine learning models generated accurate risk scores for hepatocellular carcinoma (HCC) in patients with chronic viral hepatitis. HCC ridge score was consistently more accurate than existing HCC risk scores. These models may be incorporated into electronic medical health systems to develop appropriate cancer surveillance strategies and reduce cancer death. Elsevier 2022-01-22 /pmc/articles/PMC8844233/ /pubmed/35198928 http://dx.doi.org/10.1016/j.jhepr.2022.100441 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Wong, Grace Lai-Hung Hui, Vicki Wing-Ki Tan, Qingxiong Xu, Jingwen Lee, Hye Won Yip, Terry Cheuk-Fung Yang, Baoyao Tse, Yee-Kit Yin, Chong Lyu, Fei Lai, Jimmy Che-To Lui, Grace Chung-Yan Chan, Henry Lik-Yuen Yuen, Pong-Chi Wong, Vincent Wai-Sun Novel machine learning models outperform risk scores in predicting hepatocellular carcinoma in patients with chronic viral hepatitis |
title | Novel machine learning models outperform risk scores in predicting hepatocellular carcinoma in patients with chronic viral hepatitis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8844233/ https://www.ncbi.nlm.nih.gov/pubmed/35198928 http://dx.doi.org/10.1016/j.jhepr.2022.100441 |
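
The abstract in this record reports AUROC values and a dual cut-off rule (low 0.07, high 0.15) for the HCC ridge score. As a rough illustration of how such figures can be computed from predicted scores, here is a minimal Python sketch. It is not the authors' code: the function names, the toy scores and labels, and the choice to treat scores at or above a cut-off as positive are all assumptions made for this example; only the two cut-off values come from the abstract.

```python
from typing import Dict, List


def auroc(scores: List[float], labels: List[int]) -> float:
    """AUROC as the probability that a random case outranks a random
    non-case (Mann-Whitney formulation); ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def dual_cutoff_summary(scores: List[float], labels: List[int],
                        low: float = 0.07, high: float = 0.15) -> Dict[str, float]:
    """Summarise a rule-out (low) and rule-in (high) cut-off pair.

    Assumption: a score >= cut-off is treated as test-positive.
    Assumes both classes are present and some scores fall below `low`.
    """
    pairs = list(zip(scores, labels))
    cases = [s for s, y in pairs if y == 1]
    non_cases = [s for s, y in pairs if y == 0]
    below_low = [y for s, y in pairs if s < low]
    return {
        # Share of true cases flagged at the low cut-off.
        "sensitivity_low": sum(s >= low for s in cases) / len(cases),
        # Share of patients below the low cut-off who are non-cases.
        "npv_low": below_low.count(0) / len(below_low),
        # Share of non-cases falling below the high cut-off.
        "specificity_high": sum(s < high for s in non_cases) / len(non_cases),
        # Share of patients between the two cut-offs.
        "indeterminate": sum(low <= s < high for s in scores) / len(scores),
    }


# Hypothetical toy data, not taken from the study.
toy_scores = [0.02, 0.03, 0.05, 0.06, 0.08, 0.10, 0.12, 0.16, 0.20, 0.30]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

toy_auroc = auroc(toy_scores, toy_labels)
toy_summary = dual_cutoff_summary(toy_scores, toy_labels)
print(round(toy_auroc, 3), toy_summary)
```

With two cut-offs, patients scoring below the low value are treated as low risk, those at or above the high value as high risk, and the remainder as indeterminate, which is the group the abstract reports as 31.1% of the validation cohort.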