Machine learning models to predict disease progression among veterans with hepatitis C virus

BACKGROUND: Machine learning (ML) algorithms provide effective ways to build prediction models using longitudinal information given their capacity to incorporate numerous predictor variables without compromising the accuracy of the risk prediction. Clinical risk prediction models in chronic hepatiti...

Bibliographic Details
Main Authors: Konerman, Monica A., Beste, Lauren A., Van, Tony, Liu, Boang, Zhang, Xuefei, Zhu, Ji, Saini, Sameer D., Su, Grace L., Nallamothu, Brahmajee K., Ioannou, George N., Waljee, Akbar K.
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6319806/
https://www.ncbi.nlm.nih.gov/pubmed/30608929
http://dx.doi.org/10.1371/journal.pone.0208141
_version_ 1783385130078830592
author Konerman, Monica A.
Beste, Lauren A.
Van, Tony
Liu, Boang
Zhang, Xuefei
Zhu, Ji
Saini, Sameer D.
Su, Grace L.
Nallamothu, Brahmajee K.
Ioannou, George N.
Waljee, Akbar K.
author_facet Konerman, Monica A.
Beste, Lauren A.
Van, Tony
Liu, Boang
Zhang, Xuefei
Zhu, Ji
Saini, Sameer D.
Su, Grace L.
Nallamothu, Brahmajee K.
Ioannou, George N.
Waljee, Akbar K.
author_sort Konerman, Monica A.
collection PubMed
description BACKGROUND: Machine learning (ML) algorithms provide effective ways to build prediction models using longitudinal information given their capacity to incorporate numerous predictor variables without compromising the accuracy of the risk prediction. Clinical risk prediction models in chronic hepatitis C virus (CHC) can be challenging due to the non-linear nature of disease progression. We developed and compared two ML algorithms to predict cirrhosis development in a large CHC-infected cohort using longitudinal data. METHODS AND FINDINGS: We used national Veterans Health Administration (VHA) data to identify CHC patients in care between 2000 and 2016. The primary outcome was cirrhosis development ascertained by two consecutive aspartate aminotransferase (AST)-to-platelet ratio indexes (APRIs) > 2 after time zero, given the infrequency of liver biopsy in clinical practice and that APRI is a validated non-invasive biomarker of fibrosis in CHC. We excluded those with initial APRI > 2 or a pre-existing diagnosis of cirrhosis, hepatocellular carcinoma or hepatic decompensation. Enrollment was defined as the date of the first APRI. Time zero was defined as 2 years after enrollment. Cross-sectional (CS) models used predictors at or closest before time zero as a comparison. Longitudinal models used CS predictors plus longitudinal summary variables (maximum, minimum, maximum of slope, minimum of slope and total variation) between enrollment and time zero. Covariates included demographics, labs, and body mass index. Model performance was evaluated using concordance and area under the receiver operating characteristic curve (AuROC). A total of 72,683 individuals with CHC were analyzed; the cohort had a mean age of 52.8 years and was 96.8% male and 53% white. A total of 11,616 individuals (16%) met the primary outcome over a mean follow-up of 7 years.
We found superior predictive performance for the longitudinal Cox model compared to the CS Cox model (concordance 0.764 vs 0.746), and for the longitudinal boosted-survival-tree model compared to the linear Cox model (concordance 0.774 vs 0.764). The accuracy of the longitudinal models at 1, 3, and 5 years after time zero also showed superior performance compared to the CS model, based on AuROC. CONCLUSIONS: Boosted-survival-tree-based models using longitudinal information are statistically superior to cross-sectional or linear models for predicting development of cirrhosis in CHC, though all four models were highly accurate. Similar statistical methods could be applied to predict outcomes in other non-linear chronic disease states.
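The longitudinal summary variables and the APRI-based outcome named in the description can be illustrated with a minimal Python sketch. This is not the authors' code. The abstract does not say how slope or total variation were computed, so the sketch assumes slopes between consecutive measurements and total variation as the sum of absolute consecutive changes. The APRI formula used is the commonly cited definition (AST divided by its upper limit of normal, times 100, divided by platelet count in 10^9/L). All function and parameter names are hypothetical.

```python
from typing import Dict, Sequence


def apri(ast: float, ast_uln: float, platelets: float) -> float:
    """AST-to-platelet ratio index (commonly cited definition, assumed here).

    ast and ast_uln are in the same units (e.g. U/L); platelets in 10^9/L.
    """
    return (ast / ast_uln) * 100.0 / platelets


def meets_outcome(apris: Sequence[float], threshold: float = 2.0) -> bool:
    """True if any two consecutive APRI values both exceed the threshold."""
    return any(a > threshold and b > threshold for a, b in zip(apris, apris[1:]))


def longitudinal_summary(
    times: Sequence[float], values: Sequence[float]
) -> Dict[str, float]:
    """Summarize one lab series measured between enrollment and time zero.

    Assumed definitions (not stated in the abstract):
    - slope: change per unit time between consecutive measurements
    - total variation: sum of absolute consecutive changes
    """
    if len(times) != len(values) or len(values) < 2:
        raise ValueError("need at least two paired measurements")
    pairs = sorted(zip(times, values))  # order by measurement time
    steps = list(zip(pairs, pairs[1:]))
    # Skip pairs recorded at the same time to avoid division by zero.
    slopes = [(v1 - v0) / (t1 - t0) for (t0, v0), (t1, v1) in steps if t1 > t0]
    if not slopes:
        raise ValueError("need measurements at two or more distinct times")
    series = [v for _, v in pairs]
    return {
        "max": max(series),
        "min": min(series),
        "max_slope": max(slopes),
        "min_slope": min(slopes),
        "total_variation": sum(abs(v1 - v0) for (_, v0), (_, v1) in steps),
    }
```

Under these assumptions, each patient's lab series would yield five extra features that are appended to the cross-sectional predictors before fitting a survival model.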
format Online
Article
Text
id pubmed-6319806
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6319806 2019-01-19 Machine learning models to predict disease progression among veterans with hepatitis C virus Konerman, Monica A. Beste, Lauren A. Van, Tony Liu, Boang Zhang, Xuefei Zhu, Ji Saini, Sameer D. Su, Grace L. Nallamothu, Brahmajee K. Ioannou, George N. Waljee, Akbar K. PLoS One Research Article Public Library of Science 2019-01-04 /pmc/articles/PMC6319806/ /pubmed/30608929 http://dx.doi.org/10.1371/journal.pone.0208141 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
spellingShingle Research Article
Konerman, Monica A.
Beste, Lauren A.
Van, Tony
Liu, Boang
Zhang, Xuefei
Zhu, Ji
Saini, Sameer D.
Su, Grace L.
Nallamothu, Brahmajee K.
Ioannou, George N.
Waljee, Akbar K.
Machine learning models to predict disease progression among veterans with hepatitis C virus
title Machine learning models to predict disease progression among veterans with hepatitis C virus
title_full Machine learning models to predict disease progression among veterans with hepatitis C virus
title_fullStr Machine learning models to predict disease progression among veterans with hepatitis C virus
title_full_unstemmed Machine learning models to predict disease progression among veterans with hepatitis C virus
title_short Machine learning models to predict disease progression among veterans with hepatitis C virus
title_sort machine learning models to predict disease progression among veterans with hepatitis c virus
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6319806/
https://www.ncbi.nlm.nih.gov/pubmed/30608929
http://dx.doi.org/10.1371/journal.pone.0208141
work_keys_str_mv AT konermanmonicaa machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT bestelaurena machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT vantony machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT liuboang machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT zhangxuefei machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT zhuji machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT sainisameerd machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT sugracel machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT nallamothubrahmajeek machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT ioannougeorgen machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus
AT waljeeakbark machinelearningmodelstopredictdiseaseprogressionamongveteranswithhepatitiscvirus