
Assessing the Value of Unsupervised Clustering in Predicting Persistent High Health Care Utilizers: Retrospective Analysis of Insurance Claims Data

BACKGROUND: A high proportion of health care services are persistently utilized by a small subpopulation of patients. To improve clinical outcomes while reducing costs and utilization, population health management programs often provide targeted interventions to patients who may become persistent hi...


Bibliographic Details
Main Authors:	Ramachandran, Raghav; McShea, Michael J; Howson, Stephanie N; Burkom, Howard S; Chang, Hsien-Yen; Weiner, Jonathan P; Kharrazi, Hadi
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2021
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8663459/ https://www.ncbi.nlm.nih.gov/pubmed/34592712 http://dx.doi.org/10.2196/31442

_version_	1784613641720430592
author	Ramachandran, Raghav; McShea, Michael J; Howson, Stephanie N; Burkom, Howard S; Chang, Hsien-Yen; Weiner, Jonathan P; Kharrazi, Hadi
author_sort	Ramachandran, Raghav
collection	PubMed
description	BACKGROUND: A high proportion of health care services are persistently utilized by a small subpopulation of patients. To improve clinical outcomes while reducing costs and utilization, population health management programs often provide targeted interventions to patients who may become persistent high users/utilizers (PHUs). Enhanced prediction and management of PHUs can improve health care system efficiencies and improve the overall quality of patient care. OBJECTIVE: The aim of this study was to detect key classes of diseases and medications among the study population and to assess the predictive value of these classes in identifying PHUs. METHODS: This study was a retrospective analysis of insurance claims data of patients from the Johns Hopkins Health Care system. We defined a PHU as a patient incurring health care costs in the top 20% of all patients’ costs for 4 consecutive 6-month periods. We used 2013 claims data to predict PHU status in 2014-2015. We applied latent class analysis (LCA), an unsupervised clustering approach, to identify patient subgroups with similar diagnostic and medication patterns to differentiate variations in health care utilization across PHUs. Logistic regression models were then built to predict PHUs in the full population and in select subpopulations. Predictors included LCA membership probabilities, demographic covariates, and health utilization covariates. Predictive powers of the regression models were assessed and compared using standard metrics. RESULTS: We identified 164,221 patients with continuous enrollment between 2013 and 2015. The mean study population age was 19.7 years, 55.9% were women, 3.3% had ≥1 hospitalization, and 19.1% had 10+ outpatient visits in 2013. A total of 8359 (5.09%) patients were identified as PHUs in both 2014 and 2015. The LCA performed optimally when assigning patients to four probability disease/medication classes. 
Given the feedback provided by clinical experts, we further divided the population into four diagnostic groups for sensitivity analysis: acute upper respiratory infection (URI) (n=53,232; 4.6% PHUs), mental health (n=34,456; 12.8% PHUs), otitis media (n=24,992; 4.5% PHUs), and musculoskeletal (n=24,799; 15.5% PHUs). For the regression models predicting PHUs in the full population, the F1-score classification metric was lower using a parsimonious model that included LCA categories (F1=38.62%) compared to that of a complex risk stratification model with a full set of predictors (F1=48.20%). However, the LCA-enabled simple models were comparable to the complex model when predicting PHUs in the mental health and musculoskeletal subpopulations (F1-scores of 48.69% and 48.15%, respectively). F1-scores were lower than that of the complex model when the LCA-enabled models were limited to the otitis media and acute URI subpopulations (45.77% and 43.05%, respectively). CONCLUSIONS: Our study illustrates the value of LCA in identifying subgroups of patients with similar patterns of diagnoses and medications. Our results show that LCA-derived classes can simplify predictive models of PHUs without compromising predictive accuracy. Future studies should investigate the value of LCA-derived classes for predicting PHUs in other health care settings.
format	Online Article Text
id	pubmed-8663459
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-8663459 2022-01-05 JMIR Med Inform Original Paper JMIR Publications 2021-11-25 /pmc/articles/PMC8663459/ /pubmed/34592712 http://dx.doi.org/10.2196/31442 Text en ©Raghav Ramachandran, Michael J McShea, Stephanie N Howson, Howard S Burkom, Hsien-Yen Chang, Jonathan P Weiner, Hadi Kharrazi. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 25.11.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title	Assessing the Value of Unsupervised Clustering in Predicting Persistent High Health Care Utilizers: Retrospective Analysis of Insurance Claims Data
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8663459/ https://www.ncbi.nlm.nih.gov/pubmed/34592712 http://dx.doi.org/10.2196/31442
