
Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis

BACKGROUND: By recent estimates, the steady rise in health care costs has deprived more than 45 million Americans of health care services and has encouraged health care providers to better understand the key drivers of health care utilization from a population health management perspective. Prior st...


Bibliographic Details
Main Authors: Agarwal, Vibhu, Zhang, Liangliang, Zhu, Josh, Fang, Shiyuan, Cheng, Tim, Hong, Chloe, Shah, Nigam H
Format: Online Article Text
Language: English
Published: JMIR Publications 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5052461/
https://www.ncbi.nlm.nih.gov/pubmed/27655225
http://dx.doi.org/10.2196/jmir.6240
author Agarwal, Vibhu
Zhang, Liangliang
Zhu, Josh
Fang, Shiyuan
Cheng, Tim
Hong, Chloe
Shah, Nigam H
collection PubMed
description BACKGROUND: By recent estimates, the steady rise in health care costs has deprived more than 45 million Americans of health care services and has encouraged health care providers to better understand the key drivers of health care utilization from a population health management perspective. Prior studies suggest the feasibility of mining population-level patterns of health care resource utilization from observational analysis of Internet search logs; however, the utility of the endeavor to the various stakeholders in a health ecosystem remains unclear. OBJECTIVE: The aim was to carry out a closed-loop evaluation of the utility of health care use predictions using the conversion rates of advertisements that were displayed to the predicted future utilizers as a surrogate. The statistical models to predict the probability of a user’s future visit to a medical facility were built using effective predictors of health care resource utilization, extracted from a deidentified dataset of geotagged mobile Internet search logs representing searches made by users of the Baidu search engine between March 2015 and May 2015. METHODS: We inferred presence within the geofence of a medical facility from location and duration information from users’ search logs and putatively assigned medical facility visit labels to qualifying search logs. We constructed a matrix of general, semantic, and location-based features from search logs of users who had 42 or more search days preceding a medical facility visit as well as from search logs of users who had no medical visits and trained statistical learners for predicting future medical visits. We then carried out a closed-loop evaluation of the utility of health care use predictions using the show conversion rates of advertisements displayed to the predicted future utilizers. In the context of behaviorally targeted advertising, wherein health care providers are interested in minimizing their cost per conversion, the association between show conversion rate and predicted utilization score served as a surrogate measure of the model’s utility. RESULTS: We obtained the highest area under the curve (0.796) in medical visit prediction with our random forests model and daywise features. Ablating feature categories one at a time showed that the model performance worsened the most when location features were dropped. An online evaluation in which advertisements were served to users who had a high predicted probability of a future medical visit showed a 3.96% increase in the show conversion rate. CONCLUSIONS: Results from our experiments done in a research setting suggest that it is possible to accurately predict future patient visits from geotagged mobile search logs. Results from the offline and online experiments on the utility of health utilization predictions suggest that such prediction can have utility for health care providers.
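The visit-labeling step in the METHODS text (presence within a facility's geofence, inferred from location and duration) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state a geofence radius, a dwell-time threshold, or a log schema, so `SearchEvent`, `has_putative_visit`, `radius_m`, and `min_dwell_s` below are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass(frozen=True)
class SearchEvent:
    """One geotagged search log entry (hypothetical schema)."""
    user_id: str
    timestamp_s: float  # seconds since an arbitrary epoch
    lat: float          # degrees
    lon: float          # degrees


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def has_putative_visit(events, facility, radius_m=200.0, min_dwell_s=1800.0):
    """Return True if an unbroken run of in-geofence events spans min_dwell_s.

    `facility` is a (lat, lon) pair. The radius and dwell thresholds are
    assumed values, not figures reported in the article.
    """
    run_start = None
    for ev in sorted(events, key=lambda e: e.timestamp_s):
        inside = haversine_m(ev.lat, ev.lon, facility[0], facility[1]) <= radius_m
        if not inside:
            run_start = None  # leaving the geofence resets the dwell timer
            continue
        if run_start is None:
            run_start = ev.timestamp_s
        if ev.timestamp_s - run_start >= min_dwell_s:
            return True
    return False
```

Under these assumptions, a user whose consecutive searches stay inside the geofence for at least the dwell threshold receives a putative visit label; a single search near a facility, or a run interrupted by a search elsewhere, does not.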
format Online
Article
Text
id pubmed-5052461
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-5052461 2016-10-20 Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis Agarwal, Vibhu Zhang, Liangliang Zhu, Josh Fang, Shiyuan Cheng, Tim Hong, Chloe Shah, Nigam H J Med Internet Res Original Paper JMIR Publications 2016-09-21 /pmc/articles/PMC5052461/ /pubmed/27655225 http://dx.doi.org/10.2196/jmir.6240 Text en ©Vibhu Agarwal, Liangliang Zhang, Josh Zhu, Shiyuan Fang, Tim Cheng, Chloe Hong, Nigam H Shah. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 21.09.2016. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5052461/
https://www.ncbi.nlm.nih.gov/pubmed/27655225
http://dx.doi.org/10.2196/jmir.6240