Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis
BACKGROUND: By recent estimates, the steady rise in health care costs has deprived more than 45 million Americans of health care services and has encouraged health care providers to better understand the key drivers of health care utilization from a population health management perspective. Prior st...
Main Authors: | Agarwal, Vibhu; Zhang, Liangliang; Zhu, Josh; Fang, Shiyuan; Cheng, Tim; Hong, Chloe; Shah, Nigam H |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5052461/ https://www.ncbi.nlm.nih.gov/pubmed/27655225 http://dx.doi.org/10.2196/jmir.6240 |
_version_ | 1782458235644542976 |
---|---|
author | Agarwal, Vibhu Zhang, Liangliang Zhu, Josh Fang, Shiyuan Cheng, Tim Hong, Chloe Shah, Nigam H |
author_facet | Agarwal, Vibhu Zhang, Liangliang Zhu, Josh Fang, Shiyuan Cheng, Tim Hong, Chloe Shah, Nigam H |
author_sort | Agarwal, Vibhu |
collection | PubMed |
description | BACKGROUND: By recent estimates, the steady rise in health care costs has deprived more than 45 million Americans of health care services and has encouraged health care providers to better understand the key drivers of health care utilization from a population health management perspective. Prior studies suggest the feasibility of mining population-level patterns of health care resource utilization from observational analysis of Internet search logs; however, the utility of the endeavor to the various stakeholders in a health ecosystem remains unclear. OBJECTIVE: The aim was to carry out a closed-loop evaluation of the utility of health care use predictions using the conversion rates of advertisements that were displayed to the predicted future utilizers as a surrogate. The statistical models to predict the probability of a user’s future visit to a medical facility were built using effective predictors of health care resource utilization, extracted from a deidentified dataset of geotagged mobile Internet search logs representing searches made by users of the Baidu search engine between March 2015 and May 2015. METHODS: We inferred presence within the geofence of a medical facility from location and duration information from users’ search logs and putatively assigned medical facility visit labels to qualifying search logs. We constructed a matrix of general, semantic, and location-based features from search logs of users that had 42 or more search days preceding a medical facility visit as well as from search logs of users that had no medical visits and trained statistical learners for predicting future medical visits. We then carried out a closed-loop evaluation of the utility of health care use predictions using the show conversion rates of advertisements displayed to the predicted future utilizers. In the context of behaviorally targeted advertising, wherein health care providers are interested in minimizing their cost per conversion, the association between show conversion rate and predicted utilization score served as a surrogate measure of the model’s utility. RESULTS: We obtained the highest area under the curve (0.796) in medical visit prediction with our random forests model and daywise features. Ablating feature categories one at a time showed that the model performance worsened the most when location features were dropped. An online evaluation in which advertisements were served to users who had a high predicted probability of a future medical visit showed a 3.96% increase in the show conversion rate. CONCLUSIONS: Results from our experiments done in a research setting suggest that it is possible to accurately predict future patient visits from geotagged mobile search logs. Results from the offline and online experiments on the utility of health utilization predictions suggest that such prediction can have utility for health care providers. |
format | Online Article Text |
id | pubmed-5052461 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-50524612016-10-20 Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis Agarwal, Vibhu Zhang, Liangliang Zhu, Josh Fang, Shiyuan Cheng, Tim Hong, Chloe Shah, Nigam H J Med Internet Res Original Paper BACKGROUND: By recent estimates, the steady rise in health care costs has deprived more than 45 million Americans of health care services and has encouraged health care providers to better understand the key drivers of health care utilization from a population health management perspective. Prior studies suggest the feasibility of mining population-level patterns of health care resource utilization from observational analysis of Internet search logs; however, the utility of the endeavor to the various stakeholders in a health ecosystem remains unclear. OBJECTIVE: The aim was to carry out a closed-loop evaluation of the utility of health care use predictions using the conversion rates of advertisements that were displayed to the predicted future utilizers as a surrogate. The statistical models to predict the probability of user’s future visit to a medical facility were built using effective predictors of health care resource utilization, extracted from a deidentified dataset of geotagged mobile Internet search logs representing searches made by users of the Baidu search engine between March 2015 and May 2015. METHODS: We inferred presence within the geofence of a medical facility from location and duration information from users’ search logs and putatively assigned medical facility visit labels to qualifying search logs. We constructed a matrix of general, semantic, and location-based features from search logs of users that had 42 or more search days preceding a medical facility visit as well as from search logs of users that had no medical visits and trained statistical learners for predicting future medical visits. 
We then carried out a closed-loop evaluation of the utility of health care use predictions using the show conversion rates of advertisements displayed to the predicted future utilizers. In the context of behaviorally targeted advertising, wherein health care providers are interested in minimizing their cost per conversion, the association between show conversion rate and predicted utilization score, served as a surrogate measure of the model’s utility. RESULTS: We obtained the highest area under the curve (0.796) in medical visit prediction with our random forests model and daywise features. Ablating feature categories one at a time showed that the model performance worsened the most when location features were dropped. An online evaluation in which advertisements were served to users who had a high predicted probability of a future medical visit showed a 3.96% increase in the show conversion rate. CONCLUSIONS: Results from our experiments done in a research setting suggest that it is possible to accurately predict future patient visits from geotagged mobile search logs. Results from the offline and online experiments on the utility of health utilization predictions suggest that such prediction can have utility for health care providers. JMIR Publications 2016-09-21 /pmc/articles/PMC5052461/ /pubmed/27655225 http://dx.doi.org/10.2196/jmir.6240 Text en ©Vibhu Agarwal, Liangliang Zhang, Josh Zhu, Shiyuan Fang, Tim Cheng, Chloe Hong, Nigam H Shah. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 21.09.2016. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. 
The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Agarwal, Vibhu Zhang, Liangliang Zhu, Josh Fang, Shiyuan Cheng, Tim Hong, Chloe Shah, Nigam H Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis |
title | Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis |
title_full | Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis |
title_fullStr | Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis |
title_full_unstemmed | Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis |
title_short | Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis |
title_sort | impact of predicting health care utilization via web search behavior: a data-driven analysis |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5052461/ https://www.ncbi.nlm.nih.gov/pubmed/27655225 http://dx.doi.org/10.2196/jmir.6240 |
work_keys_str_mv | AT agarwalvibhu impactofpredictinghealthcareutilizationviawebsearchbehavioradatadrivenanalysis AT zhangliangliang impactofpredictinghealthcareutilizationviawebsearchbehavioradatadrivenanalysis AT zhujosh impactofpredictinghealthcareutilizationviawebsearchbehavioradatadrivenanalysis AT fangshiyuan impactofpredictinghealthcareutilizationviawebsearchbehavioradatadrivenanalysis AT chengtim impactofpredictinghealthcareutilizationviawebsearchbehavioradatadrivenanalysis AT hongchloe impactofpredictinghealthcareutilizationviawebsearchbehavioradatadrivenanalysis AT shahnigamh impactofpredictinghealthcareutilizationviawebsearchbehavioradatadrivenanalysis |