HIV Risk Assessment using Longitudinal Electronic Health Records

BACKGROUND: Universal HIV screening programs are costly, labor-intensive, and in practice unable to identify all individuals at risk of HIV infection. Automated risk assessment methods that leverage longitudinal electronic health records (EHRs) could catalyze targeted screening programs in Emergency...

Bibliographic Details
Main authors: Feller, Daniel; Zucker, Jason; Yin, Michael; Gordon, Peter; Elhadad, Noemie
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5631278/
http://dx.doi.org/10.1093/ofid/ofx163.1049
author Feller, Daniel
Zucker, Jason
Yin, Michael
Gordon, Peter
Elhadad, Noemie
collection PubMed
description BACKGROUND: Universal HIV screening programs are costly, labor-intensive, and in practice unable to identify all individuals at risk of HIV infection. Automated risk assessment methods that leverage longitudinal electronic health records (EHRs) could catalyze targeted screening programs in Emergency Departments and across public health jurisdictions. While information on social and behavioral determinants of health is typically collected in unstructured fields, previous analyses have considered only structured EHR data. We sought to characterize whether clinical notes can improve predictive models of HIV diagnosis. METHODS: 181 individuals who received care at an academic medical center in New York City prior to a confirmatory HIV diagnosis were included in the study cohort. 543 HIV-negative controls with similar utilization patterns were selected using propensity score matching. Demographics, laboratory tests, and diagnosis codes were extracted from longitudinal records. Clinical notes were preprocessed using both topic modeling and an n-grams approach. We fit 3 predictive models using Random Forests: a baseline model that included only structured EHR data, the baseline model plus topic modeling, and the baseline model plus clinical keywords. RESULTS: Predictive models demonstrated a range of performance, with F-measures of 0.59 for the baseline model, 0.63 for the baseline plus topic modeling model, and 0.74 for the baseline plus clinical keyword model. The baseline plus topic model displayed low precision but high recall, while the baseline plus clinical keyword model displayed high precision but low recall. Clinical keywords including ‘msm’, ‘unprotected’, ‘hiv’, and ‘methamphetamine’ were indicative of elevated risk. CONCLUSION: Clinical notes improved the performance of predictive models for automated HIV risk assessment. Future studies should explore novel techniques for extracting social and behavioral determinants from unstructured text in longitudinal EHRs. DISCLOSURES: All authors: No reported disclosures.
format Online
Article
Text
id pubmed-5631278
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5631278 2017-11-07 HIV Risk Assessment using Longitudinal Electronic Health Records Feller, Daniel Zucker, Jason Yin, Michael Gordon, Peter Elhadad, Noemie Open Forum Infect Dis Abstracts Oxford University Press 2017-10-04 /pmc/articles/PMC5631278/ http://dx.doi.org/10.1093/ofid/ofx163.1049 Text en © The Author 2017. Published by Oxford University Press on behalf of Infectious Diseases Society of America. http://creativecommons.org/licenses/by-nc-nd/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title HIV Risk Assessment using Longitudinal Electronic Health Records
topic Abstracts
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5631278/
http://dx.doi.org/10.1093/ofid/ofx163.1049