HIV Risk Assessment using Longitudinal Electronic Health Records
BACKGROUND: Universal HIV screening programs are costly, labor-intensive, and in practice unable to identify all individuals at risk of HIV infection. Automated risk assessment methods that leverage longitudinal electronic health records (EHRs) could catalyze targeted screening programs in Emergency...
Main authors: | Feller, Daniel; Zucker, Jason; Yin, Michael; Gordon, Peter; Elhadad, Noemie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2017 |
Subjects: | Abstracts |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5631278/ http://dx.doi.org/10.1093/ofid/ofx163.1049 |
_version_ | 1783269428635369472 |
---|---|
author | Feller, Daniel; Zucker, Jason; Yin, Michael; Gordon, Peter; Elhadad, Noemie |
author_facet | Feller, Daniel; Zucker, Jason; Yin, Michael; Gordon, Peter; Elhadad, Noemie |
author_sort | Feller, Daniel |
collection | PubMed |
description | BACKGROUND: Universal HIV screening programs are costly, labor-intensive, and in practice unable to identify all individuals at risk of HIV infection. Automated risk assessment methods that leverage longitudinal electronic health records (EHRs) could catalyze targeted screening programs in Emergency Departments and across public health jurisdictions. While information on social and behavioral determinants of health is typically collected in unstructured fields, previous analyses have only considered structured EHR data. We sought to characterize whether clinical notes can improve predictive models of HIV diagnosis. METHODS: 181 individuals who received care at an academic medical center in New York City prior to a confirmatory HIV diagnosis were included in the study cohort. 543 HIV-negative controls with similar utilization patterns were selected using propensity score matching. Demographics, laboratory tests, and diagnosis codes were extracted from longitudinal records. Clinical notes were preprocessed using both topic modeling and an n-grams approach. We fit 3 predictive models using Random Forests: a baseline model which included only structured EHR data, the baseline model plus topic modeling, and the baseline model plus clinical keywords. RESULTS: Predictive models demonstrated a range of performance, with F-measures of 0.59 for the baseline model, 0.63 for the baseline plus topic modeling model, and 0.74 for the baseline plus clinical keyword model. The baseline plus topic model displayed low precision but high recall, while the baseline plus clinical keyword model displayed high precision but low recall. Clinical keywords including ‘msm’, ‘unprotected’, ‘hiv’, and ‘methamphetamine’ were indicative of elevated risk. CONCLUSION: Clinical notes improved the performance of predictive models for automated HIV risk assessment. Future studies should explore novel techniques for extracting social and behavioral determinants from unstructured text in longitudinal EHRs. DISCLOSURES: All authors: No reported disclosures. |
format | Online Article Text |
id | pubmed-5631278 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5631278 2017-11-07 HIV Risk Assessment using Longitudinal Electronic Health Records. Feller, Daniel; Zucker, Jason; Yin, Michael; Gordon, Peter; Elhadad, Noemie. Open Forum Infect Dis, Abstracts. Oxford University Press 2017-10-04 /pmc/articles/PMC5631278/ http://dx.doi.org/10.1093/ofid/ofx163.1049 Text en © The Author 2017. Published by Oxford University Press on behalf of Infectious Diseases Society of America. http://creativecommons.org/licenses/by-nc-nd/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | HIV Risk Assessment using Longitudinal Electronic Health Records |
topic | Abstracts |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5631278/ http://dx.doi.org/10.1093/ofid/ofx163.1049 |
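
The abstract reports F-measures alongside opposite precision/recall trade-offs for the two note-based models. The sketch below shows how such a score can be computed from confusion counts. It assumes the reported F-measure is the balanced F1 score (the abstract does not say which variant was used), and the function name `f_measure` and all counts are hypothetical illustrations, not data from the study.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from confusion counts.

    With beta = 1 this is the harmonic mean of precision and recall.
    Assumption: the abstract's "F-measure" is this balanced F1 variant.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # share of flagged patients who are true cases
    recall = tp / (tp + fn)     # share of true cases that were flagged
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts (NOT from the study) showing the two trade-offs
# described in the abstract.
high_recall_low_precision = f_measure(tp=90, fp=110, fn=10)
high_precision_low_recall = f_measure(tp=60, fp=6, fn=40)

print(f"high recall, low precision: F1 = {high_recall_low_precision:.2f}")
print(f"high precision, low recall: F1 = {high_precision_low_recall:.2f}")
```

Because F1 is a harmonic mean, it is pulled toward the weaker of the two components, so two models with opposite precision/recall profiles can still be ranked on a single number, as the abstract does.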