A Bayesian latent class approach for EHR‐based phenotyping

Phenotyping, ie, identification of patients possessing a characteristic of interest, is a fundamental task for research conducted using electronic health records. However, challenges to this task include imperfect sensitivity and specificity of clinical codes and inconsistent availability of more de...

Bibliographic Details
Main authors: Hubbard, Rebecca A., Huang, Jing, Harton, Joanna, Oganisian, Arman, Choi, Grace, Utidjian, Levon, Eneli, Ihuoma, Bailey, L. Charles, Chen, Yong
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6519239/
https://www.ncbi.nlm.nih.gov/pubmed/30252148
http://dx.doi.org/10.1002/sim.7953
author Hubbard, Rebecca A.
Huang, Jing
Harton, Joanna
Oganisian, Arman
Choi, Grace
Utidjian, Levon
Eneli, Ihuoma
Bailey, L. Charles
Chen, Yong
collection PubMed
description Phenotyping, ie, identification of patients possessing a characteristic of interest, is a fundamental task for research conducted using electronic health records. However, challenges to this task include imperfect sensitivity and specificity of clinical codes and inconsistent availability of more detailed data such as laboratory test results. Despite these challenges, most existing electronic health records–derived phenotypes are rule‐based, consisting of a series of Boolean arguments informed by expert knowledge of the disease of interest and its coding. The objective of this paper is to introduce a Bayesian latent phenotyping approach that accounts for imperfect data elements and missing not at random missingness patterns that can be used when no gold‐standard data are available. We conducted simulation studies to compare alternative phenotyping methods under different patterns of missingness and applied these approaches to a cohort of 68 265 children at elevated risk for type 2 diabetes mellitus (T2DM). In simulation studies, the latent class approach had similar sensitivity to a rule‐based approach (95.9% vs 91.9%) while substantially improving specificity (99.7% vs 90.8%). In the PEDSnet cohort, we found that biomarkers and clinical codes were strongly associated with latent T2DM status. The latent T2DM class was also strongly predictive of missingness in biomarkers. Glucose was missing in 83.4% of patients (odds ratio for latent T2DM status = 0.52) while hemoglobin A1c was missing in 91.2% (odds ratio for latent T2DM status = 0.03), suggesting missing not at random missingness. The latent phenotype approach may substantially improve on rule‐based phenotyping.
format Online
Article
Text
id pubmed-6519239
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6519239 2019-05-21 A Bayesian latent class approach for EHR‐based phenotyping Hubbard, Rebecca A. Huang, Jing Harton, Joanna Oganisian, Arman Choi, Grace Utidjian, Levon Eneli, Ihuoma Bailey, L. Charles Chen, Yong Stat Med Research Articles Phenotyping, ie, identification of patients possessing a characteristic of interest, is a fundamental task for research conducted using electronic health records. However, challenges to this task include imperfect sensitivity and specificity of clinical codes and inconsistent availability of more detailed data such as laboratory test results. Despite these challenges, most existing electronic health records–derived phenotypes are rule‐based, consisting of a series of Boolean arguments informed by expert knowledge of the disease of interest and its coding. The objective of this paper is to introduce a Bayesian latent phenotyping approach that accounts for imperfect data elements and missing not at random missingness patterns that can be used when no gold‐standard data are available. We conducted simulation studies to compare alternative phenotyping methods under different patterns of missingness and applied these approaches to a cohort of 68 265 children at elevated risk for type 2 diabetes mellitus (T2DM). In simulation studies, the latent class approach had similar sensitivity to a rule‐based approach (95.9% vs 91.9%) while substantially improving specificity (99.7% vs 90.8%). In the PEDSnet cohort, we found that biomarkers and clinical codes were strongly associated with latent T2DM status. The latent T2DM class was also strongly predictive of missingness in biomarkers. Glucose was missing in 83.4% of patients (odds ratio for latent T2DM status = 0.52) while hemoglobin A1c was missing in 91.2% (odds ratio for latent T2DM status = 0.03), suggesting missing not at random missingness. The latent phenotype approach may substantially improve on rule‐based phenotyping. John Wiley and Sons Inc.
2018-09-03 2019-01-15 /pmc/articles/PMC6519239/ /pubmed/30252148 http://dx.doi.org/10.1002/sim.7953 Text en © 2018 The Authors. Statistics in Medicine Published by John Wiley & Sons, Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title A Bayesian latent class approach for EHR‐based phenotyping
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6519239/
https://www.ncbi.nlm.nih.gov/pubmed/30252148
http://dx.doi.org/10.1002/sim.7953