414. Developing Digital Phenotypes of Primary Immune Deficiencies Using Machine Learning on a Large Electronic Health Record Database
BACKGROUND: More than 350 genetic disorders cause immune deficiencies; given the rarity of these conditions, in-depth study of infections associated with primary immune deficiencies (PID) requires extremely large sample sizes from broad populations. Using a large electronic health record (EHR) datas...
Main Authors: | Meister, Leo; Zerbe, Christa; Notarangelo, Luigi D; Kadri, Sameer S; Rebecca Prevots, D; Ricotta, Emily |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Abstracts |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6809140/ http://dx.doi.org/10.1093/ofid/ofz360.487 |
_version_ | 1783461911838326784 |
---|---|
author | Meister, Leo; Zerbe, Christa; Notarangelo, Luigi D; Kadri, Sameer S; Rebecca Prevots, D; Ricotta, Emily |
author_sort | Meister, Leo |
collection | PubMed |
description | BACKGROUND: More than 350 genetic disorders cause immune deficiencies. Because these conditions are rare, in-depth study of infections associated with primary immune deficiencies (PID) requires extremely large sample sizes from broad populations. Using a large electronic health record (EHR) dataset, we linked clinical and microbiologic data to develop digital phenotypes for PID. METHODS: Using the Cerner HealthFacts EHR dataset from 2009 to 2017, we extracted clinical and microbiologic data for hospitalizations of patients <18 years old with ICD-9/10 PID diagnoses and ≥1 positive culture for infection. Machine learning models were used to identify key features that predict PID diagnosis. Features included patient and hospitalization characteristics, infectious agent and infection site, and selected comorbidities. Models were validated using the area under the receiver operating characteristic curve (AUC). RESULTS: Overall, 1316 patients with a PID were identified (Table 1). The 10 most common pathogens identified, by PID, are listed in Table 2. The models classified DiGeorge syndrome (positive predictive value [PPV] 49%), functional disorders of polymorphonuclear neutrophils (PMN) (PPV 43%), and common variable immunodeficiency (CVID) (PPV 47%) better than combined immunodeficiency (CID) (PPV 20%); the overall true positive rate was 47%, with an AUC of 0.73. Predictive features for each PID were as follows. CVID: having enteritis, hypertension, and pneumonia (Figure 1a); PMN: having hypoxia and hypertension (Figure 1b); DiGeorge syndrome: having congenital deformities and not having hypertension (Figure 1c); CID: finding Staphylococcus aureus in a wound or Escherichia coli in the blood (Figure 1d). CONCLUSION: Early models demonstrate some discrimination, specifically for more common PIDs (CVID) and those with highly identifying factors (DiGeorge syndrome). These models can be improved by including a wider array of clinical data, and they provide a first look at a new methodology to digitally phenotype PIDs for future diagnostic use. DISCLOSURES: All authors: No reported disclosures. |
format | Online Article Text |
id | pubmed-6809140 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6809140 2019-10-28 Open Forum Infect Dis Abstracts Oxford University Press 2019-10-23 /pmc/articles/PMC6809140/ http://dx.doi.org/10.1093/ofid/ofz360.487 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Infectious Diseases Society of America. http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | 414. Developing Digital Phenotypes of Primary Immune Deficiencies Using Machine Learning on a Large Electronic Health Record Database |
topic | Abstracts |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6809140/ http://dx.doi.org/10.1093/ofid/ofz360.487 |
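The abstract in this record reports three evaluation metrics: positive predictive value (PPV), true positive rate, and AUC. The sketch below shows how each is commonly computed for a single class in a one-vs-rest setting. It is an illustration only, not the authors' pipeline: the labels, scores, and the 0.5 threshold are made-up toy values, and the function names are hypothetical.

```python
from typing import Sequence


def ppv(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


def true_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate (sensitivity): TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive cases")
    return tp / (tp + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the fraction of positive/negative pairs in which the positive case
    receives the higher score, counting ties as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = has the PID of interest, 0 = any other PID.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

print(f"PPV={ppv(y_true, y_pred):.2f}")
print(f"TPR={true_positive_rate(y_true, y_pred):.2f}")
print(f"AUC={auc(y_true, scores):.3f}")
```

PPV and true positive rate depend on the chosen cutoff, whereas AUC summarizes ranking quality across all cutoffs, which is why a model can show a moderate AUC alongside class-specific PPVs that differ widely.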