
414. Developing Digital Phenotypes of Primary Immune Deficiencies Using Machine Learning on a Large Electronic Health Record Database

BACKGROUND: More than 350 genetic disorders cause immune deficiencies; given the rarity of these conditions, in-depth study of infections associated with primary immune deficiencies (PID) requires extremely large sample sizes from broad populations. Using a large electronic health record (EHR) datas...


Bibliographic Details
Main Authors: Meister, Leo, Zerbe, Christa, Notarangelo, Luigi D, Kadri, Sameer S, Rebecca Prevots, D, Ricotta, Emily
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6809140/
http://dx.doi.org/10.1093/ofid/ofz360.487
_version_ 1783461911838326784
author Meister, Leo
Zerbe, Christa
Notarangelo, Luigi D
Kadri, Sameer S
Rebecca Prevots, D
Ricotta, Emily
author_sort Meister, Leo
collection PubMed
description BACKGROUND: More than 350 genetic disorders cause immune deficiencies. Because these conditions are rare, in-depth study of infections associated with primary immune deficiencies (PID) requires extremely large sample sizes from broad populations. Using a large electronic health record (EHR) dataset, we linked clinical and microbiologic data to develop digital phenotypes for PID.
METHODS: Using the Cerner HealthFacts EHR dataset from 2009 to 2017, we extracted clinical and microbiologic data for hospitalizations of patients <18 years old with ICD-9/10 PID diagnoses and ≥1 positive culture for infection. Machine learning models were used to identify key features that predict PID diagnosis. Features included patient and hospitalization characteristics, infectious agent and infection site, and selected comorbidities. Models were validated using the area under the receiver operating characteristic curve (AUC).
RESULTS: Overall, 1316 patients with a PID were identified (Table 1). The 10 most common pathogens identified, by PID, are listed in Table 2. The models classified DiGeorge syndrome (positive predictive value [PPV] 49%), functional disorders of polymorphonuclear neutrophils (PMN) (PPV 43%), and common variable immunodeficiency (CVID) (PPV 47%) better than combined immunodeficiency (CID) (PPV 20%); the overall true positive rate was 47%, with an AUC of 0.73. Predictive features for each PID were as follows: for CVID, having enteritis, hypertension, and pneumonia (Figure 1a); for PMN, having hypoxia and hypertension (Figure 1b); for DiGeorge syndrome, having congenital deformities and not having hypertension (Figure 1c); and for CID, finding Staphylococcus aureus in a wound or Escherichia coli in the blood (Figure 1d).
CONCLUSION: Early models demonstrate some discrimination, specifically for more common PIDs (CVID) and those with highly identifying factors (DiGeorge syndrome). These models can be improved by including a wider array of clinical data, and they provide a first look at a new methodology to digitally phenotype PIDs for future diagnostic use.
DISCLOSURES: All authors: No reported disclosures.
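The abstract reports three evaluation metrics: positive predictive value, true positive rate, and AUC. As a rough illustration of how such figures are computed, the sketch below works through a one-vs-rest (binary) case in plain Python. The labels, scores, threshold, and function names are all made up for illustration; this is not the authors' model, data, or code, and the abstract does not say which software or threshold was used.

```python
from typing import Sequence, Tuple


def confusion_counts(labels: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: share of positive predictions that are correct."""
    return tp / (tp + fp)


def tpr(tp: int, fn: int) -> float:
    """True positive rate (sensitivity): share of actual positives that are found."""
    return tp / (tp + fn)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = has the PID of interest, 0 = any other PID.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]

tp, fp, fn, tn = confusion_counts(labels, scores, threshold=0.5)
print("PPV:", ppv(tp, fp))
print("TPR:", tpr(tp, fn))
print("AUC:", auc(labels, scores))
```

PPV and TPR depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason a model can show a moderate AUC alongside fairly low per-class PPVs.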
format Online
Article
Text
id pubmed-6809140
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6809140 2019-10-28 Open Forum Infect Dis Abstracts Oxford University Press 2019-10-23 /pmc/articles/PMC6809140/ http://dx.doi.org/10.1093/ofid/ofz360.487 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Infectious Diseases Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title 414. Developing Digital Phenotypes of Primary Immune Deficiencies Using Machine Learning on a Large Electronic Health Record Database
topic Abstracts
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6809140/
http://dx.doi.org/10.1093/ofid/ofz360.487