Performance of an electronic health record-based phenotype algorithm to identify community associated methicillin-resistant Staphylococcus aureus cases and controls for genetic association studies

Bibliographic Details
Main authors: Jackson, Kathryn L., Mbagwu, Michael, Pacheco, Jennifer A., Baldridge, Abigail S., Viox, Daniel J., Linneman, James G., Shukla, Sanjay K., Peissig, Peggy L., Borthwick, Kenneth M., Carrell, David A., Bielinski, Suzette J., Kirby, Jacqueline C., Denny, Joshua C., Mentch, Frank D., Vazquez, Lyam M., Rasmussen-Torvik, Laura J., Kho, Abel N.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5114817/
https://www.ncbi.nlm.nih.gov/pubmed/27855652
http://dx.doi.org/10.1186/s12879-016-2020-2
Description
Summary: BACKGROUND: Community associated methicillin-resistant Staphylococcus aureus (CA-MRSA) is one of the most common causes of skin and soft tissue infections in the United States, and a variety of genetic host factors are suspected to be risk factors for recurrent infection. Based on the CDC definition, we have developed and validated an electronic health record (EHR) based CA-MRSA phenotype algorithm utilizing both structured and unstructured data. METHODS: The algorithm was validated at three eMERGE consortium sites, and positive predictive value (PPV), negative predictive value (NPV), and sensitivity were calculated. The algorithm was then run and data collected across seven total sites. The resulting data were used in a genome-wide association study (GWAS) analysis. RESULTS: Across seven sites, the CA-MRSA phenotype algorithm identified a total of 349 cases and 7761 controls among the genotyped European and African American biobank populations. PPV ranged from 68 to 100% for cases and 96 to 100% for controls; sensitivity ranged from 94 to 100% for cases and 75 to 100% for controls. Frequency of cases in the populations varied widely by site. There were no plausible GWAS-significant (p < 5 × 10⁻⁸) findings. CONCLUSIONS: Differences in EHR data representation and screening patterns across sites may have affected identification of cases and controls and accounted for varying frequencies across sites. Future work identifying these patterns is necessary. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12879-016-2020-2) contains supplementary material, which is available to authorized users.
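
The validation metrics named in the summary (positive predictive value, negative predictive value, and sensitivity) follow from a standard 2x2 comparison of algorithm output against manual chart review. The following is a minimal sketch of that arithmetic only, not the authors' code: the `ChartReview` class, the `validation_metrics` function, and the counts in the example are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartReview:
    """Counts from comparing algorithm labels with manual chart review.

    All field names are illustrative assumptions, not the study's schema.
    """
    tp: int  # algorithm-positive, confirmed positive on review
    fp: int  # algorithm-positive, not confirmed on review
    fn: int  # algorithm-negative, but positive on review
    tn: int  # algorithm-negative, confirmed negative on review


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a denominator is empty.
    return numerator / denominator if denominator else float("nan")


def validation_metrics(r: ChartReview) -> dict:
    """Compute PPV, NPV, and sensitivity from the 2x2 counts."""
    return {
        "ppv": _ratio(r.tp, r.tp + r.fp),          # confirmed / flagged
        "npv": _ratio(r.tn, r.tn + r.fn),          # confirmed negative / flagged negative
        "sensitivity": _ratio(r.tp, r.tp + r.fn),  # flagged / all true positives
    }


# Hypothetical counts for illustration only; not data from the paper.
example = ChartReview(tp=45, fp=5, fn=3, tn=47)
metrics = validation_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because the summary reports PPV and sensitivity separately for cases and for controls, a per-site validation along these lines would presumably be run once with cases treated as the positive class and once with controls treated as the positive class.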