The effects of electronic medical record phenotyping details on genetic association studies: HDL-C as a case study

BACKGROUND: Biorepositories linked to de-identified electronic medical records (EMRs) have the potential to complement traditional epidemiologic studies in genotype-phenotype studies of complex human diseases and traits. A major challenge in meeting this potential is the use of EMR-derived data to e...

Bibliographic Details
Main authors: Dumitrescu, Logan; Goodloe, Robert; Bradford, Yukiko; Farber-Eger, Eric; Boston, Jonathan; Crawford, Dana C
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4428098/
https://www.ncbi.nlm.nih.gov/pubmed/25969697
http://dx.doi.org/10.1186/s13040-015-0048-2
author Dumitrescu, Logan
Goodloe, Robert
Bradford, Yukiko
Farber-Eger, Eric
Boston, Jonathan
Crawford, Dana C
author_sort Dumitrescu, Logan
collection PubMed
description BACKGROUND: Biorepositories linked to de-identified electronic medical records (EMRs) have the potential to complement traditional epidemiologic studies in genotype-phenotype studies of complex human diseases and traits. A major challenge in meeting this potential is the use of EMR-derived data to extract phenotypes and covariates for genetic association studies. Unlike traditional epidemiologic data, EMR-derived data are collected for clinical care and are therefore highly variable across patients. The variability of clinical data coupled with the challenges associated with searching unstructured clinical notes requires the development of algorithms to extract phenotypes for analysis. Given the number of possible algorithms that could be developed for any one EMR-derived phenotype, we explored here the impact algorithm decision logic has on genetic association study results for a single quantitative trait, high density lipoprotein cholesterol (HDL-C). RESULTS: We used five different algorithms to extract HDL-C from African American subjects genotyped on the Illumina Metabochip (n = 11,519) as part of Epidemiologic Architecture for Genes Linked to Environment (EAGLE). Tests of association between HDL-C and genetic risk scores for HDL-C associated variants suggest that the genetic effect size does not vary substantially across the five HDL-C definitions. CONCLUSIONS: These data collectively suggest that, at least for this quantitative trait, algorithm decision logic and phenotyping details do not appreciably impact genetic association study test statistics.
format Online
Article
Text
id pubmed-4428098
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-44280982015-05-13 The effects of electronic medical record phenotyping details on genetic association studies: HDL-C as a case study Dumitrescu, Logan Goodloe, Robert Bradford, Yukiko Farber-Eger, Eric Boston, Jonathan Crawford, Dana C BioData Min Research BACKGROUND: Biorepositories linked to de-identified electronic medical records (EMRs) have the potential to complement traditional epidemiologic studies in genotype-phenotype studies of complex human diseases and traits. A major challenge in meeting this potential is the use of EMR-derived data to extract phenotypes and covariates for genetic association studies. Unlike traditional epidemiologic data, EMR-derived data are collected for clinical care and are therefore highly variable across patients. The variability of clinical data coupled with the challenges associated with searching unstructured clinical notes requires the development of algorithms to extract phenotypes for analysis. Given the number of possible algorithms that could be developed for any one EMR-derived phenotype, we explored here the impact algorithm decision logic has on genetic association study results for a single quantitative trait, high density lipoprotein cholesterol (HDL-C). RESULTS: We used five different algorithms to extract HDL-C from African American subjects genotyped on the Illumina Metabochip (n = 11,519) as part of Epidemiologic Architecture for Genes Linked to Environment (EAGLE). Tests of association between HDL-C and genetic risk scores for HDL-C associated variants suggest that the genetic effect size does not vary substantially across the five HDL-C definitions. CONCLUSIONS: These data collectively suggest that, at least for this quantitative trait, algorithm decision logic and phenotyping details do not appreciably impact genetic association study test statistics. 
BioMed Central 2015-05-06 /pmc/articles/PMC4428098/ /pubmed/25969697 http://dx.doi.org/10.1186/s13040-015-0048-2 Text en © Dumitrescu et al.; licensee BioMed Central. 2015 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title The effects of electronic medical record phenotyping details on genetic association studies: HDL-C as a case study
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4428098/
https://www.ncbi.nlm.nih.gov/pubmed/25969697
http://dx.doi.org/10.1186/s13040-015-0048-2