Implicit bias of encoded variables: frameworks for addressing structured bias in EHR–GWAS data

The ‘discovery’ stage of genome-wide association studies required amassing large, homogeneous cohorts. In order to attain clinically useful insights, we must now consider the presentation of disease within our clinics and, by extension, within our medical records. Large-scale use of electronic healt...

Bibliographic Details
Main authors: Dueñas, Hillary R; Seah, Carina; Johnson, Jessica S; Huckins, Laura M
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7530523/
https://www.ncbi.nlm.nih.gov/pubmed/32879975
http://dx.doi.org/10.1093/hmg/ddaa192
_version_ 1783589586628247552
author Dueñas, Hillary R
Seah, Carina
Johnson, Jessica S
Huckins, Laura M
author_sort Dueñas, Hillary R
collection PubMed
description The ‘discovery’ stage of genome-wide association studies required amassing large, homogeneous cohorts. In order to attain clinically useful insights, we must now consider the presentation of disease within our clinics and, by extension, within our medical records. Large-scale use of electronic health record (EHR) data can help to understand phenotypes in a scalable manner, incorporating lifelong and whole-phenome context. However, extending analyses to incorporate EHR and biobank-based analyses will require careful consideration of phenotype definition. Judgements and clinical decisions that occur ‘outside’ the system inevitably contain some degree of bias and become encoded in EHR data. Any algorithmic approach to phenotypic characterization that assumes non-biased variables will generate compounded biased conclusions. Here, we discuss and illustrate potential biases inherent within EHR analyses, how these may be compounded across time and suggest frameworks for large-scale phenotypic analysis to minimize and uncover encoded bias.
format Online
Article
Text
id pubmed-7530523
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7530523 2020-10-07 Hum Mol Genet Invited Review Article Oxford University Press 2020-09-02 /pmc/articles/PMC7530523/ /pubmed/32879975 http://dx.doi.org/10.1093/hmg/ddaa192 Text en © The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Implicit bias of encoded variables: frameworks for addressing structured bias in EHR–GWAS data
topic Invited Review Article