Genetic validation of bipolar disorder identified by automated phenotyping using electronic health records

Bipolar disorder (BD) is a heritable mood disorder characterized by episodes of mania and depression. Although genomewide association studies (GWAS) have successfully identified genetic loci contributing to BD risk, sample size has become a rate-limiting obstacle to genetic discovery. Electronic hea...

Bibliographic Details
Main Authors: Chen, Chia-Yen, Lee, Phil H., Castro, Victor M., Minnier, Jessica, Charney, Alexander W., Stahl, Eli A., Ruderfer, Douglas M., Murphy, Shawn N., Gainer, Vivian, Cai, Tianxi, Jones, Ian, Pato, Carlos N., Pato, Michele T., Landén, Mikael, Sklar, Pamela, Perlis, Roy H., Smoller, Jordan W.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5904248/
https://www.ncbi.nlm.nih.gov/pubmed/29666432
http://dx.doi.org/10.1038/s41398-018-0133-7
author Chen, Chia-Yen
Lee, Phil H.
Castro, Victor M.
Minnier, Jessica
Charney, Alexander W.
Stahl, Eli A.
Ruderfer, Douglas M.
Murphy, Shawn N.
Gainer, Vivian
Cai, Tianxi
Jones, Ian
Pato, Carlos N.
Pato, Michele T.
Landén, Mikael
Sklar, Pamela
Perlis, Roy H.
Smoller, Jordan W.
author_sort Chen, Chia-Yen
collection PubMed
description Bipolar disorder (BD) is a heritable mood disorder characterized by episodes of mania and depression. Although genomewide association studies (GWAS) have successfully identified genetic loci contributing to BD risk, sample size has become a rate-limiting obstacle to genetic discovery. Electronic health records (EHRs) represent a vast but relatively untapped resource for high-throughput phenotyping. As part of the International Cohort Collection for Bipolar Disorder (ICCBD), we previously validated automated EHR-based phenotyping algorithms for BD against in-person diagnostic interviews (Castro et al. Am J Psychiatry 172:363–372, 2015). Here, we establish the genetic validity of these phenotypes by determining their genetic correlation with traditionally ascertained samples. Case and control algorithms were derived from structured and narrative text in the Partners Healthcare system comprising more than 4.6 million patients over 20 years. Genomewide genotype data for 3330 BD cases and 3952 controls of European ancestry were used to estimate SNP-based heritability ($h^2_g$) and genetic correlation ($r_g$) between EHR-based phenotype definitions and traditionally ascertained BD cases in GWAS by the ICCBD and Psychiatric Genomics Consortium (PGC) using LD score regression. We evaluated BD cases identified using four EHR-based algorithms: an NLP-based algorithm (95-NLP) and three rule-based algorithms using codified EHR with decreasing levels of stringency: “coded-strict”, “coded-broad”, and “coded-broad based on a single clinical encounter” (coded-broad-SV). The analytic sample comprised 862 95-NLP, 1968 coded-strict, 2581 coded-broad, and 408 coded-broad-SV BD cases, and 3952 controls. The estimated $h^2_g$ were 0.24 (p = 0.015), 0.09 (p = 0.064), 0.13 (p = 0.003), and 0.00 (p = 0.591) for 95-NLP, coded-strict, coded-broad, and coded-broad-SV BD, respectively. The $h^2_g$ for all EHR-based cases combined except coded-broad-SV (excluded because its estimated $h^2_g$ was 0) was 0.12 (p = 0.004). These $h^2_g$ estimates were lower than or similar to the $h^2_g$ observed by the ICCBD + PGCBD (0.23, $p = 3.17 \times 10^{-80}$, total N = 33,181). However, the $r_g$ between ICCBD + PGCBD and the EHR-based cases were high for 95-NLP (0.66, $p = 3.69 \times 10^{-5}$), coded-strict (1.00, $p = 2.40 \times 10^{-4}$), and coded-broad (0.74, $p = 8.11 \times 10^{-7}$). The $r_g$ between EHR-based BD definitions ranged from 0.90 to 0.98. These results provide the first genetic validation of automated EHR-based phenotyping for BD and suggest that this approach identifies cases that are highly genetically correlated with those ascertained through conventional methods. High-throughput phenotyping using the large data resources available in EHRs represents a viable method for accelerating psychiatric genetic research.
format Online
Article
Text
id pubmed-5904248
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-5904248 2018-04-20 Genetic validation of bipolar disorder identified by automated phenotyping using electronic health records Transl Psychiatry Article Nature Publishing Group UK 2018-04-18 /pmc/articles/PMC5904248/ /pubmed/29666432 http://dx.doi.org/10.1038/s41398-018-0133-7 Text en © The Author(s) 2018 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Genetic validation of bipolar disorder identified by automated phenotyping using electronic health records
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5904248/
https://www.ncbi.nlm.nih.gov/pubmed/29666432
http://dx.doi.org/10.1038/s41398-018-0133-7