Genetic validation of bipolar disorder identified by automated phenotyping using electronic health records
Bipolar disorder (BD) is a heritable mood disorder characterized by episodes of mania and depression. Although genomewide association studies (GWAS) have successfully identified genetic loci contributing to BD risk, sample size has become a rate-limiting obstacle to genetic discovery. Electronic hea...
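Main authors: | Chen, Chia-Yen; Lee, Phil H.; Castro, Victor M.; Minnier, Jessica; Charney, Alexander W.; Stahl, Eli A.; Ruderfer, Douglas M.; Murphy, Shawn N.; Gainer, Vivian; Cai, Tianxi; Jones, Ian; Pato, Carlos N.; Pato, Michele T.; Landén, Mikael; Sklar, Pamela; Perlis, Roy H.; Smoller, Jordan W. |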
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5904248/ https://www.ncbi.nlm.nih.gov/pubmed/29666432 http://dx.doi.org/10.1038/s41398-018-0133-7 |
_version_ | 1783315065063079936 |
---|---|
author | Chen, Chia-Yen Lee, Phil H. Castro, Victor M. Minnier, Jessica Charney, Alexander W. Stahl, Eli A. Ruderfer, Douglas M. Murphy, Shawn N. Gainer, Vivian Cai, Tianxi Jones, Ian Pato, Carlos N. Pato, Michele T. Landén, Mikael Sklar, Pamela Perlis, Roy H. Smoller, Jordan W. |
author_sort | Chen, Chia-Yen |
collection | PubMed |
description | Bipolar disorder (BD) is a heritable mood disorder characterized by episodes of mania and depression. Although genomewide association studies (GWAS) have successfully identified genetic loci contributing to BD risk, sample size has become a rate-limiting obstacle to genetic discovery. Electronic health records (EHRs) represent a vast but relatively untapped resource for high-throughput phenotyping. As part of the International Cohort Collection for Bipolar Disorder (ICCBD), we previously validated automated EHR-based phenotyping algorithms for BD against in-person diagnostic interviews (Castro et al. Am J Psychiatry 172:363–372, 2015). Here, we establish the genetic validity of these phenotypes by determining their genetic correlation with traditionally ascertained samples. Case and control algorithms were derived from structured and narrative text in the Partners Healthcare system comprising more than 4.6 million patients over 20 years. Genomewide genotype data for 3330 BD cases and 3952 controls of European ancestry were used to estimate SNP-based heritability ($h^2_g$) and genetic correlation ($r_g$) between EHR-based phenotype definitions and traditionally ascertained BD cases in GWAS by the ICCBD and Psychiatric Genomics Consortium (PGC) using LD score regression. We evaluated BD cases identified using four EHR-based algorithms: an NLP-based algorithm (95-NLP) and three rule-based algorithms using codified EHR with decreasing levels of stringency: “coded-strict”, “coded-broad”, and “coded-broad based on a single clinical encounter” (coded-broad-SV). The analytic sample comprised 862 95-NLP, 1968 coded-strict, 2581 coded-broad, and 408 coded-broad-SV BD cases, and 3952 controls. The estimated $h^2_g$ were 0.24 (p = 0.015), 0.09 (p = 0.064), 0.13 (p = 0.003), and 0.00 (p = 0.591) for 95-NLP, coded-strict, coded-broad, and coded-broad-SV BD, respectively. The $h^2_g$ for all EHR-based cases combined except coded-broad-SV (excluded due to an $h^2_g$ of 0) was 0.12 (p = 0.004). These $h^2_g$ were lower than or similar to the $h^2_g$ observed by the ICCBD + PGCBD (0.23, $p = 3.17 \times 10^{-80}$, total N = 33,181). However, the $r_g$ between ICCBD + PGCBD and the EHR-based cases were high for 95-NLP (0.66, $p = 3.69 \times 10^{-5}$), coded-strict (1.00, $p = 2.40 \times 10^{-4}$), and coded-broad (0.74, $p = 8.11 \times 10^{-7}$). The $r_g$ between EHR-based BD definitions ranged from 0.90 to 0.98. These results provide the first genetic validation of automated EHR-based phenotyping for BD and suggest that this approach identifies cases that are highly genetically correlated with those ascertained through conventional methods. High-throughput phenotyping using the large data resources available in EHRs represents a viable method for accelerating psychiatric genetic research. |
format | Online Article Text |
id | pubmed-5904248 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-5904248 2018-04-20 Transl Psychiatry Article Nature Publishing Group UK 2018-04-18 /pmc/articles/PMC5904248/ /pubmed/29666432 http://dx.doi.org/10.1038/s41398-018-0133-7 Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Genetic validation of bipolar disorder identified by automated phenotyping using electronic health records |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5904248/ https://www.ncbi.nlm.nih.gov/pubmed/29666432 http://dx.doi.org/10.1038/s41398-018-0133-7 |
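The description field reports SNP-based heritability and genetic correlation from LD score regression, and notes that coded-broad-SV was excluded because its heritability estimate was 0. A minimal sketch of why a zero heritability is a problem, assuming the standard definition of genetic correlation as genetic covariance divided by the square root of the product of the two heritabilities. The function name `genetic_correlation`, its parameters, and the input values are hypothetical and are not taken from the study or from any LD score regression software.

```python
import math
from typing import Optional


def genetic_correlation(cov_g: float, h2_a: float, h2_b: float) -> Optional[float]:
    """Return r_g = cov_g / sqrt(h2_a * h2_b), or None when it is undefined."""
    # The ratio has no meaningful value when either heritability
    # estimate is zero or negative, so report it as undefined.
    if h2_a <= 0 or h2_b <= 0:
        return None
    return cov_g / math.sqrt(h2_a * h2_b)


# Made-up inputs for illustration only; these are not values from the study.
print(genetic_correlation(0.12, 0.24, 0.24))
print(genetic_correlation(0.05, 0.0, 0.23))
```

Under this definition, a phenotype with an estimated heritability of 0 gives no usable genetic correlation, which is consistent with the record reporting correlations only for 95-NLP, coded-strict, and coded-broad.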