The phenotype-genotype reference map: Improving biobank data science through replication
Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, w...
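Main Authors: | Bastarache, Lisa; Delozier, Sarah; Pandit, Anita; He, Jing; Lewis, Adam; Annis, Aubrey C.; LeFaive, Jonathon; Denny, Joshua C.; Carroll, Robert J.; Altman, Russ B.; Hughey, Jacob J.; Zawistowski, Matthew; Peterson, Josh F. |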
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10502848/ https://www.ncbi.nlm.nih.gov/pubmed/37607538 http://dx.doi.org/10.1016/j.ajhg.2023.07.012 |
_version_ | 1785106403433644032 |
---|---|
author | Bastarache, Lisa Delozier, Sarah Pandit, Anita He, Jing Lewis, Adam Annis, Aubrey C. LeFaive, Jonathon Denny, Joshua C. Carroll, Robert J. Altman, Russ B. Hughey, Jacob J. Zawistowski, Matthew Peterson, Josh F. |
author_sort | Bastarache, Lisa |
collection | PubMed |
description | Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, we introduce the phenotype-genotype reference map (PGRM), a set of 5,879 genetic associations from 523 GWAS publications that can be used for high-throughput replication experiments. PGRM phenotypes are standardized as phecodes, ensuring interoperability between biobanks. We applied the PGRM to five ancestry-specific cohorts from four independent biobanks and found evidence of robust replications across a wide array of phenotypes. We show how the PGRM can be used to detect data corruption and to empirically assess parameters for phenome-wide studies. Finally, we use the PGRM to explore factors associated with replicability of GWAS results. |
format | Online Article Text |
id | pubmed-10502848 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-10502848 2023-09-16 The phenotype-genotype reference map: Improving biobank data science through replication Bastarache, Lisa Delozier, Sarah Pandit, Anita He, Jing Lewis, Adam Annis, Aubrey C. LeFaive, Jonathon Denny, Joshua C. Carroll, Robert J. Altman, Russ B. Hughey, Jacob J. Zawistowski, Matthew Peterson, Josh F. Am J Hum Genet Article Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, we introduce the phenotype-genotype reference map (PGRM), a set of 5,879 genetic associations from 523 GWAS publications that can be used for high-throughput replication experiments. PGRM phenotypes are standardized as phecodes, ensuring interoperability between biobanks. We applied the PGRM to five ancestry-specific cohorts from four independent biobanks and found evidence of robust replications across a wide array of phenotypes. We show how the PGRM can be used to detect data corruption and to empirically assess parameters for phenome-wide studies. Finally, we use the PGRM to explore factors associated with replicability of GWAS results. Elsevier 2023-09-07 2023-08-21 /pmc/articles/PMC10502848/ /pubmed/37607538 http://dx.doi.org/10.1016/j.ajhg.2023.07.012 Text en © 2023 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
title | The phenotype-genotype reference map: Improving biobank data science through replication |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10502848/ https://www.ncbi.nlm.nih.gov/pubmed/37607538 http://dx.doi.org/10.1016/j.ajhg.2023.07.012 |