
The phenotype-genotype reference map: Improving biobank data science through replication

Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, w...


Bibliographic Details
Main Authors: Bastarache, Lisa, Delozier, Sarah, Pandit, Anita, He, Jing, Lewis, Adam, Annis, Aubrey C., LeFaive, Jonathon, Denny, Joshua C., Carroll, Robert J., Altman, Russ B., Hughey, Jacob J., Zawistowski, Matthew, Peterson, Josh F.
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10502848/
https://www.ncbi.nlm.nih.gov/pubmed/37607538
http://dx.doi.org/10.1016/j.ajhg.2023.07.012
_version_ 1785106403433644032
author Bastarache, Lisa
Delozier, Sarah
Pandit, Anita
He, Jing
Lewis, Adam
Annis, Aubrey C.
LeFaive, Jonathon
Denny, Joshua C.
Carroll, Robert J.
Altman, Russ B.
Hughey, Jacob J.
Zawistowski, Matthew
Peterson, Josh F.
author_sort Bastarache, Lisa
collection PubMed
description Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, we introduce the phenotype-genotype reference map (PGRM), a set of 5,879 genetic associations from 523 GWAS publications that can be used for high-throughput replication experiments. PGRM phenotypes are standardized as phecodes, ensuring interoperability between biobanks. We applied the PGRM to five ancestry-specific cohorts from four independent biobanks and found evidence of robust replications across a wide array of phenotypes. We show how the PGRM can be used to detect data corruption and to empirically assess parameters for phenome-wide studies. Finally, we use the PGRM to explore factors associated with replicability of GWAS results.
format Online
Article
Text
id pubmed-10502848
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10502848 2023-09-16 The phenotype-genotype reference map: Improving biobank data science through replication Bastarache, Lisa Delozier, Sarah Pandit, Anita He, Jing Lewis, Adam Annis, Aubrey C. LeFaive, Jonathon Denny, Joshua C. Carroll, Robert J. Altman, Russ B. Hughey, Jacob J. Zawistowski, Matthew Peterson, Josh F. Am J Hum Genet Article Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, we introduce the phenotype-genotype reference map (PGRM), a set of 5,879 genetic associations from 523 GWAS publications that can be used for high-throughput replication experiments. PGRM phenotypes are standardized as phecodes, ensuring interoperability between biobanks. We applied the PGRM to five ancestry-specific cohorts from four independent biobanks and found evidence of robust replications across a wide array of phenotypes. We show how the PGRM can be used to detect data corruption and to empirically assess parameters for phenome-wide studies. Finally, we use the PGRM to explore factors associated with replicability of GWAS results. Elsevier 2023-09-07 2023-08-21 /pmc/articles/PMC10502848/ /pubmed/37607538 http://dx.doi.org/10.1016/j.ajhg.2023.07.012 Text en © 2023 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title The phenotype-genotype reference map: Improving biobank data science through replication
topic Article