Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks

Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial and its impact on haplotype estimation (phasing)...

Bibliographic Details
Main authors: Appadurai, Vivek; Bybjerg-Grauholm, Jonas; Krebs, Morten Dybdahl; Rosengren, Anders; Buil, Alfonso; Ingason, Andrés; Mors, Ole; Børglum, Anders D.; Hougaard, David M.; Nordentoft, Merete; Mortensen, Preben B.; Delaneau, Olivier; Werge, Thomas; Schork, Andrew J.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876938/
https://www.ncbi.nlm.nih.gov/pubmed/36697501
http://dx.doi.org/10.1038/s42003-023-04477-y
author Appadurai, Vivek
Bybjerg-Grauholm, Jonas
Krebs, Morten Dybdahl
Rosengren, Anders
Buil, Alfonso
Ingason, Andrés
Mors, Ole
Børglum, Anders D.
Hougaard, David M.
Nordentoft, Merete
Mortensen, Preben B.
Delaneau, Olivier
Werge, Thomas
Schork, Andrew J.
collection PubMed
description Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial and its impact on haplotype estimation (phasing) and whole genome imputation, necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals, genotyped in two stages, on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests and reduce the predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks.
format Online
Article
Text
id pubmed-9876938
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9876938 2023-01-27 Commun Biol Article
Nature Publishing Group UK 2023-01-26 /pmc/articles/PMC9876938/ /pubmed/36697501 http://dx.doi.org/10.1038/s42003-023-04477-y Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876938/
https://www.ncbi.nlm.nih.gov/pubmed/36697501
http://dx.doi.org/10.1038/s42003-023-04477-y