Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks
Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial and its impact on haplotype estimation (phasing)...
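Main authors: | Appadurai, Vivek; Bybjerg-Grauholm, Jonas; Krebs, Morten Dybdahl; Rosengren, Anders; Buil, Alfonso; Ingason, Andrés; Mors, Ole; Børglum, Anders D.; Hougaard, David M.; Nordentoft, Merete; Mortensen, Preben B.; Delaneau, Olivier; Werge, Thomas; Schork, Andrew J. |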
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876938/ https://www.ncbi.nlm.nih.gov/pubmed/36697501 http://dx.doi.org/10.1038/s42003-023-04477-y |
_version_ | 1784878274456846336 |
---|---|
author | Appadurai, Vivek Bybjerg-Grauholm, Jonas Krebs, Morten Dybdahl Rosengren, Anders Buil, Alfonso Ingason, Andrés Mors, Ole Børglum, Anders D. Hougaard, David M. Nordentoft, Merete Mortensen, Preben B. Delaneau, Olivier Werge, Thomas Schork, Andrew J. |
author_facet | Appadurai, Vivek Bybjerg-Grauholm, Jonas Krebs, Morten Dybdahl Rosengren, Anders Buil, Alfonso Ingason, Andrés Mors, Ole Børglum, Anders D. Hougaard, David M. Nordentoft, Merete Mortensen, Preben B. Delaneau, Olivier Werge, Thomas Schork, Andrew J. |
author_sort | Appadurai, Vivek |
collection | PubMed |
description | Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial and its impact on haplotype estimation (phasing) and whole genome imputation, necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals, genotyped in two stages, on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests and reduce the predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks. |
format | Online Article Text |
id | pubmed-9876938 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9876938 2023-01-27 Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks Appadurai, Vivek Bybjerg-Grauholm, Jonas Krebs, Morten Dybdahl Rosengren, Anders Buil, Alfonso Ingason, Andrés Mors, Ole Børglum, Anders D. Hougaard, David M. Nordentoft, Merete Mortensen, Preben B. Delaneau, Olivier Werge, Thomas Schork, Andrew J. Commun Biol Article Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial and its impact on haplotype estimation (phasing) and whole genome imputation, necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals, genotyped in two stages, on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests and reduce the predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks. 
Nature Publishing Group UK 2023-01-26 /pmc/articles/PMC9876938/ /pubmed/36697501 http://dx.doi.org/10.1038/s42003-023-04477-y Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Appadurai, Vivek Bybjerg-Grauholm, Jonas Krebs, Morten Dybdahl Rosengren, Anders Buil, Alfonso Ingason, Andrés Mors, Ole Børglum, Anders D. Hougaard, David M. Nordentoft, Merete Mortensen, Preben B. Delaneau, Olivier Werge, Thomas Schork, Andrew J. Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks |
title | Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks |
title_full | Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks |
title_fullStr | Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks |
title_full_unstemmed | Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks |
title_short | Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks |
title_sort | accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876938/ https://www.ncbi.nlm.nih.gov/pubmed/36697501 http://dx.doi.org/10.1038/s42003-023-04477-y |