
Extending the use of GWAS data by combining data from different genetic platforms

BACKGROUND: In the past decade many Genome-wide Association Studies (GWAS) were performed that discovered new associations between single-nucleotide polymorphisms (SNPs) and various phenotypes. Imputation methods are widely used in GWAS. They facilitate the phenotype association with variants that a...


Bibliographic Details
Main Authors: van Iperen, E. P. A., Hovingh, G. K., Asselbergs, F. W., Zwinderman, A. H.
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5330464/
https://www.ncbi.nlm.nih.gov/pubmed/28245255
http://dx.doi.org/10.1371/journal.pone.0172082
author van Iperen, E. P. A.
Hovingh, G. K.
Asselbergs, F. W.
Zwinderman, A. H.
collection PubMed
description BACKGROUND: In the past decade many Genome-wide Association Studies (GWAS) were performed that discovered new associations between single-nucleotide polymorphisms (SNPs) and various phenotypes. Imputation methods are widely used in GWAS. They facilitate the phenotype association with variants that are not directly genotyped. Imputation methods can also be used to combine and analyse data genotyped on different genotyping arrays. In this study we investigated the imputation quality and efficiency of two different approaches of combining GWAS data from different genotyping platforms. We investigated whether combining data from different platforms before the actual imputation performs better than combining the data from different platforms after imputation.
METHODS: In total 979 unique individuals from the AMC-PAS cohort were genotyped on 3 different platforms. A total of 706 individuals were genotyped on the MetaboChip, a total of 757 individuals were genotyped on the 50K gene-centric Human CVD BeadChip, and a total of 955 individuals were genotyped on the HumanExome chip. A total of 397 individuals were genotyped on all 3 individual platforms. After pre-imputation quality control (QC), Minimac in combination with MaCH was used for the imputation of all samples with the 1000 Genomes reference panel. All imputed markers with an r² value < 0.3 were excluded in our post-imputation QC.
RESULTS: A total of 397 individuals were genotyped on all three platforms. All three datasets were carefully matched on strand, SNP ID and genomic coordinates. This resulted in a dataset of 979 unique individuals and a total of 258,925 unique markers. A total of 4,117,036 SNPs were available when imputation was performed before merging the three datasets. A total of 3,933,494 SNPs were available when imputation was done on the combined set. Our results suggest that imputation of individual datasets before merging performs slightly better than after combining the different datasets.
CONCLUSIONS: Imputation of datasets genotyped by different platforms before merging generates more SNPs than imputation after putting the datasets together.
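The post-imputation QC rule and the comparison of SNP counts described in the abstract can be illustrated with a minimal Python sketch. The `ImputedMarker` record, the function names and the toy marker IDs are hypothetical and exist only for illustration; they are not the MaCH or Minimac file formats, and this is not the authors' pipeline. The two SNP totals and the 0.3 threshold are taken from the abstract.

```python
from typing import Iterable, List, NamedTuple, Tuple


class ImputedMarker(NamedTuple):
    """Hypothetical record for one imputed marker (not a MaCH/Minimac format)."""
    marker_id: str
    rsq: float  # estimated imputation quality, r^2


def filter_markers(markers: Iterable[ImputedMarker],
                   min_rsq: float = 0.3) -> List[ImputedMarker]:
    """Keep markers that pass post-imputation QC.

    The abstract excludes markers with r^2 < 0.3, so a marker with
    r^2 exactly equal to 0.3 is kept.
    """
    return [m for m in markers if m.rsq >= min_rsq]


# SNP totals reported in the abstract.
SNPS_IMPUTE_THEN_MERGE = 4_117_036  # imputation before merging the datasets
SNPS_MERGE_THEN_IMPUTE = 3_933_494  # imputation on the combined dataset


def snp_gain(before_merge: int, after_merge: int) -> Tuple[int, float]:
    """Return the absolute and relative gain of imputing before merging."""
    diff = before_merge - after_merge
    return diff, diff / after_merge


# Toy markers (made-up IDs and r^2 values) to exercise the filter.
toy = [
    ImputedMarker("toy_marker_1", 0.95),
    ImputedMarker("toy_marker_2", 0.30),
    ImputedMarker("toy_marker_3", 0.12),
]
kept = filter_markers(toy)
diff, rel = snp_gain(SNPS_IMPUTE_THEN_MERGE, SNPS_MERGE_THEN_IMPUTE)

print([m.marker_id for m in kept])
print(f"extra SNPs when imputing before merging: {diff} ({rel:.1%})")
```

With the reported totals, imputing each dataset before merging yields 183,542 more SNPs than imputing the combined set, a relative gain of just under 5%, which is consistent with the abstract's wording of "slightly better".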
format Online
Article
Text
id pubmed-5330464
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5330464 2017-03-09 Extending the use of GWAS data by combining data from different genetic platforms van Iperen, E. P. A. Hovingh, G. K. Asselbergs, F. W. Zwinderman, A. H. PLoS One Research Article
Public Library of Science 2017-02-28 /pmc/articles/PMC5330464/ /pubmed/28245255 http://dx.doi.org/10.1371/journal.pone.0172082 Text en
© 2017 van Iperen et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Extending the use of GWAS data by combining data from different genetic platforms
topic Research Article