
GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies


Bibliographic Details
Main authors: Sulovari, Arvis; Li, Dawei
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4223508/
https://www.ncbi.nlm.nih.gov/pubmed/25038819
http://dx.doi.org/10.1186/1471-2164-15-610
author Sulovari, Arvis
Li, Dawei
collection PubMed
description BACKGROUND: Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires the same allele definition (nomenclature) and genome build across individual studies. Imputation, which is commonly used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in the use of different genome builds and allele definitions. Incorrectly assuming that combined GWAS share an identical allele definition leads to a large portion of genotypes being discarded or to incorrect association findings. There is no published tool that predicts and converts among all major allele definitions. RESULTS: In this study, we have developed a tool, GACT, which stands for Genome build and Allele definition Conversion Tool, that predicts and inter-converts between any of the common SNP allele definitions and between the major genome builds. In addition, we assessed several factors that may affect imputation quality, and our results indicated that inclusion of singletons in the reference had detrimental effects while ambiguous SNPs had no measurable effect. Unexpectedly, exclusion of genotypes with missing rate > 0.001 (40% of study SNPs) did not significantly decrease imputation quality, especially for rare SNPs; quality was even significantly higher than for imputation with singletons in the reference.
CONCLUSION: GACT is a new, powerful, and user-friendly tool with both command-line and interactive online versions that can accurately predict and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of genotypes from SNP arrays or deep sequencing, particularly for data from dbGaP and other public databases. GACT SOFTWARE: http://www.uvm.edu/genomics/software/gact
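The abstract mentions ambiguous SNPs and mismatched allele definitions between studies. As a rough illustration of why these matter when combining data, the sketch below checks whether a SNP is strand-ambiguous (A/T or C/G) and tries to express a study's alleles on a reference strand by complementing them. This is not GACT's implementation or interface; the function names (`is_strand_ambiguous`, `flip_strand`, `harmonize`) are hypothetical, and the logic covers only simple forward/reverse strand differences, not the full set of allele definitions the tool handles.

```python
# Illustrative sketch only; NOT the GACT algorithm or API.
# All function names here are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """Return True for A/T and C/G SNPs, whose two alleles are complements."""
    return COMPLEMENT[allele1.upper()] == allele2.upper()


def flip_strand(allele1, allele2):
    """Return both alleles as read from the opposite DNA strand."""
    return COMPLEMENT[allele1.upper()], COMPLEMENT[allele2.upper()]


def harmonize(study, reference):
    """Express a study's allele pair on the reference strand.

    Returns the (possibly flipped) allele pair, or None when the strand
    cannot be resolved from the alleles alone.
    """
    study = tuple(a.upper() for a in study)
    reference = tuple(a.upper() for a in reference)
    if is_strand_ambiguous(*study):
        # A/T and C/G look identical on both strands, so the alleles
        # alone cannot tell us whether a flip is needed.
        return None
    if set(study) == set(reference):
        return study  # already on the reference strand
    flipped = flip_strand(*study)
    if set(flipped) == set(reference):
        return flipped  # study was reported on the opposite strand
    return None  # alleles do not match the reference on either strand


if __name__ == "__main__":
    print(harmonize(("A", "G"), ("T", "C")))
    print(harmonize(("A", "T"), ("A", "T")))
```

A pair that comes back as `None` would need additional information (for example allele frequencies or the array manifest) to resolve, which is one reason strand-ambiguous SNPs are often handled separately before imputation.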
format Online
Article
Text
id pubmed-4223508
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4223508 2014-11-08 GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies Sulovari, Arvis Li, Dawei BMC Genomics Software BioMed Central 2014-07-19 /pmc/articles/PMC4223508/ /pubmed/25038819 http://dx.doi.org/10.1186/1471-2164-15-610 Text en Copyright © 2014 Sulovari and Li; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4223508/
https://www.ncbi.nlm.nih.gov/pubmed/25038819
http://dx.doi.org/10.1186/1471-2164-15-610