GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies
BACKGROUND: Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the stat...
Main authors: | Sulovari, Arvis; Li, Dawei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4223508/ https://www.ncbi.nlm.nih.gov/pubmed/25038819 http://dx.doi.org/10.1186/1471-2164-15-610 |
_version_ | 1782343205580177408 |
---|---|
author | Sulovari, Arvis Li, Dawei |
author_sort | Sulovari, Arvis |
collection | PubMed |
description | BACKGROUND: Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires the same allele definition (nomenclature) and genome build across the individual studies. Imputation, which is commonly used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in the use of different genome builds and allele definitions. Incorrectly assuming identical allele definitions across combined GWAS leads to a large portion of genotypes being discarded or to incorrect association findings. There is no published tool that predicts and converts among all major allele definitions. RESULTS: In this study, we have developed a tool, GACT (Genome build and Allele definition Conversion Tool), that predicts and inter-converts between any of the common SNP allele definitions and between the major genome builds. In addition, we assessed several factors that may affect imputation quality. Our results indicated that including singletons in the reference had detrimental effects, while ambiguous SNPs had no measurable effect. Unexpectedly, excluding genotypes with a missing rate > 0.001 (40% of study SNPs) caused no significant decrease in imputation quality (quality was even significantly higher than that of imputation with singletons in the reference), especially for rare SNPs. CONCLUSION: GACT is a new, powerful, and user-friendly tool with both command-line and interactive online versions that can accurately predict and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of genotypes from SNP arrays or deep sequencing, particularly for data from dbGaP and other public databases. GACT SOFTWARE: http://www.uvm.edu/genomics/software/gact |
format | Online Article Text |
id | pubmed-4223508 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4223508 2014-11-08 BMC Genomics Software BioMed Central 2014-07-19 /pmc/articles/PMC4223508/ /pubmed/25038819 http://dx.doi.org/10.1186/1471-2164-15-610 Text en Copyright © 2014 Sulovari and Li; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4223508/ https://www.ncbi.nlm.nih.gov/pubmed/25038819 http://dx.doi.org/10.1186/1471-2164-15-610 |
work_keys_str_mv | AT sulovariarvis gactagenomebuildandalleledefinitionconversiontoolforsnpimputationandmetaanalysisingeneticassociationstudies AT lidawei gactagenomebuildandalleledefinitionconversiontoolforsnpimputationandmetaanalysisingeneticassociationstudies |
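
The abstract mentions strand-dependent allele definitions, ambiguous SNPs, and a genotype missing-rate cutoff of 0.001. The Python sketch below illustrates those three ideas in general terms. It is not GACT's implementation: the function names (`flip_strand`, `is_ambiguous`, `missing_rate`, `passes_missingness`) and the use of `None` to mark a missing genotype call are assumptions made only for this example.

```python
from typing import Dict, List, Optional, Tuple

# Watson-Crick base pairing, used to read an allele from the opposite strand.
COMPLEMENT: Dict[str, str] = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_strand(alleles: Tuple[str, str]) -> Tuple[str, str]:
    """Return the two SNP alleles as seen on the opposite DNA strand."""
    return (COMPLEMENT[alleles[0]], COMPLEMENT[alleles[1]])


def is_ambiguous(alleles: Tuple[str, str]) -> bool:
    """True for A/T and C/G SNPs.

    For these SNPs the allele pair is the same set on both strands,
    so the strand cannot be inferred from the alleles alone.
    """
    return COMPLEMENT[alleles[0]] == alleles[1]


def missing_rate(genotypes: List[Optional[str]]) -> float:
    """Fraction of samples without a genotype call (None = missing, an
    assumption of this sketch)."""
    if not genotypes:
        return 0.0
    return sum(g is None for g in genotypes) / len(genotypes)


def passes_missingness(genotypes: List[Optional[str]],
                       threshold: float = 0.001) -> bool:
    """Keep a SNP only if its missing rate does not exceed the threshold.

    The default of 0.001 mirrors the cutoff quoted in the abstract.
    """
    return missing_rate(genotypes) <= threshold


if __name__ == "__main__":
    print(flip_strand(("A", "G")))    # alleles on the opposite strand
    print(is_ambiguous(("A", "T")))   # strand-ambiguous SNP
    print(missing_rate(["AA", None, "AG", "GG"]))
```

Before merging studies, a pipeline built on these helpers would typically flag ambiguous SNPs for separate handling and apply `flip_strand` only to SNPs whose strand is known to differ between data sets.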