Re-Ranking Sequencing Variants in the Post-GWAS Era for Accurate Causal Variant Identification
Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an es...
Main authors: | Faye, Laura L.; Machiela, Mitchell J.; Kraft, Peter; Bull, Shelley B.; Sun, Lei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3738448/ https://www.ncbi.nlm.nih.gov/pubmed/23950724 http://dx.doi.org/10.1371/journal.pgen.1003609 |
_version_ | 1782476837044092928 |
---|---|
author | Faye, Laura L.; Machiela, Mitchell J.; Kraft, Peter; Bull, Shelley B.; Sun, Lei |
author_sort | Faye, Laura L. |
collection | PubMed |
description | Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an established region of association remains a challenge. Counter-intuitively, certain factors that increase power to detect an associated region can decrease power to localize the causal variant. First, combining GWAS with imputation or low coverage sequencing to achieve the large sample sizes required for high power can have the unintended effect of producing differential genotyping error among SNPs. This tends to bias the relative evidence for association toward better genotyped SNPs. Second, re-use of GWAS data for fine-mapping exploits previous findings to ensure genome-wide significance in GWAS-associated regions. However, using GWAS findings to inform fine-mapping analysis can bias evidence away from the causal SNP toward the tag SNP and SNPs in high LD with the tag. Together these factors can reduce power to localize the causal SNP by more than half. Other strategies commonly employed to increase power to detect association, namely increasing sample size and using higher density genotyping arrays, can, in certain common scenarios, actually exacerbate these effects and further decrease power to localize causal variants. We develop a re-ranking procedure that accounts for these adverse effects and substantially improves the accuracy of causal SNP identification, often doubling the probability that the causal SNP is top-ranked. Application to the NCI BPC3 aggressive prostate cancer GWAS with imputation meta-analysis identified a new top SNP at 2 of 3 associated loci and several additional possible causal SNPs at these loci that may have otherwise been overlooked. This method is simple to implement using R scripts provided on the author's website. |
format | Online Article Text |
id | pubmed-3738448 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3738448 2013-08-15 Re-Ranking Sequencing Variants in the Post-GWAS Era for Accurate Causal Variant Identification Faye, Laura L.; Machiela, Mitchell J.; Kraft, Peter; Bull, Shelley B.; Sun, Lei PLoS Genet Research Article Public Library of Science 2013-08-08 /pmc/articles/PMC3738448/ /pubmed/23950724 http://dx.doi.org/10.1371/journal.pgen.1003609 Text en © 2013 Faye et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Re-Ranking Sequencing Variants in the Post-GWAS Era for Accurate Causal Variant Identification |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3738448/ https://www.ncbi.nlm.nih.gov/pubmed/23950724 http://dx.doi.org/10.1371/journal.pgen.1003609 |
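
The description above states that differential genotyping error tends to shift the relative evidence for association toward better genotyped SNPs. The following is a minimal simulation sketch of that effect only; it is not the authors' re-ranking procedure, which this record does not describe in detail. Everything in it is an illustrative assumption: a quantitative trait, a single tag SNP built by copying the causal genotype with probability `ld_copy`, an error model that replaces the observed causal genotype with a random draw at rate `error_rate`, and all parameter values and function names (`prob_causal_top`, `draw_genotype`, `corr_sq`).

```python
import random


def draw_genotype(rng, maf):
    # Additive coding: number of minor alleles (0, 1 or 2).
    return (rng.random() < maf) + (rng.random() < maf)


def corr_sq(x, y):
    # Squared Pearson correlation, used here as a simple association score.
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def prob_causal_top(error_rate, n=500, reps=200, maf=0.3,
                    beta=0.3, ld_copy=0.9, seed=1):
    """Fraction of simulated studies in which the causal SNP outranks the tag.

    Hypothetical setup: the tag SNP is genotyped without error, while the
    causal SNP is observed with error at the given rate (for example,
    because it was imputed or sequenced at low coverage).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(reps):
        causal = [draw_genotype(rng, maf) for _ in range(n)]
        # Tag SNP in high LD: copies the causal genotype most of the time.
        tag = [g if rng.random() < ld_copy else draw_genotype(rng, maf)
               for g in causal]
        # The trait depends on the true causal genotype only.
        trait = [beta * g + rng.gauss(0, 1) for g in causal]
        # Observed causal genotype, degraded by genotyping error.
        causal_obs = [draw_genotype(rng, maf) if rng.random() < error_rate
                      else g for g in causal]
        if corr_sq(causal_obs, trait) > corr_sq(tag, trait):
            wins += 1
    return wins / reps


if __name__ == "__main__":
    for rate in (0.0, 0.1, 0.3):
        print(rate, prob_causal_top(rate))
```

Under these assumptions, the causal SNP should be top-ranked less often as `error_rate` grows, even though the trait is generated from the causal genotype alone. The sketch ranks by squared correlation for simplicity; a real analysis would use the study's own association test.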