Assessing the Probability that a Finding Is Genuine for Large-Scale Genetic Association Studies

Genetic association studies routinely involve massive numbers of statistical tests accompanied by P-values. Whole genome sequencing technologies increased the potential number of tested variants to tens of millions. The more tests are performed, the smaller P-value is required to be deemed significa...

Bibliographic Details
Main authors: Kuo, Chia-Ling, Vsevolozhskaya, Olga A., Zaykin, Dmitri V.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4425705/
https://www.ncbi.nlm.nih.gov/pubmed/25955023
http://dx.doi.org/10.1371/journal.pone.0124107
author Kuo, Chia-Ling
Vsevolozhskaya, Olga A.
Zaykin, Dmitri V.
collection PubMed
description Genetic association studies routinely involve massive numbers of statistical tests accompanied by P-values. Whole genome sequencing technologies increased the potential number of tested variants to tens of millions. The more tests are performed, the smaller the P-value required for a result to be deemed significant. However, a small P-value is not equivalent to a small chance of a spurious finding, and significance thresholds may fail to serve as efficient filters against false results. While the Bayesian approach can provide a direct assessment of the probability that a finding is spurious, its adoption in association studies has been slow, due in part to the ubiquity of P-values and the automated way they are, as a rule, produced by software packages. Attempts to design simple ways to convert an association P-value into the probability that a finding is spurious have been met with difficulties. The False Positive Report Probability (FPRP) method has gained increasing popularity. However, FPRP is not designed to estimate the probability for a particular finding, because it is defined for an entire region of hypothetical findings with P-values at least as small as the one observed for that finding. Here we propose a method that lets researchers extract the probability that a finding is spurious directly from a P-value. Considering the counterpart of that probability, we term this method POFIG: the Probability that a Finding is Genuine. Our approach shares FPRP's simplicity, but gives a valid probability that a finding is spurious given a P-value. In addition to straightforward interpretation, POFIG has desirable statistical properties. The POFIG average across a set of tentative associations provides an estimated proportion of false discoveries in that set. POFIGs are easily combined across studies and are immune to multiple testing and selection bias. We illustrate an application of the POFIG method via an analysis of GWAS associations with Crohn's disease.
format Online
Article
Text
id pubmed-4425705
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4425705 2015-05-21 PLoS One Research Article Public Library of Science 2015-05-08 /pmc/articles/PMC4425705/ /pubmed/25955023 http://dx.doi.org/10.1371/journal.pone.0124107 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
title Assessing the Probability that a Finding Is Genuine for Large-Scale Genetic Association Studies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4425705/
https://www.ncbi.nlm.nih.gov/pubmed/25955023
http://dx.doi.org/10.1371/journal.pone.0124107