
A modified generalized Fisher method for combining probabilities from dependent tests

Rapid developments in molecular technology have yielded a large amount of high throughput genetic data to understand the mechanism for complex traits. The increase of genetic variants requires hundreds and thousands of statistical tests to be performed simultaneously in analysis, which poses a chall...


Bibliographic Details
Main Authors:	Dai, Hongying; Leeder, J. Steven; Cui, Yuehua
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2014
Subjects:	Genetics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3929847/ https://www.ncbi.nlm.nih.gov/pubmed/24600471 http://dx.doi.org/10.3389/fgene.2014.00032

collection	PubMed
description	Rapid developments in molecular technology have yielded a large amount of high throughput genetic data to understand the mechanism for complex traits. The increase of genetic variants requires hundreds and thousands of statistical tests to be performed simultaneously in analysis, which poses a challenge to control the overall Type I error rate. Combining p-values from multiple hypothesis testing has shown promise for aggregating effects in high-dimensional genetic data analysis. Several p-value combining methods have been developed and applied to genetic data; see Dai et al. (2012b) for a comprehensive review. However, there is a lack of investigations conducted for dependent genetic data, especially for weighted p-value combining methods. Single nucleotide polymorphisms (SNPs) are often correlated due to linkage disequilibrium (LD). Other genetic data, including variants from next generation sequencing, gene expression levels measured by microarray, protein and DNA methylation data, etc. also contain complex correlation structures. Ignoring correlation structures among genetic variants may lead to severe inflation of Type I error rates for omnibus testing of p-values. In this work, we propose modifications to the Lancaster procedure by taking the correlation structure among p-values into account. The weight function in the Lancaster procedure allows meaningful biological information to be incorporated into the statistical analysis, which can increase the power of the statistical testing and/or remove the bias in the process. Extensive empirical assessments demonstrate that the modified Lancaster procedure largely reduces the Type I error rates due to correlation among p-values, and retains considerable power to detect signals among p-values. We applied our method to reassess published renal transplant data, and identified a novel association between B cell pathways and allograft tolerance.
id	pubmed-3929847
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
journal	Front Genet
published	2014-02-20
rights	Copyright © 2014 Dai, Leeder and Cui. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
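The abstract refers to the Lancaster procedure, a weighted generalization of Fisher's method in which each p-value p_i is mapped to the upper quantile of a chi-square distribution with w_i degrees of freedom and the results are summed. The following is a minimal sketch of that idea in Python using only the standard library. The dependence adjustment shown is a generic Satterthwaite-type moment match (rescaling the statistic so its mean and variance agree with a scaled chi-square); whether it matches the article's exact correction is an assumption. The function names and the `cov_sum` parameter are hypothetical, and the caller is assumed to supply an estimate of the summed pairwise covariances of the transformed statistics.

```python
import math


def chi2_sf(x, df):
    """Upper tail P(X > x) of a chi-square with df > 0 degrees of freedom,
    computed as the regularized upper incomplete gamma Q(df/2, x/2)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    log_prefix = a * math.log(z) - z - math.lgamma(a)
    if z < a + 1.0:
        # Power series for the lower tail P(a, z); return its complement.
        term = total = 1.0 / a
        n = a
        for _ in range(1000):
            n += 1.0
            term *= z / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return max(0.0, 1.0 - total * math.exp(log_prefix))
    # Continued fraction for Q(a, z), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_prefix)


def chi2_isf(p, df):
    """Inverse of chi2_sf by bisection: the x with P(X > x) = p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > p:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def lancaster(pvalues, weights, cov_sum=0.0):
    """Combine p-values with Lancaster's weighted statistic.

    cov_sum (hypothetical parameter) is the sum over i < j of
    cov(T_i, T_j) for the transformed statistics; 0 means independence.
    """
    t = sum(chi2_isf(p, w) for p, w in zip(pvalues, weights))
    mean = float(sum(weights))            # E(T), since each T_i ~ chi2(w_i)
    var = 2.0 * mean + 2.0 * cov_sum      # Var(T) including covariances
    scale = var / (2.0 * mean)            # moment-matched scale factor
    df = 2.0 * mean ** 2 / var            # moment-matched degrees of freedom
    return chi2_sf(t / scale, df)


if __name__ == "__main__":
    ps = [0.05, 0.05]
    print(lancaster(ps, [2, 2]))               # all weights 2: Fisher's method
    print(lancaster(ps, [2, 2], cov_sum=2.0))  # positive dependence assumed
```

With every weight set to 2 and `cov_sum=0` the statistic reduces to Fisher's method, so the combined p-value for two inputs equals p1·p2·(1 − ln(p1·p2)). A positive `cov_sum` inflates the assumed variance and yields a more conservative combined p-value, which is the qualitative behavior the abstract attributes to accounting for correlation.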
