An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies

Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker—single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using t...

Bibliographic Details
Main Authors: Dai, Hongying; Wu, Guodong; Wu, Michael; Zhi, Degui
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4933358/
https://www.ncbi.nlm.nih.gov/pubmed/27380176
http://dx.doi.org/10.1371/journal.pone.0152667
author Dai, Hongying
Wu, Guodong
Wu, Michael
Zhi, Degui
collection PubMed
description Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker–single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using the Sequence Kernel Association Test (SKAT). The second stage is at the pathway level, where we perform a correlated Lancaster procedure to detect joint effects from multiple genes within a pathway. We show that the Lancaster procedure is optimal in Bahadur efficiency among all combined p-value methods. The Bahadur efficiency, [Image: see text], compares sample sizes among different statistical tests when signals become sparse in sequencing data, i.e. ε → 0. The optimal Bahadur efficiency ensures that the Lancaster procedure asymptotically requires a minimal sample size to detect sparse signals ([Image: see text]). The Lancaster procedure can also be applied to meta-analysis. Extensive empirical assessments of exome sequencing data show that the proposed method outperforms Gene Set Enrichment Analysis (GSEA). We applied the competitive Lancaster procedure to meta-analysis data generated by the Global Lipids Genetics Consortium to identify pathways significantly associated with high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, and total cholesterol.
format Online
Article
Text
id pubmed-4933358
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4933358 2016-07-18 PLoS One Research Article
Public Library of Science 2016-07-05 /pmc/articles/PMC4933358/ /pubmed/27380176 http://dx.doi.org/10.1371/journal.pone.0152667 Text en © 2016 Dai et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies
topic Research Article