An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies
Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker—single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using t...
Main authors: | Dai, Hongying; Wu, Guodong; Wu, Michael; Zhi, Degui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4933358/ https://www.ncbi.nlm.nih.gov/pubmed/27380176 http://dx.doi.org/10.1371/journal.pone.0152667 |
_version_ | 1782441161359622144 |
---|---|
author | Dai, Hongying Wu, Guodong Wu, Michael Zhi, Degui |
author_facet | Dai, Hongying Wu, Guodong Wu, Michael Zhi, Degui |
author_sort | Dai, Hongying |
collection | PubMed |
description | Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker–single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using the Sequence Kernel Association Test (SKAT). The second stage is at the pathway level, where we perform a correlated Lancaster procedure to detect joint effects from multiple genes within a pathway. We show that the Lancaster procedure is optimal in Bahadur efficiency among all combined p-value methods. The Bahadur efficiency, [Image: see text], compares sample sizes among different statistical tests when signals become sparse in sequencing data, i.e., ε → 0. The optimal Bahadur efficiency ensures that the Lancaster procedure asymptotically requires a minimal sample size to detect sparse signals ([Image: see text]). The Lancaster procedure can also be applied to meta-analysis. Extensive empirical assessments of exome sequencing data show that the proposed method outperforms Gene Set Enrichment Analysis (GSEA). We applied the competitive Lancaster procedure to meta-analysis data generated by the Global Lipids Genetics Consortium to identify pathways significantly associated with high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, and total cholesterol. |
format | Online Article Text |
id | pubmed-4933358 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4933358 2016-07-18 An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies Dai, Hongying Wu, Guodong Wu, Michael Zhi, Degui PLoS One Research Article Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker–single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using the Sequence Kernel Association Test (SKAT). The second stage is at the pathway level, where we perform a correlated Lancaster procedure to detect joint effects from multiple genes within a pathway. We show that the Lancaster procedure is optimal in Bahadur efficiency among all combined p-value methods. The Bahadur efficiency, [Image: see text], compares sample sizes among different statistical tests when signals become sparse in sequencing data, i.e., ε → 0. The optimal Bahadur efficiency ensures that the Lancaster procedure asymptotically requires a minimal sample size to detect sparse signals ([Image: see text]). The Lancaster procedure can also be applied to meta-analysis. Extensive empirical assessments of exome sequencing data show that the proposed method outperforms Gene Set Enrichment Analysis (GSEA). We applied the competitive Lancaster procedure to meta-analysis data generated by the Global Lipids Genetics Consortium to identify pathways significantly associated with high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, and total cholesterol.
Public Library of Science 2016-07-05 /pmc/articles/PMC4933358/ /pubmed/27380176 http://dx.doi.org/10.1371/journal.pone.0152667 Text en © 2016 Dai et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Dai, Hongying Wu, Guodong Wu, Michael Zhi, Degui An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies |
title | An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies |
title_sort | optimal bahadur-efficient method in detection of sparse signals with applications to pathway analysis in sequencing association studies |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4933358/ https://www.ncbi.nlm.nih.gov/pubmed/27380176 http://dx.doi.org/10.1371/journal.pone.0152667 |
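
The description field above mentions a second, pathway-level stage that combines gene-level p-values with a Lancaster procedure. As a rough illustration only, the sketch below implements the classical Lancaster combination for independent p-values: each p-value is mapped to a chi-square quantile with its own degrees of freedom, and the sum is referred to a chi-square distribution with the total degrees of freedom. It is not the correlated Lancaster procedure the record describes, and it does not compute SKAT p-values, which are assumed to be given. The function names, the example p-values, and the restriction to even integer degrees of freedom (chosen so that only the Python standard library is needed) are assumptions made for this sketch.

```python
import math


def chi2_sf_even(x: float, df: int) -> float:
    """Survival function P(X > x) of a chi-square variable with even df.

    For df = 2k this has the closed form exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_isf_even(p: float, df: int) -> float:
    """Find x such that P(X > x) = p, by bisection on chi2_sf_even."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, df) > p:  # grow the bracket until it contains x
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def lancaster_combine(p_values, dfs):
    """Combine independent p-values with Lancaster's weighted procedure.

    dfs holds the (hypothetical) per-gene weights, used as chi-square
    degrees of freedom. With every weight equal to 2 this reduces to
    Fisher's method.
    """
    if not p_values or len(p_values) != len(dfs):
        raise ValueError("p_values and dfs must be non-empty and equal length")
    statistic = sum(chi2_isf_even(p, w) for p, w in zip(p_values, dfs))
    return chi2_sf_even(statistic, sum(dfs))


if __name__ == "__main__":
    # Made-up gene-level p-values and weights, for illustration only.
    print(lancaster_combine([0.01, 0.20, 0.45], [4, 2, 2]))
```

Giving a gene a larger weight lets its p-value contribute more to the pathway statistic. How the weights are chosen, and how correlation between genes is handled, are left to the original article.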