rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data
BACKGROUND: Evaluating the significance of a group of genes or proteins in a pathway or biological process for a disease can help researchers understand the mechanism of the disease. For example, identifying related pathways or gene functions for chromatin states of tumor-specific T cells will he...
Main authors: | Cai, Menglan, Li, Limin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311921/ https://www.ncbi.nlm.nih.gov/pubmed/30598086 http://dx.doi.org/10.1186/s12918-018-0661-z |
_version_ | 1783383701841772544 |
---|---|
author | Cai, Menglan Li, Limin |
author_facet | Cai, Menglan Li, Limin |
author_sort | Cai, Menglan |
collection | PubMed |
description | BACKGROUND: Evaluating the significance of a group of genes or proteins in a pathway or biological process for a disease can help researchers understand the mechanism of the disease. For example, identifying related pathways or gene functions for chromatin states of tumor-specific T cells will help determine whether T cells can be reprogrammed, and further help design the cancer treatment strategy. Some existing p-value combination methods can be used in this scenario. However, each of these methods has its own disadvantages, so designing a more powerful and robust statistical method remains challenging. RESULTS: The existing group combined p-value (GCP) method first partitions the p-values into several groups using a set of truncation points, but it is often sensitive to the choice of these truncation points. Another method, the adaptive rank truncated product (ARTP) method, uses multiple truncation integers to adaptively combine the smallest p-values, but it loses statistical power because it ignores the larger p-values. To tackle these problems, we propose a robust p-value combination method (rPCMP) that considers multiple partitions of the p-values with different sets of truncation points. The proposed rPCMP statistic has a three-layer hierarchical structure. The inner layer considers a statistic that combines the p-values in a specified interval defined by two threshold points, the intermediate layer uses a GCP statistic that optimizes the inner-layer statistic over a partition set of threshold points, and the outer layer integrates the GCP statistics from multiple partitions of the p-values. The empirical distribution of the statistic under the null hypothesis can be estimated by a permutation procedure. CONCLUSIONS: Our proposed rPCMP method has been shown to be more robust and to have higher statistical power. 
Simulation studies show that our method effectively controls the type I error rate and has higher statistical power than the existing methods. Finally, we apply our rPCMP method to an ATAC-seq dataset to discover gene functions related to chromatin states in mouse tumor T cells. |
format | Online Article Text |
id | pubmed-6311921 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-63119212019-01-07 rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data Cai, Menglan Li, Limin BMC Syst Biol Research BACKGROUND: Evaluating the significance of a group of genes or proteins in a pathway or biological process for a disease can help researchers understand the mechanism of the disease. For example, identifying related pathways or gene functions for chromatin states of tumor-specific T cells will help determine whether T cells can be reprogrammed, and further help design the cancer treatment strategy. Some existing p-value combination methods can be used in this scenario. However, each of these methods has its own disadvantages, so designing a more powerful and robust statistical method remains challenging. RESULTS: The existing group combined p-value (GCP) method first partitions the p-values into several groups using a set of truncation points, but it is often sensitive to the choice of these truncation points. Another method, the adaptive rank truncated product (ARTP) method, uses multiple truncation integers to adaptively combine the smallest p-values, but it loses statistical power because it ignores the larger p-values. To tackle these problems, we propose a robust p-value combination method (rPCMP) that considers multiple partitions of the p-values with different sets of truncation points. The proposed rPCMP statistic has a three-layer hierarchical structure. The inner layer considers a statistic that combines the p-values in a specified interval defined by two threshold points, the intermediate layer uses a GCP statistic that optimizes the inner-layer statistic over a partition set of threshold points, and the outer layer integrates the GCP statistics from multiple partitions of the p-values. The empirical distribution of the statistic under the null hypothesis can be estimated by a permutation procedure. 
CONCLUSIONS: Our proposed rPCMP method has been shown to be more robust and to have higher statistical power. Simulation studies show that our method effectively controls the type I error rate and has higher statistical power than the existing methods. Finally, we apply our rPCMP method to an ATAC-seq dataset to discover gene functions related to chromatin states in mouse tumor T cells. BioMed Central 2018-12-31 /pmc/articles/PMC6311921/ /pubmed/30598086 http://dx.doi.org/10.1186/s12918-018-0661-z Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Cai, Menglan Li, Limin rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data |
title | rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data |
title_full | rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data |
title_fullStr | rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data |
title_full_unstemmed | rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data |
title_short | rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data |
title_sort | rpcmp: robust p-value combination by multiple partitions with applications to atac-seq data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311921/ https://www.ncbi.nlm.nih.gov/pubmed/30598086 http://dx.doi.org/10.1186/s12918-018-0661-z |
work_keys_str_mv | AT caimenglan rpcmprobustpvaluecombinationbymultiplepartitionswithapplicationstoatacseqdata AT lilimin rpcmprobustpvaluecombinationbymultiplepartitionswithapplicationstoatacseqdata |
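
The three-layer structure described in the abstract (an interval statistic, an optimum over one partition, and an integration over several partitions, calibrated against a null distribution) might be sketched roughly as follows. This is an illustration only, not the authors' implementation. The record does not give the formulas, so the following are assumptions: the inner statistic is taken to be a Fisher-style sum of -log(p) over an interval, the "optimization" and "integration" steps are taken to be minima of empirical p-values, and independent uniform draws stand in for the permutation procedure because no raw data are available here. The names `interval_stat`, `upper_tail_p`, `lower_tail_p`, and `rpcmp_sketch` are hypothetical.

```python
import numpy as np


def interval_stat(P, lo, hi):
    """Inner layer (ASSUMED form): per-row sum of -log(p) over p-values in (lo, hi]."""
    mask = (P > lo) & (P <= hi)
    return np.where(mask, -np.log(P), 0.0).sum(axis=1)


def upper_tail_p(values):
    """Empirical p-value of each entry: share of entries that are >= it."""
    ordered = np.sort(values)
    n_ge = values.size - np.searchsorted(ordered, values, side="left")
    return n_ge / values.size


def lower_tail_p(values):
    """Empirical p-value of each entry: share of entries that are <= it."""
    ordered = np.sort(values)
    return np.searchsorted(ordered, values, side="right") / values.size


def rpcmp_sketch(p_obs, partitions, n_null=2000, seed=0):
    """Combine p-values in (0, 1] over several partitions; returns one p-value.

    `partitions` is a list of increasing cut-point lists, e.g. [[0.01, 0.05]].
    """
    rng = np.random.default_rng(seed)
    p_obs = np.asarray(p_obs, dtype=float)
    # ASSUMPTION: uniform null draws in (0, 1] replace the permutation procedure.
    null = 1.0 - rng.random((n_null, p_obs.size))
    P = np.vstack([p_obs, null])  # row 0 is the observed data

    per_partition = []
    for cuts in partitions:
        edges = [0.0, *cuts, 1.0]
        # Inner layer: one calibrated statistic per interval of this partition.
        inner = np.column_stack([
            upper_tail_p(interval_stat(P, lo, hi))
            for lo, hi in zip(edges[:-1], edges[1:])
        ])
        # Intermediate layer (ASSUMED): best interval, re-calibrated on all rows.
        per_partition.append(lower_tail_p(inner.min(axis=1)))

    # Outer layer (ASSUMED): best partition, re-calibrated once more.
    outer = np.column_stack(per_partition).min(axis=1)
    return float(lower_tail_p(outer)[0])
```

A call such as `rpcmp_sketch(p_values, [[0.01, 0.05], [0.001, 0.1]])` would return a single combined p-value for a gene set. Re-calibrating after each minimum against the same null rows follows the general pattern of adaptive truncation methods, so that taking a minimum over intervals or partitions does not by itself inflate the type I error rate.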