
rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data

BACKGROUND: Evaluating the significance for a group of genes or proteins in a pathway or biological process for a disease could help researchers understand the mechanism of the disease. For example, identifying related pathways or gene functions for chromatin states of tumor-specific T cells will he...


Bibliographic Details
Main authors: Cai, Menglan, Li, Limin
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311921/
https://www.ncbi.nlm.nih.gov/pubmed/30598086
http://dx.doi.org/10.1186/s12918-018-0661-z
_version_ 1783383701841772544
author Cai, Menglan
Li, Limin
author_facet Cai, Menglan
Li, Limin
author_sort Cai, Menglan
collection PubMed
description BACKGROUND: Evaluating the significance of a group of genes or proteins in a pathway or biological process for a disease could help researchers understand the mechanism of the disease. For example, identifying related pathways or gene functions for chromatin states of tumor-specific T cells will help determine whether T cells could reprogram or not, and further help design the cancer treatment strategy. Some existing p-value combination methods can be used in this scenario. However, these methods suffer from different disadvantages, and thus it is still challenging to design a more powerful and robust statistical method. RESULTS: The existing Group combined p-value (GCP) method first partitions p-values into several groups using a set of truncation points, but the method is often sensitive to these truncation points. Another method, the adaptive rank truncated product (ARTP) method, makes use of multiple truncation integers to adaptively combine the smallest p-values, but it loses statistical power since it ignores the larger p-values. To tackle these problems, we propose a robust p-value combination method (rPCMP) that considers multiple partitions of p-values with different sets of truncation points. The proposed rPCMP statistic has a three-layer hierarchical structure. The inner layer considers a statistic which combines p-values in a specified interval defined by two threshold points, the intermediate layer uses a GCP statistic which optimizes the statistic from the inner layer for a partition set of threshold points, and the outer layer integrates the GCP statistic from multiple partitions of p-values. The empirical distribution of the statistic under the null hypothesis can be estimated by a permutation procedure. CONCLUSIONS: Our proposed rPCMP method has been shown to be more robust and to have higher statistical power. 
A simulation study shows that our method can effectively control the type I error rate and has higher statistical power than the existing methods. We finally apply our rPCMP method to an ATAC-seq dataset to discover the gene functions related to chromatin states in mouse tumor T cells.
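The three-layer structure and the permutation step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact statistics, so the inner statistic (a Fisher-style sum of -log p over an interval), the use of a nested minimum of empirical p-values at the intermediate and outer layers, and all names (`interval_stat`, `empirical_p`, `min_empirical_p`, `layered_combination`, `null_p`, `partitions`) are assumptions made for illustration. The null p-values are taken as an input matrix, one row per permutation.

```python
import numpy as np


def interval_stat(p, lo, hi):
    """Inner layer (assumed form): per row, sum -log(p) over p-values in (lo, hi]."""
    mask = (p > lo) & (p <= hi)
    return np.where(mask, -np.log(p), 0.0).sum(axis=1)


def empirical_p(stats):
    """Per row, the fraction of rows whose statistic is at least as large."""
    return (stats[None, :] >= stats[:, None]).mean(axis=1)


def min_empirical_p(stat_list):
    """Calibrate each statistic against all rows, then take the row-wise minimum."""
    return np.min([empirical_p(s) for s in stat_list], axis=0)


def layered_combination(obs_p, null_p, partitions):
    """Combined p-value for obs_p (length m) given null_p (B x m, one row per permutation).

    partitions is a list of increasing cut-point tuples inside (0, 1); each tuple
    splits (0, 1] into consecutive intervals. All p-values are assumed to be > 0.
    """
    all_p = np.vstack([obs_p, null_p])  # row 0 is the observed data
    middle = []
    for cuts in partitions:
        edges = [0.0, *cuts, 1.0]
        inner = [interval_stat(all_p, lo, hi)
                 for lo, hi in zip(edges[:-1], edges[1:])]
        # Intermediate layer (assumed): best interval within this partition.
        # Negated so that a larger value means stronger evidence.
        middle.append(-min_empirical_p(inner))
    # Outer layer (assumed): best calibrated partition, again negated.
    outer = -min_empirical_p(middle)
    # Final p-value: rank of the observed row among all rows.
    return float(empirical_p(outer)[0])
```

Because the observed row and the permutation rows pass through exactly the same nested minimum, the final rank-based p-value accounts for the selection over intervals and partitions. Its smallest attainable value is 1/(B+1), so the number of permutations B limits the resolution.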
format Online
Article
Text
id pubmed-6311921
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-63119212019-01-07 rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data Cai, Menglan Li, Limin BMC Syst Biol Research BACKGROUND: Evaluating the significance of a group of genes or proteins in a pathway or biological process for a disease could help researchers understand the mechanism of the disease. For example, identifying related pathways or gene functions for chromatin states of tumor-specific T cells will help determine whether T cells could reprogram or not, and further help design the cancer treatment strategy. Some existing p-value combination methods can be used in this scenario. However, these methods suffer from different disadvantages, and thus it is still challenging to design a more powerful and robust statistical method. RESULTS: The existing Group combined p-value (GCP) method first partitions p-values into several groups using a set of truncation points, but the method is often sensitive to these truncation points. Another method, the adaptive rank truncated product (ARTP) method, makes use of multiple truncation integers to adaptively combine the smallest p-values, but it loses statistical power since it ignores the larger p-values. To tackle these problems, we propose a robust p-value combination method (rPCMP) that considers multiple partitions of p-values with different sets of truncation points. The proposed rPCMP statistic has a three-layer hierarchical structure. The inner layer considers a statistic which combines p-values in a specified interval defined by two threshold points, the intermediate layer uses a GCP statistic which optimizes the statistic from the inner layer for a partition set of threshold points, and the outer layer integrates the GCP statistic from multiple partitions of p-values. The empirical distribution of the statistic under the null hypothesis can be estimated by a permutation procedure. 
CONCLUSIONS: Our proposed rPCMP method has been shown to be more robust and to have higher statistical power. A simulation study shows that our method can effectively control the type I error rate and has higher statistical power than the existing methods. We finally apply our rPCMP method to an ATAC-seq dataset to discover the gene functions related to chromatin states in mouse tumor T cells. BioMed Central 2018-12-31 /pmc/articles/PMC6311921/ /pubmed/30598086 http://dx.doi.org/10.1186/s12918-018-0661-z Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Cai, Menglan
Li, Limin
rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data
title rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data
title_full rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data
title_fullStr rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data
title_full_unstemmed rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data
title_short rPCMP: robust p-value combination by multiple partitions with applications to ATAC-seq data
title_sort rpcmp: robust p-value combination by multiple partitions with applications to atac-seq data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311921/
https://www.ncbi.nlm.nih.gov/pubmed/30598086
http://dx.doi.org/10.1186/s12918-018-0661-z
work_keys_str_mv AT caimenglan rpcmprobustpvaluecombinationbymultiplepartitionswithapplicationstoatacseqdata
AT lilimin rpcmprobustpvaluecombinationbymultiplepartitionswithapplicationstoatacseqdata