
Testing Proximity of Genomic Regions to Transcription Start Sites and Enhancers Complements Gene Set Enrichment Testing

Large sets of genomic regions are generated by the initial analysis of various genome-wide sequencing data, such as ChIP-seq and ATAC-seq experiments. Gene set enrichment (GSE) methods are commonly employed to determine the pathways associated with them. Given the pathways and other gene sets (e.g.,...


Bibliographic Details
Main Authors: Lee, Christopher; Wang, Kai; Qin, Tingting; Sartor, Maureen A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7069355/
https://www.ncbi.nlm.nih.gov/pubmed/32211031
http://dx.doi.org/10.3389/fgene.2020.00199
author Lee, Christopher
Wang, Kai
Qin, Tingting
Sartor, Maureen A.
collection PubMed
description Large sets of genomic regions are generated by the initial analysis of various genome-wide sequencing data, such as ChIP-seq and ATAC-seq experiments. Gene set enrichment (GSE) methods are commonly employed to determine the pathways associated with them. Given the pathways and other gene sets (e.g., GO terms) of significance, it is of great interest to know the extent to which each is driven by binding near transcription start sites (TSS) or near enhancers. Currently, no tool performs such an analysis. Here, we present a method that addresses this question to complement GSE methods for genomic regions. Specifically, the new method tests whether the genomic regions in a gene set are significantly closer to a TSS (or to an enhancer) than expected by chance given the total list of genomic regions, using a non-parametric test. Combining the results from a GSE test with our novel method provides additional information regarding the mode of regulation of each pathway, and additional evidence that the pathway is truly enriched. We illustrate our new method with a large set of ENCODE ChIP-seq data, using the chipenrich Bioconductor package. The results show that our method is a powerful complementary approach to help researchers interpret large sets of genomic regions.
format Online
Article
Text
id pubmed-7069355
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7069355 2020-03-24 Testing Proximity of Genomic Regions to Transcription Start Sites and Enhancers Complements Gene Set Enrichment Testing Lee, Christopher; Wang, Kai; Qin, Tingting; Sartor, Maureen A. Front Genet Genetics Frontiers Media S.A. 2020-03-06 /pmc/articles/PMC7069355/ /pubmed/32211031 http://dx.doi.org/10.3389/fgene.2020.00199 Text en Copyright © 2020 Lee, Wang, Qin and Sartor. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Testing Proximity of Genomic Regions to Transcription Start Sites and Enhancers Complements Gene Set Enrichment Testing
topic Genetics
work_keys_str_mv AT leechristopher testingproximityofgenomicregionstotranscriptionstartsitesandenhancerscomplementsgenesetenrichmenttesting
AT wangkai testingproximityofgenomicregionstotranscriptionstartsitesandenhancerscomplementsgenesetenrichmenttesting
AT qintingting testingproximityofgenomicregionstotranscriptionstartsitesandenhancerscomplementsgenesetenrichmenttesting
AT sartormaureena testingproximityofgenomicregionstotranscriptionstartsitesandenhancerscomplementsgenesetenrichmenttesting
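
The abstract describes testing whether the genomic regions in a gene set lie closer to a TSS than expected by chance given the total list of regions, using a non-parametric test. This record does not name the test, so the sketch below substitutes a simple one-sided permutation test on the median distance. It is not the chipenrich implementation. The function names, the use of region midpoints, the single-chromosome coordinates, and the toy data are all assumptions made for illustration.

```python
import bisect
import random
from statistics import median


def nearest_distance(position, sorted_sites):
    """Absolute distance from a position to the nearest site.

    sorted_sites must be a non-empty, ascending list of coordinates
    (hypothetical TSS positions on one chromosome).
    """
    idx = bisect.bisect_left(sorted_sites, position)
    candidates = []
    if idx < len(sorted_sites):
        candidates.append(sorted_sites[idx] - position)
    if idx > 0:
        candidates.append(position - sorted_sites[idx - 1])
    return min(candidates)


def proximity_permutation_test(all_distances, in_set, n_perm=2000, seed=0):
    """One-sided permutation test (an assumed stand-in for the paper's test).

    Asks whether regions flagged as belonging to the gene set have a smaller
    median distance than random subsets of the same size drawn from the
    total list of regions.
    """
    rng = random.Random(seed)
    set_distances = [d for d, flag in zip(all_distances, in_set) if flag]
    k = len(set_distances)
    observed_median = median(set_distances)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_distances, k)
        if median(sample) <= observed_median:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed_median, (hits + 1) / (n_perm + 1)


# Toy data (entirely made up): three TSS positions and 20 region midpoints.
tss = [1000, 50000, 100000]
set_midpoints = [1010, 49980, 100030, 960, 50050]          # near a TSS
background_midpoints = [10000 + 1000 * i for i in range(15)]  # far from any TSS

all_midpoints = set_midpoints + background_midpoints
in_set = [True] * len(set_midpoints) + [False] * len(background_midpoints)
distances = [nearest_distance(m, tss) for m in all_midpoints]

observed, p_value = proximity_permutation_test(distances, in_set)
print(f"median distance in set: {observed}, permutation p-value: {p_value:.4f}")
```

The same function could be pointed at enhancer coordinates instead of TSS positions by passing a different sorted site list to nearest_distance. A rank-based test would be another reasonable non-parametric choice; the permutation version is used here only because it needs nothing beyond the standard library.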