
regSNPs-ASB: A Computational Framework for Identifying Allele-Specific Transcription Factor Binding From ATAC-seq Data

Bibliographic Details
Main Authors: Xu, Siwen; Feng, Weixing; Lu, Zixiao; Yu, Christina Y.; Shao, Wei; Nakshatri, Harikrishna; Reiter, Jill L.; Gao, Hongyu; Chu, Xiaona; Wang, Yue; Liu, Yunlong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Bioengineering and Biotechnology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7405637/
https://www.ncbi.nlm.nih.gov/pubmed/32850739
http://dx.doi.org/10.3389/fbioe.2020.00886
author Xu, Siwen
Feng, Weixing
Lu, Zixiao
Yu, Christina Y.
Shao, Wei
Nakshatri, Harikrishna
Reiter, Jill L.
Gao, Hongyu
Chu, Xiaona
Wang, Yue
Liu, Yunlong
author_sort Xu, Siwen
collection PubMed
description Expression quantitative trait loci (eQTL) analysis is useful for identifying genetic variants correlated with gene expression; however, it cannot distinguish between causal variants and nearby non-functional variants. Because the majority of disease-associated SNPs are located in regulatory regions, they can impact allele-specific binding (ASB) of transcription factors and result in differential expression of the target gene alleles. In this study, our aim was to identify functional single-nucleotide polymorphisms (SNPs) that alter transcriptional regulation and thus potentially impact cellular function. Here, we present regSNPs-ASB, a generalized linear model-based approach to identify regulatory SNPs that are located in transcription factor binding sites. The input for this model includes ATAC-seq (assay for transposase-accessible chromatin with high-throughput sequencing) raw read counts from heterozygous loci, where differential transposase-cleavage patterns between the two alleles indicate preferential transcription factor binding to one of the alleles. Using regSNPs-ASB, we identified 53 regulatory SNPs in human MCF-7 breast cancer cells and 125 regulatory SNPs in human mesenchymal stem cells (MSC). By integrating the regSNPs-ASB output with RNA-seq experimental data and publicly available chromatin interaction data from MCF-7 cells, we found that these 53 regulatory SNPs were associated with 74 potential target genes and that 32 (43%) of these genes showed significant allele-specific expression. By comparing all of the MCF-7 and MSC regulatory SNPs to the eQTLs in the Genotype-Tissue Expression (GTEx) Project database, we found that 30% (16/53) of the regulatory SNPs in MCF-7 and 43% (52/122) of the regulatory SNPs in MSC were also in eQTL regions. The enrichment of regulatory SNPs in eQTLs indicated that many of them are likely responsible for allelic differences in gene expression (chi-square test, p-value < 0.01). In summary, we conclude that regSNPs-ASB is a useful tool for identifying causal variants from ATAC-seq data. This new computational tool will enable efficient prioritization of genetic variants identified as eQTLs for further studies to validate their causal regulatory function. Ultimately, identifying causal genetic variants will further our understanding of the underlying molecular mechanisms of disease and support the eventual development of potential therapeutic targets.
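Illustrative note: the description above names two statistical ideas, allelic imbalance in read counts at heterozygous loci and a chi-square test for eQTL enrichment, but this record gives neither the form of the generalized linear model nor the background counts used for the test. The Python sketch below is therefore not the regSNPs-ASB method. It swaps in an exact binomial test as a much simpler stand-in for the allelic comparison and adds a plain 2x2 chi-square statistic. The function names (`binom_two_sided_p`, `chi2_2x2`) and every count in the usage lines are hypothetical.

```python
from math import comb, erfc, sqrt


def binom_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    Simplified stand-in, NOT the GLM described in the abstract.
    Intended for modest read depths (the float conversion of comb(n, k)
    overflows for very deep sites).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    tol = observed * (1 + 1e-9)
    return min(1.0, sum(p for p in pmf if p <= tol))


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square statistic and p-value (1 d.f., no continuity
    correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


if __name__ == "__main__":
    # Hypothetical heterozygous site: 30 reference reads vs 10 alternate reads.
    print(binom_two_sided_p(30, 10))
    # Hypothetical table: rows = regulatory / other SNPs, columns = in eQTL / not.
    print(chi2_2x2(16, 37, 100, 900))
```

A real analysis would also need to handle reference-mapping bias, overdispersion, and multiple testing, none of which this sketch addresses.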
format Online
Article
Text
id pubmed-7405637
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7405637 2020-08-25 regSNPs-ASB: A Computational Framework for Identifying Allele-Specific Transcription Factor Binding From ATAC-seq Data Xu, Siwen; Feng, Weixing; Lu, Zixiao; Yu, Christina Y.; Shao, Wei; Nakshatri, Harikrishna; Reiter, Jill L.; Gao, Hongyu; Chu, Xiaona; Wang, Yue; Liu, Yunlong Front Bioeng Biotechnol Bioengineering and Biotechnology Frontiers Media S.A. 2020-07-29 /pmc/articles/PMC7405637/ /pubmed/32850739 http://dx.doi.org/10.3389/fbioe.2020.00886 Text en Copyright © 2020 Xu, Feng, Lu, Yu, Shao, Nakshatri, Reiter, Gao, Chu, Wang and Liu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title regSNPs-ASB: A Computational Framework for Identifying Allele-Specific Transcription Factor Binding From ATAC-seq Data
topic Bioengineering and Biotechnology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7405637/
https://www.ncbi.nlm.nih.gov/pubmed/32850739
http://dx.doi.org/10.3389/fbioe.2020.00886