Annotating regulatory elements by heterogeneous network embedding
MOTIVATION: Regulatory elements (REs), such as enhancers and promoters, are known as regulatory sequences functional in a heterogeneous regulatory network to control gene expression by recruiting transcription regulators and carrying genetic variants in a context specific way. Annotating those REs r...
Main Authors: | Lu, Yurun; Feng, Zhanying; Zhang, Songmao; Wang, Yong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Original Papers |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9326849/ https://www.ncbi.nlm.nih.gov/pubmed/35561169 http://dx.doi.org/10.1093/bioinformatics/btac185 |
_version_ | 1784757385491906560 |
---|---|
author | Lu, Yurun Feng, Zhanying Zhang, Songmao Wang, Yong |
author_facet | Lu, Yurun Feng, Zhanying Zhang, Songmao Wang, Yong |
author_sort | Lu, Yurun |
collection | PubMed |
description | MOTIVATION: Regulatory elements (REs), such as enhancers and promoters, are known as regulatory sequences functional in a heterogeneous regulatory network to control gene expression by recruiting transcription regulators and carrying genetic variants in a context specific way. Annotating those REs relies on costly and labor-intensive next-generation sequencing and RNA-guided editing technologies in many cellular contexts. RESULTS: We propose a systematic Gene Ontology Annotation method for Regulatory Elements (RE-GOA) by leveraging the powerful word embedding in natural language processing. We first assemble a heterogeneous network by integrating context specific regulations, protein–protein interactions and gene ontology (GO) terms. Then we perform network embedding and associate regulatory elements with GO terms by assessing their similarity in a low dimensional vector space. With three applications, we show that RE-GOA outperforms existing methods in annotating TFs’ binding sites from ChIP-seq data, in functional enrichment analysis of differentially accessible peaks from ATAC-seq data, and in revealing genetic correlation among phenotypes from their GWAS summary statistics data. AVAILABILITY AND IMPLEMENTATION: The source code and the systematic RE annotation for human and mouse are available at https://github.com/AMSSwanglab/RE-GOA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9326849 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-93268492022-07-28 Annotating regulatory elements by heterogeneous network embedding Lu, Yurun Feng, Zhanying Zhang, Songmao Wang, Yong Bioinformatics Original Papers MOTIVATION: Regulatory elements (REs), such as enhancers and promoters, are known as regulatory sequences functional in a heterogeneous regulatory network to control gene expression by recruiting transcription regulators and carrying genetic variants in a context specific way. Annotating those REs relies on costly and labor-intensive next-generation sequencing and RNA-guided editing technologies in many cellular contexts. RESULTS: We propose a systematic Gene Ontology Annotation method for Regulatory Elements (RE-GOA) by leveraging the powerful word embedding in natural language processing. We first assemble a heterogeneous network by integrating context specific regulations, protein–protein interactions and gene ontology (GO) terms. Then we perform network embedding and associate regulatory elements with GO terms by assessing their similarity in a low dimensional vector space. With three applications, we show that RE-GOA outperforms existing methods in annotating TFs’ binding sites from ChIP-seq data, in functional enrichment analysis of differentially accessible peaks from ATAC-seq data, and in revealing genetic correlation among phenotypes from their GWAS summary statistics data. AVAILABILITY AND IMPLEMENTATION: The source code and the systematic RE annotation for human and mouse are available at https://github.com/AMSSwanglab/RE-GOA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-03-24 /pmc/articles/PMC9326849/ /pubmed/35561169 http://dx.doi.org/10.1093/bioinformatics/btac185 Text en © The Author(s) 2022. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Lu, Yurun Feng, Zhanying Zhang, Songmao Wang, Yong Annotating regulatory elements by heterogeneous network embedding |
title | Annotating regulatory elements by heterogeneous network embedding |
title_full | Annotating regulatory elements by heterogeneous network embedding |
title_fullStr | Annotating regulatory elements by heterogeneous network embedding |
title_full_unstemmed | Annotating regulatory elements by heterogeneous network embedding |
title_short | Annotating regulatory elements by heterogeneous network embedding |
title_sort | annotating regulatory elements by heterogeneous network embedding |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9326849/ https://www.ncbi.nlm.nih.gov/pubmed/35561169 http://dx.doi.org/10.1093/bioinformatics/btac185 |
work_keys_str_mv | AT luyurun annotatingregulatoryelementsbyheterogeneousnetworkembedding AT fengzhanying annotatingregulatoryelementsbyheterogeneousnetworkembedding AT zhangsongmao annotatingregulatoryelementsbyheterogeneousnetworkembedding AT wangyong annotatingregulatoryelementsbyheterogeneousnetworkembedding |
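
The description field above says that regulatory elements are associated with GO terms by assessing their similarity in a low-dimensional vector space. The record does not state which similarity measure RE-GOA uses, so the following is only a minimal sketch that assumes cosine similarity. The embedding vectors, the `RE_1` name, and the `GO:A`/`GO:B`/`GO:C` identifiers are made-up placeholders, and `cosine` and `rank_go_terms` are hypothetical helper names, not part of the RE-GOA code base.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_go_terms(re_vec, go_vecs, top_k=2):
    """Return the top_k (term, score) pairs most similar to one RE embedding."""
    scored = [(term, cosine(re_vec, vec)) for term, vec in go_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional embeddings; real embeddings would come from the
# network-embedding step and are NOT reproduced here (assumption).
re_embeddings = {"RE_1": [0.9, 0.1, 0.0]}
go_embeddings = {
    "GO:A": [1.0, 0.0, 0.0],  # placeholder identifiers, not real GO terms
    "GO:B": [0.0, 1.0, 0.0],
    "GO:C": [0.0, 0.0, 1.0],
}

top = rank_go_terms(re_embeddings["RE_1"], go_embeddings)
for term, score in top:
    print(f"{term}\t{score:.3f}")
```

In this toy setting, the GO term whose vector points in nearly the same direction as the RE vector ranks first; any other similarity measure could be swapped in by replacing `cosine`.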