
Annotating regulatory elements by heterogeneous network embedding

MOTIVATION: Regulatory elements (REs), such as enhancers and promoters, are regulatory sequences that function within a heterogeneous regulatory network to control gene expression by recruiting transcription regulators and carrying genetic variants in a context-specific way. Annotating those REs r...


Bibliographic Details
Main Authors: Lu, Yurun, Feng, Zhanying, Zhang, Songmao, Wang, Yong
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9326849/
https://www.ncbi.nlm.nih.gov/pubmed/35561169
http://dx.doi.org/10.1093/bioinformatics/btac185
author Lu, Yurun
Feng, Zhanying
Zhang, Songmao
Wang, Yong
collection PubMed
description MOTIVATION: Regulatory elements (REs), such as enhancers and promoters, are regulatory sequences that function within a heterogeneous regulatory network to control gene expression by recruiting transcription regulators and carrying genetic variants in a context-specific way. Annotating those REs relies on costly and labor-intensive next-generation sequencing and RNA-guided editing technologies in many cellular contexts. RESULTS: We propose a systematic Gene Ontology Annotation method for Regulatory Elements (RE-GOA) that leverages word embedding techniques from natural language processing. We first assemble a heterogeneous network by integrating context-specific regulations, protein–protein interactions and Gene Ontology (GO) terms. We then perform network embedding and associate regulatory elements with GO terms by assessing their similarity in a low-dimensional vector space. With three applications, we show that RE-GOA outperforms existing methods in annotating transcription factor (TF) binding sites from ChIP-seq data, in functional enrichment analysis of differentially accessible peaks from ATAC-seq data, and in revealing genetic correlation among phenotypes from their GWAS summary statistics data. AVAILABILITY AND IMPLEMENTATION: The source code and the systematic RE annotation for human and mouse are available at https://github.com/AMSSwanglab/RE-GOA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
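The general idea in the description (embed a heterogeneous network, then score RE-to-GO-term similarity in the vector space) can be illustrated with a minimal sketch. This is not the RE-GOA implementation: the node names, edges and function names below are hypothetical, and a truncated SVD of the adjacency matrix stands in for the word-embedding step that the abstract mentions but does not specify. Similarity is measured here with cosine similarity, which is also an assumption.

```python
import numpy as np

# Hypothetical toy network; all node names and edges are illustrative only.
nodes = ["RE1", "RE2", "TF_A", "GENE_B", "GO:0001", "GO:0002"]
edges = [
    ("RE1", "TF_A"),        # regulation: TF_A binds RE1
    ("RE1", "GENE_B"),      # regulation: RE1 targets GENE_B
    ("RE2", "GENE_B"),      # regulation: RE2 targets GENE_B
    ("TF_A", "GENE_B"),     # protein-protein interaction
    ("TF_A", "GO:0001"),    # GO annotation of TF_A
    ("GENE_B", "GO:0002"),  # GO annotation of GENE_B
]


def embed(nodes, edges, dim=3):
    """Embed nodes with a truncated SVD of the adjacency matrix.

    Stand-in for a learned network embedding (assumption, not the
    method used by the paper).
    """
    index = {name: i for i, name in enumerate(nodes)}
    adj = np.zeros((len(nodes), len(nodes)))
    for u, v in edges:
        adj[index[u], index[v]] = 1.0
        adj[index[v], index[u]] = 1.0
    adj += np.eye(len(nodes))  # self-loops keep every row non-zero
    u_mat, sing, _ = np.linalg.svd(adj)
    return index, u_mat[:, :dim] * np.sqrt(sing[:dim])


def rank_go_terms(re_name, nodes, index, vectors):
    """Rank GO-term nodes by cosine similarity to one regulatory element."""
    query = vectors[index[re_name]]
    scores = []
    for name in nodes:
        if not name.startswith("GO:"):
            continue
        vec = vectors[index[name]]
        denom = np.linalg.norm(query) * np.linalg.norm(vec) + 1e-12
        scores.append((name, float(query @ vec / denom)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


index, vectors = embed(nodes, edges, dim=3)
for term, score in rank_go_terms("RE1", nodes, index, vectors):
    print(term, round(score, 3))
```

In this sketch the GO terms that rank highest for an RE are those whose vectors point in the most similar direction; a real pipeline would replace the SVD step with the embedding method and the networks provided in the authors' repository.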
format Online
Article
Text
id pubmed-9326849
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9326849 2022-07-28 Annotating regulatory elements by heterogeneous network embedding Lu, Yurun Feng, Zhanying Zhang, Songmao Wang, Yong Bioinformatics Original Papers Oxford University Press 2022-03-24 /pmc/articles/PMC9326849/ /pubmed/35561169 http://dx.doi.org/10.1093/bioinformatics/btac185 Text en © The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Annotating regulatory elements by heterogeneous network embedding
topic Original Papers