
A Network-Based Approach for Improving Annotation of Transcription Factor Functions and Binding Sites in Arabidopsis thaliana

Transcription factors are an integral component of the cellular machinery responsible for regulating many biological processes, and they recognize distinct DNA sequence patterns as well as internal/external signals to mediate target gene expression. The functional roles of an individual transcriptio...


Bibliographic Details
Main Authors: Najnin, Tanzira; Saimon, Sakhawat Hossain; Sunter, Garry; Ruan, Jianhua
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9957447/
https://www.ncbi.nlm.nih.gov/pubmed/36833209
http://dx.doi.org/10.3390/genes14020282
_version_ 1784894828562087936
author Najnin, Tanzira
Saimon, Sakhawat Hossain
Sunter, Garry
Ruan, Jianhua
collection PubMed
description Transcription factors are an integral component of the cellular machinery responsible for regulating many biological processes, and they recognize distinct DNA sequence patterns as well as internal/external signals to mediate target gene expression. The functional roles of an individual transcription factor can be traced back to the functions of its target genes. While such functional associations can be inferred through the use of binding evidence from high-throughput sequencing technologies available today, including chromatin immunoprecipitation sequencing, such experiments can be resource-consuming. On the other hand, exploratory analysis driven by computational techniques can alleviate this burden by narrowing the search scope, but the results are often deemed low-quality or non-specific by biologists. In this paper, we introduce a data-driven, statistics-based strategy to predict novel functional associations for transcription factors in the model plant Arabidopsis thaliana. To achieve this, we leverage one of the largest available gene expression compendia to build a genome-wide transcriptional regulatory network and infer regulatory relationships among transcription factors and their targets. We then use this network to build a pool of likely downstream targets for each transcription factor and query each target pool for functionally enriched gene ontology terms. The results exhibited sufficient statistical significance to annotate most of the transcription factors in Arabidopsis with highly specific biological processes. We also perform DNA binding motif discovery for transcription factors based on their target pool. We show that the predicted functions and motifs strongly agree with curated databases constructed from experimental evidence. In addition, statistical analysis of the network revealed interesting patterns and connections between network topology and system-level transcriptional regulation properties. 
We believe that the methods demonstrated in this work can be extended to other species to improve the annotation of transcription factors and understand transcriptional regulation on a system level.
format Online
Article
Text
id pubmed-9957447
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9957447 2023-02-25 A Network-Based Approach for Improving Annotation of Transcription Factor Functions and Binding Sites in Arabidopsis thaliana Najnin, Tanzira Saimon, Sakhawat Hossain Sunter, Garry Ruan, Jianhua Genes (Basel) Article MDPI 2023-01-21 /pmc/articles/PMC9957447/ /pubmed/36833209 http://dx.doi.org/10.3390/genes14020282 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Network-Based Approach for Improving Annotation of Transcription Factor Functions and Binding Sites in Arabidopsis thaliana
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9957447/
https://www.ncbi.nlm.nih.gov/pubmed/36833209
http://dx.doi.org/10.3390/genes14020282