
TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes

BACKGROUND: Transcription factors (TFs) are the upstream regulators that orchestrate gene expression, and are therefore a centrepiece in bioinformatics studies. While a core strategy to understand the biological context of genes and proteins includes annotation enrichment analysis, such as Gene Ontology...


Bibliographic Details
Main authors: Magnusson, Rasmus, Lubovac-Pilav, Zelmina
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8444601/
https://www.ncbi.nlm.nih.gov/pubmed/34530727
http://dx.doi.org/10.1186/s12859-021-04357-4
_version_ 1784568531144146944
author Magnusson, Rasmus
Lubovac-Pilav, Zelmina
author_facet Magnusson, Rasmus
Lubovac-Pilav, Zelmina
author_sort Magnusson, Rasmus
collection PubMed
description BACKGROUND: Transcription factors (TFs) are the upstream regulators that orchestrate gene expression, and are therefore a centrepiece in bioinformatics studies. While a core strategy to understand the biological context of genes and proteins includes annotation enrichment analysis, such as Gene Ontology term enrichment, these methods are not well suited for analysing groups of TFs. This is particularly true since such methods do not aim to include downstream processes, and given a set of TFs, the expected top ontologies would revolve around transcription processes. RESULTS: We present the TFTenricher, a Python toolbox that focuses specifically on identifying gene ontology terms, cellular pathways, and diseases that are over-represented among genes downstream of user-defined sets of human TFs. We evaluated the inference of downstream gene targets with respect to false positive annotations, and found that an inference based on co-expression best predicted downstream processes. Based on these downstream genes, the TFTenricher uses some of the most common databases for gene functionalities, including GO, KEGG and Reactome, to calculate functional enrichments. By applying the TFTenricher to differential expression of TFs in 21 diseases, we found significant terms associated with disease mechanism, whereas gene set enrichment analysis on the same dataset predominantly identified processes related to transcription. CONCLUSIONS AND AVAILABILITY: The TFTenricher package enables users to search for biological context in any set of TFs and their downstream genes. The TFTenricher is available as a Python 3 toolbox at https://github.com/rasma774/Tftenricher, under a GNU GPL license and with minimal dependencies. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04357-4.
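The idea of terms being "over-represented among genes downstream" of a TF set can be illustrated with a generic one-sided hypergeometric test. The sketch below is not the TFTenricher API and is not claimed to be the statistic the package uses; the function names, gene labels, and toy gene sets are all hypothetical and chosen only to show the arithmetic.

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric(population, annotated, drawn)."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def enrichment_pvalue(target_genes, annotated_genes, background_genes):
    """Probability of seeing at least the observed overlap by chance."""
    background = set(background_genes)
    targets = set(target_genes) & background
    annotated = set(annotated_genes) & background
    overlap = len(targets & annotated)
    return hypergeom_sf(overlap, len(background), len(annotated), len(targets))


# Hypothetical toy data: 20 background genes, 5 carry the annotation,
# and 4 inferred downstream targets, 3 of which are annotated.
background = [f"G{i}" for i in range(20)]
annotated = ["G0", "G1", "G2", "G3", "G4"]
targets = ["G0", "G1", "G2", "G10"]

p_value = enrichment_pvalue(targets, annotated, background)
print(f"p = {p_value:.4f}")
```

In practice such a test would be repeated for every term in a database such as GO, KEGG or Reactome, which calls for a multiple-testing correction; that step is left out of this sketch.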
format Online
Article
Text
id pubmed-8444601
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8444601 2021-09-17 TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes Magnusson, Rasmus Lubovac-Pilav, Zelmina BMC Bioinformatics Software BACKGROUND: Transcription factors (TFs) are the upstream regulators that orchestrate gene expression, and are therefore a centrepiece in bioinformatics studies. While a core strategy to understand the biological context of genes and proteins includes annotation enrichment analysis, such as Gene Ontology term enrichment, these methods are not well suited for analysing groups of TFs. This is particularly true since such methods do not aim to include downstream processes, and given a set of TFs, the expected top ontologies would revolve around transcription processes. RESULTS: We present the TFTenricher, a Python toolbox that focuses specifically on identifying gene ontology terms, cellular pathways, and diseases that are over-represented among genes downstream of user-defined sets of human TFs. We evaluated the inference of downstream gene targets with respect to false positive annotations, and found that an inference based on co-expression best predicted downstream processes. Based on these downstream genes, the TFTenricher uses some of the most common databases for gene functionalities, including GO, KEGG and Reactome, to calculate functional enrichments. By applying the TFTenricher to differential expression of TFs in 21 diseases, we found significant terms associated with disease mechanism, whereas gene set enrichment analysis on the same dataset predominantly identified processes related to transcription. CONCLUSIONS AND AVAILABILITY: The TFTenricher package enables users to search for biological context in any set of TFs and their downstream genes. The TFTenricher is available as a Python 3 toolbox at https://github.com/rasma774/Tftenricher, under a GNU GPL license and with minimal dependencies.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04357-4. BioMed Central 2021-09-16 /pmc/articles/PMC8444601/ /pubmed/34530727 http://dx.doi.org/10.1186/s12859-021-04357-4 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Magnusson, Rasmus
Lubovac-Pilav, Zelmina
TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes
title TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes
title_full TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes
title_fullStr TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes
title_full_unstemmed TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes
title_short TFTenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes
title_sort tftenricher: a python toolbox for annotation enrichment analysis of transcription factor target genes
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8444601/
https://www.ncbi.nlm.nih.gov/pubmed/34530727
http://dx.doi.org/10.1186/s12859-021-04357-4
work_keys_str_mv AT magnussonrasmus tftenricherapythontoolboxforannotationenrichmentanalysisoftranscriptionfactortargetgenes
AT lubovacpilavzelmina tftenricherapythontoolboxforannotationenrichmentanalysisoftranscriptionfactortargetgenes