
TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants

INTRODUCTION: Promoter tiling deletion via genome editing is an emerging approach that is becoming popular in plants. Identifying the precise positions of core motifs within plant gene promoters is in great demand, but these positions remain largely unknown. We previously developed TSPTFBS of 265 Ara...


Bibliographic Details
Main Authors: Cheng, Huiling, Liu, Lifen, Zhou, Yuying, Deng, Kaixuan, Ge, Yuanxin, Hu, Xuehai
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Plant Science
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10203575/
https://www.ncbi.nlm.nih.gov/pubmed/37229121
http://dx.doi.org/10.3389/fpls.2023.1175837
_version_ 1785045665285406720
author Cheng, Huiling
Liu, Lifen
Zhou, Yuying
Deng, Kaixuan
Ge, Yuanxin
Hu, Xuehai
author_facet Cheng, Huiling
Liu, Lifen
Zhou, Yuying
Deng, Kaixuan
Ge, Yuanxin
Hu, Xuehai
author_sort Cheng, Huiling
collection PubMed
description INTRODUCTION: Promoter tiling deletion via genome editing is an emerging approach that is becoming popular in plants. Identifying the precise positions of core motifs within plant gene promoters is in great demand, but these positions remain largely unknown. We previously developed TSPTFBS, a set of 265 Arabidopsis transcription factor binding site (TFBS) prediction models, which cannot meet this demand for identifying core motifs. METHODS: Here, we additionally introduced 104 maize and 20 rice TFBS datasets and used DenseNet for model construction on a large-scale dataset covering a total of 389 plant TFs. More importantly, we combined three biological interpretability methods, namely DeepLIFT, in-silico tiling deletion, and in-silico mutagenesis, to identify the potential core motifs of any given genomic region. RESULTS: DenseNet not only achieved greater predictability than baseline methods such as LS-GKM and MEME for the above 389 TFs from Arabidopsis, maize, and rice, but also performed better on trans-species prediction for a total of 15 TFs from six other plant species. Motif analysis based on TF-MoDISco and global importance analysis (GIA) further provides the biological implication of the core motifs identified by the three interpretability methods. Finally, we developed the TSPTFBS 2.0 pipeline, which integrates the 389 DenseNet-based models of TF binding and the above three interpretability methods. DISCUSSION: TSPTFBS 2.0 was implemented as a user-friendly web server (http://www.hzau-hulab.com/TSPTFBS/) that can provide important references for editing targets in any given plant promoter, and it has great potential to provide reliable editing targets for genetic screen experiments in plants.
format Online
Article
Text
id pubmed-10203575
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10203575 2023-05-24 TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants Cheng, Huiling Liu, Lifen Zhou, Yuying Deng, Kaixuan Ge, Yuanxin Hu, Xuehai Front Plant Sci Plant Science INTRODUCTION: Promoter tiling deletion via genome editing is an emerging approach that is becoming popular in plants. Identifying the precise positions of core motifs within plant gene promoters is in great demand, but these positions remain largely unknown. We previously developed TSPTFBS, a set of 265 Arabidopsis transcription factor binding site (TFBS) prediction models, which cannot meet this demand for identifying core motifs. METHODS: Here, we additionally introduced 104 maize and 20 rice TFBS datasets and used DenseNet for model construction on a large-scale dataset covering a total of 389 plant TFs. More importantly, we combined three biological interpretability methods, namely DeepLIFT, in-silico tiling deletion, and in-silico mutagenesis, to identify the potential core motifs of any given genomic region. RESULTS: DenseNet not only achieved greater predictability than baseline methods such as LS-GKM and MEME for the above 389 TFs from Arabidopsis, maize, and rice, but also performed better on trans-species prediction for a total of 15 TFs from six other plant species. Motif analysis based on TF-MoDISco and global importance analysis (GIA) further provides the biological implication of the core motifs identified by the three interpretability methods. Finally, we developed the TSPTFBS 2.0 pipeline, which integrates the 389 DenseNet-based models of TF binding and the above three interpretability methods.
DISCUSSION: TSPTFBS 2.0 was implemented as a user-friendly web server (http://www.hzau-hulab.com/TSPTFBS/) that can provide important references for editing targets in any given plant promoter, and it has great potential to provide reliable editing targets for genetic screen experiments in plants. Frontiers Media S.A. 2023-05-09 /pmc/articles/PMC10203575/ /pubmed/37229121 http://dx.doi.org/10.3389/fpls.2023.1175837 Text en Copyright © 2023 Cheng, Liu, Zhou, Deng, Ge and Hu https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Plant Science
Cheng, Huiling
Liu, Lifen
Zhou, Yuying
Deng, Kaixuan
Ge, Yuanxin
Hu, Xuehai
TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants
title TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants
title_full TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants
title_fullStr TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants
title_full_unstemmed TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants
title_short TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants
title_sort tsptfbs 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10203575/
https://www.ncbi.nlm.nih.gov/pubmed/37229121
http://dx.doi.org/10.3389/fpls.2023.1175837
work_keys_str_mv AT chenghuiling tsptfbs20transspeciespredictionoftranscriptionfactorbindingsitesandidentificationoftheircoremotifsinplants
AT liulifen tsptfbs20transspeciespredictionoftranscriptionfactorbindingsitesandidentificationoftheircoremotifsinplants
AT zhouyuying tsptfbs20transspeciespredictionoftranscriptionfactorbindingsitesandidentificationoftheircoremotifsinplants
AT dengkaixuan tsptfbs20transspeciespredictionoftranscriptionfactorbindingsitesandidentificationoftheircoremotifsinplants
AT geyuanxin tsptfbs20transspeciespredictionoftranscriptionfactorbindingsitesandidentificationoftheircoremotifsinplants
AT huxuehai tsptfbs20transspeciespredictionoftranscriptionfactorbindingsitesandidentificationoftheircoremotifsinplants
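
Illustrative note: the abstract in this record names in-silico tiling deletion and in-silico mutagenesis as two of the interpretability methods used to locate core motifs. The sketch below shows the general idea of both on a toy sequence. It is not the authors' implementation. The scoring function `toy_score` is a hypothetical stand-in for a trained model (it only counts exact G-box `CACGTG` matches), and masking deleted windows with `N` so that the sequence length is preserved is an assumption made for simplicity.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str], float]


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a trained TF-binding model.

    Counts exact occurrences of the G-box hexamer CACGTG.
    """
    return float(seq.count("CACGTG"))


def tiling_deletion(seq: str, score: Scorer, window: int = 3,
                    step: int = 1, fill: str = "N") -> List[Tuple[int, float]]:
    """Mask each window in turn and record the drop in score.

    Assumption: a deleted window is replaced by `fill` characters so the
    sequence length (and downstream coordinates) stay unchanged.
    """
    base = score(seq)
    effects = []
    for start in range(0, len(seq) - window + 1, step):
        masked = seq[:start] + fill * window + seq[start + window:]
        effects.append((start, base - score(masked)))
    return effects


def in_silico_mutagenesis(seq: str, score: Scorer,
                          alphabet: str = "ACGT") -> List[Tuple[int, str, float]]:
    """Substitute every base with each alternative and record the score change."""
    base = score(seq)
    effects = []
    for pos, ref in enumerate(seq):
        for alt in alphabet:
            if alt == ref:
                continue
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects.append((pos, alt, score(mutant) - base))
    return effects


# Toy promoter fragment: one G-box at positions 6..11, flanked by T runs.
seq = "TTTTTT" + "CACGTG" + "TTTTTT"

# Window starts whose masking lowers the score (windows overlapping the motif).
sensitive_windows = [s for s, drop in tiling_deletion(seq, toy_score) if drop > 0]

# Positions where at least one substitution lowers the score.
sensitive_positions = sorted(
    {pos for pos, _, delta in in_silico_mutagenesis(seq, toy_score) if delta < 0}
)

print(sensitive_windows)
print(sensitive_positions)
```

With a real model, `toy_score` would be replaced by the model's predicted binding probability for the sequence. Positions flagged by both scans are natural candidates for a core motif.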