
PredicTF: prediction of bacterial transcription factors in complex microbial communities using deep learning

BACKGROUND: Transcription factors (TFs) are proteins controlling the flow of genetic information by regulating cellular gene expression. A better understanding of TFs in a bacterial community context may open novel avenues for exploring gene regulation in ecosystems where bacteria play a key role....


Bibliographic Details
Main authors: Oliveira Monteiro, Lummy Maria, Saraiva, João Pedro, Brizola Toscan, Rodolfo, Stadler, Peter F., Silva-Rocha, Rafael, Nunes da Rocha, Ulisses
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8822659/
https://www.ncbi.nlm.nih.gov/pubmed/35135629
http://dx.doi.org/10.1186/s40793-021-00394-x
author Oliveira Monteiro, Lummy Maria
Saraiva, João Pedro
Brizola Toscan, Rodolfo
Stadler, Peter F.
Silva-Rocha, Rafael
Nunes da Rocha, Ulisses
collection PubMed
description BACKGROUND: Transcription factors (TFs) are proteins controlling the flow of genetic information by regulating cellular gene expression. A better understanding of TFs in a bacterial community context may open novel avenues for exploring gene regulation in ecosystems where bacteria play a key role. Here we describe PredicTF, a platform supporting the prediction and classification of novel bacterial TFs in single species and complex microbial communities. PredicTF is based on a deep learning algorithm. RESULTS: To train PredicTF, we created a TF database (BacTFDB) by manually curating a total of 11,961 TFs distributed across 99 TF families. Five model organisms were used to test the performance and accuracy of PredicTF. PredicTF identified 24–62% of the known TFs, with an average precision of 88%, in our five model organisms. We demonstrated PredicTF using pure cultures and a complex microbial community. In these demonstrations, we used (meta)genomes for TF prediction and (meta)transcriptomes for determining the expression of putative TFs. CONCLUSION: PredicTF demonstrated high accuracy in predicting transcription factors in model organisms. We prepared the pipeline to be easily implemented in studies profiling TFs using (meta)genomes and (meta)transcriptomes. PredicTF is open-source software available at https://github.com/mdsufz/PredicTF. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40793-021-00394-x.
format Online
Article
Text
id pubmed-8822659
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8822659 2022-02-08 Environ Microbiome Methodology
BioMed Central 2022-02-08 /pmc/articles/PMC8822659/ /pubmed/35135629 http://dx.doi.org/10.1186/s40793-021-00394-x Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title PredicTF: prediction of bacterial transcription factors in complex microbial communities using deep learning
topic Methodology