PredicTF: prediction of bacterial transcription factors in complex microbial communities using deep learning

Bibliographic Details
Main authors: Oliveira Monteiro, Lummy Maria; Saraiva, João Pedro; Brizola Toscan, Rodolfo; Stadler, Peter F.; Silva-Rocha, Rafael; Nunes da Rocha, Ulisses
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8822659/
https://www.ncbi.nlm.nih.gov/pubmed/35135629
http://dx.doi.org/10.1186/s40793-021-00394-x
Description
Summary: BACKGROUND: Transcription factors (TFs) are proteins that control the flow of genetic information by regulating cellular gene expression. A better understanding of TFs in a bacterial community context may open novel avenues for exploring gene regulation in ecosystems where bacteria play a key role. Here we describe PredicTF, a platform supporting the prediction and classification of novel bacterial TFs in single species and complex microbial communities. PredicTF is based on a deep learning algorithm. RESULTS: To train PredicTF, we created a TF database (BacTFDB) by manually curating a total of 11,961 TFs distributed across 99 TF families. Five model organisms were used to test the performance and accuracy of PredicTF. PredicTF identified 24–62% of the known TFs, with an average precision of 88%, in our five model organisms. We demonstrated PredicTF using pure cultures and a complex microbial community. In these demonstrations, we used (meta)genomes for TF prediction and (meta)transcriptomes for determining the expression of putative TFs. CONCLUSION: PredicTF demonstrated high accuracy in predicting transcription factors in model organisms. We prepared the pipeline to be easily implemented in studies profiling TFs using (meta)genomes and (meta)transcriptomes. PredicTF is open-source software available at https://github.com/mdsufz/PredicTF. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40793-021-00394-x.
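
The summary reports two figures per organism: the share of known TFs that were identified and the precision of the predictions. The record does not say how these were computed, so the sketch below is an illustration only. It assumes the "share of known TFs identified" corresponds to recall, and it compares sets of gene identifiers. The function name and the gene identifiers are hypothetical and are not part of PredicTF.

```python
def precision_recall(predicted, known):
    """Return (precision, recall) for predicted TF identifiers.

    precision: fraction of predicted TFs that are known TFs.
    recall:    fraction of known TFs that were predicted
               (assumed here to match "share of known TFs identified").
    """
    predicted, known = set(predicted), set(known)
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall


# Hypothetical gene identifiers, for illustration only.
known_tfs = {"geneA", "geneB", "geneC", "geneD", "geneE"}
predicted_tfs = {"geneA", "geneB", "geneX"}

precision, recall = precision_recall(predicted_tfs, known_tfs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A tool can show high precision with modest recall, as in the figures above: most of what it reports is correct, while many known TFs remain undetected.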