
G4PromFinder: an algorithm for predicting transcription promoters in GC-rich bacterial genomes based on AT-rich elements and G-quadruplex motifs

BACKGROUND: Over the last few decades, computational genomics has contributed tremendously to deciphering biology from genome sequences and related data. Considerable effort has been devoted to the prediction of transcription promoter and terminator sites that represent the essential “punctuation marks...


Bibliographic Details
Main Authors: Di Salvo, Marco; Pinatel, Eva; Talà, Adelfia; Fondi, Marco; Peano, Clelia; Alifano, Pietro
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5801747/
https://www.ncbi.nlm.nih.gov/pubmed/29409441
http://dx.doi.org/10.1186/s12859-018-2049-x
collection PubMed
description BACKGROUND: Over the last few decades, computational genomics has contributed tremendously to deciphering biology from genome sequences and related data. Considerable effort has been devoted to predicting transcription promoter and terminator sites, which represent the essential “punctuation marks” for DNA transcription. Computational prediction of promoters in prokaryotes remains a largely unsolved problem in computational genomics. Most published bacterial promoter prediction tools rely on consensus-sequence searches and were designed specifically for vegetative σ⁷⁰ promoters; they are therefore not suitable for promoter prediction in bacteria that encode many σ factors, such as actinomycetes. RESULTS: In this study we investigated the possibility of identifying putative promoters in prokaryotes based on evolutionarily conserved motifs, focusing on GC-rich bacteria, in which promoter prediction with conventional, consensus-based algorithms is often not exhaustive. Here, we introduce G4PromFinder, a novel algorithm that predicts putative promoters based on AT-rich elements and G-quadruplex DNA motifs. We tested its performance using available genomic and transcriptomic data from the model microorganisms Streptomyces coelicolor A3(2) and Pseudomonas aeruginosa PA14. We compared our results with those obtained by three currently available promoter prediction algorithms: the σ⁷⁰ consensus-based PePPER, the σ-factor consensus-based bTSSfinder, and PromPredict, which is based on double-helix DNA stability. Our results demonstrated that G4PromFinder is more suitable than the three reference tools for both genomes. Our algorithm achieved the highest accuracy (F₁-scores of 0.61 and 0.53 in the two genomes), compared with the next best tool, PromPredict (F₁-scores of 0.46 and 0.48). Consensus-based algorithms performed worse on the analyzed GC-rich genomes.
CONCLUSIONS: Our analysis shows that G4PromFinder is a powerful tool for promoter search in GC-rich bacteria, especially bacteria that encode many σ factors, such as the model microorganism S. coelicolor A3(2). Moreover, consensus-based tools and, more generally, tools based on specific features of bacterial σ factors seem to perform less well for promoter prediction in these types of bacterial genomes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2049-x) contains supplementary material, which is available to authorized users.
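The abstract describes the approach only at a high level: a putative promoter is flagged where an AT-rich element occurs together with a G-quadruplex motif, and tools are compared by F₁-score. The following Python sketch illustrates that general idea under stated assumptions. The window length, AT threshold, search span, and G-quadruplex regular expression are hypothetical values chosen for illustration and are not taken from the paper, and the function names are invented here; this is not the authors' implementation.

```python
import re

# Hypothetical parameters, chosen only for illustration (not from the paper).
AT_WINDOW = 25          # length of the window tested for AT-richness
AT_MIN_FRACTION = 0.40  # minimum A+T fraction for a window to count as AT-rich
G4_SEARCH_SPAN = 75     # how far upstream of the window to look for a G4 motif

# A generic G-quadruplex-like pattern: four G-runs separated by short loops.
# Only the given strand is scanned; a fuller scan would also check C-runs.
G4_PATTERN = re.compile(
    r"G{2,4}[ACGT]{1,10}G{2,4}[ACGT]{1,10}G{2,4}[ACGT]{1,10}G{2,4}"
)


def at_fraction(seq):
    """Fraction of A and T bases in seq (0.0 for an empty string)."""
    return sum(base in "AT" for base in seq) / len(seq) if seq else 0.0


def candidate_promoters(seq):
    """Return (at_window_start, g4_start) pairs for every AT-rich window
    that has a G4-like motif within G4_SEARCH_SPAN bases upstream."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - AT_WINDOW + 1):
        if at_fraction(seq[start:start + AT_WINDOW]) < AT_MIN_FRACTION:
            continue
        upstream_start = max(0, start - G4_SEARCH_SPAN)
        match = G4_PATTERN.search(seq[upstream_start:start])
        if match:
            hits.append((start, upstream_start + match.start()))
    return hits


def f1_score(tp, fp, fn):
    """F1 = 2PR / (P + R), with P = tp/(tp+fp) and R = tp/(tp+fn)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy sequence: a G4-like motif, a short spacer, then an AT-rich stretch.
    toy = "GGGCAGGGCTGGGCAGGG" + "CCGCC" + "TATAATATATTTAAATATTATAAAT" + "GCGCGCGCGC"
    print(candidate_promoters(toy))
    print(f1_score(tp=6, fp=2, fn=2))
```

Because the windows overlap, one promoter-like region produces several adjacent hits; a practical tool would presumably merge or rank them. The counts passed to `f1_score` are made-up numbers, not results from the study.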
format Online
Article
Text
id pubmed-5801747
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5801747 2018-02-14
G4PromFinder: an algorithm for predicting transcription promoters in GC-rich bacterial genomes based on AT-rich elements and G-quadruplex motifs
Di Salvo, Marco; Pinatel, Eva; Talà, Adelfia; Fondi, Marco; Peano, Clelia; Alifano, Pietro
BMC Bioinformatics
Software
BioMed Central 2018-02-06
/pmc/articles/PMC5801747/ /pubmed/29409441 http://dx.doi.org/10.1186/s12859-018-2049-x
Text en
© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title G4PromFinder: an algorithm for predicting transcription promoters in GC-rich bacterial genomes based on AT-rich elements and G-quadruplex motifs
topic Software