G4PromFinder: an algorithm for predicting transcription promoters in GC-rich bacterial genomes based on AT-rich elements and G-quadruplex motifs
BACKGROUND: Over the last few decades, computational genomics has contributed tremendously to deciphering biology from genome sequences and related data. Considerable effort has been devoted to predicting transcription promoter and terminator sites, which represent the essential “punctuation marks...
Main authors: | Di Salvo, Marco; Pinatel, Eva; Talà, Adelfia; Fondi, Marco; Peano, Clelia; Alifano, Pietro |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5801747/ https://www.ncbi.nlm.nih.gov/pubmed/29409441 http://dx.doi.org/10.1186/s12859-018-2049-x |
_version_ | 1783298401926905856 |
---|---|
author | Di Salvo, Marco; Pinatel, Eva; Talà, Adelfia; Fondi, Marco; Peano, Clelia; Alifano, Pietro |
author_sort | Di Salvo, Marco |
collection | PubMed |
description | BACKGROUND: Over the last few decades, computational genomics has contributed tremendously to deciphering biology from genome sequences and related data. Considerable effort has been devoted to predicting transcription promoter and terminator sites, which represent the essential “punctuation marks” for DNA transcription. Computational prediction of promoters in prokaryotes remains a largely unsolved problem in computational genomics. Most published bacterial promoter prediction tools rely on a consensus-sequence search and were designed specifically for vegetative σ70 promoters; they are therefore not suitable for promoter prediction in bacteria that encode many σ factors, such as actinomycetes. RESULTS: In this study we investigated the possibility of identifying putative promoters in prokaryotes based on evolutionarily conserved motifs, focusing on GC-rich bacteria, in which promoter prediction with conventional, consensus-based algorithms is often not exhaustive. Here, we introduce G4PromFinder, a novel algorithm that predicts putative promoters based on AT-rich elements and G-quadruplex DNA motifs. We tested its performance using available genomic and transcriptomic data for the model microorganisms Streptomyces coelicolor A3(2) and Pseudomonas aeruginosa PA14. We compared our results with those obtained by three currently available promoter-prediction algorithms: the σ70 consensus-based PePPER, the σ-factor consensus-based bTSSfinder, and PromPredict, which is based on double-helix DNA stability. Our results demonstrated that G4PromFinder is more suitable than the three reference tools for both genomes. Our algorithm achieved the highest accuracy (F1-scores of 0.61 and 0.53 in the two genomes), compared with the next-best tool, PromPredict (F1-scores of 0.46 and 0.48). Consensus-based algorithms performed worse on the analyzed GC-rich genomes. CONCLUSIONS: Our analysis shows that G4PromFinder is a powerful tool for promoter search in GC-rich bacteria, especially bacteria that encode many σ factors, such as the model microorganism S. coelicolor A3(2). Moreover, consensus-based tools and, more generally, tools based on specific features of bacterial σ factors appear to perform less well for promoter prediction in these types of bacterial genomes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2049-x) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5801747 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5801747 2018-02-14 BMC Bioinformatics Software BioMed Central 2018-02-06 /pmc/articles/PMC5801747/ /pubmed/29409441 http://dx.doi.org/10.1186/s12859-018-2049-x Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | G4PromFinder: an algorithm for predicting transcription promoters in GC-rich bacterial genomes based on AT-rich elements and G-quadruplex motifs |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5801747/ https://www.ncbi.nlm.nih.gov/pubmed/29409441 http://dx.doi.org/10.1186/s12859-018-2049-x |
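
The description field compares the tools by F1-score, the harmonic mean of precision and recall. As a reading aid, here is a minimal Python sketch of how such a score is computed from prediction counts. The function name `f1_score` and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from the article, which reports only the final scores.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall).

    Returns 0.0 when precision or recall is undefined or both are zero.
    """
    predicted = true_positives + false_positives  # everything the tool called a promoter
    actual = true_positives + false_negatives     # every promoter in the reference set
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not data from the article).
example = f1_score(true_positives=60, false_positives=40, false_negatives=40)
print(round(example, 2))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a tool scores well only when precision and recall are both reasonably high.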